]> Pileus Git - ~andy/linux/log
~andy/linux
12 years agoperf tools: Fix ABI compatibility bug in print_event_desc()
Stephane Eranian [Thu, 9 Feb 2012 22:21:07 +0000 (23:21 +0100)]
perf tools: Fix ABI compatibility bug in print_event_desc()

This patches cleans up local variable types for msz and ret.
They need to be size_t and ssize_t respectively.

It also fixes a bug whereby perf would not read attr struct
with a different size than what it knows about.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-18-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoperf tools: Enable reading of perf.data files from different ABI rev
Stephane Eranian [Thu, 9 Feb 2012 22:21:06 +0000 (23:21 +0100)]
perf tools: Enable reading of perf.data files from different ABI rev

This patch allows perf to process perf.data files generated
using an ABI that has a different perf_event_attr struct size,
i.e., a different ABI version.

The perf_event_attr can be extended, yet perf needs to cope with
older perf.data files. Similarly, perf must be able to cope with
a perf.data file which is using a newer version of the ABI than
what it knows about.

This patch adds read_attr(), a routine that reads a
perf_event_attr struct from a file incrementally based on its
advertised size. If the on-file struct is smaller than what perf
knows, then the extra fields are zeroed. If the on-file struct
is bigger, then perf only uses what it knows about, the rest is
skipped.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-17-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoperf: Add ABI reference sizes
Stephane Eranian [Thu, 9 Feb 2012 22:21:05 +0000 (23:21 +0100)]
perf: Add ABI reference sizes

This patch adds reference sizes for revision 1
and 2 of the perf_event ABI, i.e., the size of
the perf_event_attr struct.

With Rev1: config2 was added = +8 bytes
With Rev2: branch_sample_type was added = +8 bytes

Adds the definition for Rev1, Rev2.

This is useful for tools trying to decode the revision
numbers based on the size of the struct.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-16-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoperf report: Add support for taken branch sampling
Roberto Agostino Vitillo [Thu, 9 Feb 2012 22:21:03 +0000 (23:21 +0100)]
perf report: Add support for taken branch sampling

This patch adds support for taken branch sampling, i.e, the
PERF_SAMPLE_BRANCH_STACK feature to perf report. In other
words, to display histograms based on taken branches rather
than executed instructions addresses.

The new option is called -b and it takes no argument. To
generate meaningful output, the perf.data must have been
obtained using perf record -b xxx ... where xxx is a branch
filter option.

The output shows symbols, modules, sorted by 'who branches
where' the most often. The percentages reported in the first
column refer to the total number of branches captured and
not the usual number of samples.

Here is a quick example.
Here branchy is simple test program which looks as follows:

void f2(void)
{}
void f3(void)
{}
void f1(unsigned long n)
{
  if (n & 1UL)
    f2();
  else
    f3();
}
int main(void)
{
  unsigned long i;

  for (i=0; i < N; i++)
   f1(i);
  return 0;
}

Here is the output captured on Nehalem, if we are
only interested in user level function calls.

$ perf record -b any_call,u -e cycles:u branchy

$ perf report -b --sort=symbol
    52.34%  [.] main                   [.] f1
    24.04%  [.] f1                     [.] f3
    23.60%  [.] f1                     [.] f2
     0.01%  [k] _IO_new_file_xsputn    [k] _IO_file_overflow
     0.01%  [k] _IO_vfprintf_internal  [k] _IO_new_file_xsputn
     0.01%  [k] _IO_vfprintf_internal  [k] strchrnul
     0.01%  [k] __printf               [k] _IO_vfprintf_internal
     0.01%  [k] main                   [k] __printf

About half (52%) of the call branches captured are from main()
-> f1(). The second half (24%+23%) is split in two equal shares
between f1() -> f2(), f1() ->f3(). The output is as expected
given the code.

It should be noted, that using -b in perf record does not
eliminate information in the perf.data file. Consequently, a
typical profile can also be obtained by perf report by simply
not using its -b option.

It is possible to sort on branch related columns:

   - dso_from, symbol_from
   - dso_to, symbol_to
   - mispredict

Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-14-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoperf record: Add support for sampling taken branch
Roberto Agostino Vitillo [Thu, 9 Feb 2012 22:21:02 +0000 (23:21 +0100)]
perf record: Add support for sampling taken branch

This patch adds a new option to enable taken branch stack
sampling, i.e., leverage the PERF_SAMPLE_BRANCH_STACK feature
of perf_events.

There is a new option to active this mode: -b.
It is possible to pass a set of filters to select the type of
branches to sample.

The following filters are available:

 - any : any type of branches
 - any_call : any function call or system call
 - any_ret : any function return or system call return
 - any_ind : any indirect branch
 - u:  only when the branch target is at the user level
 - k: only when the branch target is in the kernel
 - hv: only when the branch target is in the hypervisor

Filters can be combined by passing a comma separated list
to the option:

$ perf record -b any_call,u -e cycles:u branchy

Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-13-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoperf tools: Add code to support PERF_SAMPLE_BRANCH_STACK
Roberto Agostino Vitillo [Thu, 9 Feb 2012 22:21:01 +0000 (23:21 +0100)]
perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK

This patch adds:

 - ability to parse samples with PERF_SAMPLE_BRANCH_STACK
 - sort on branches (dso_from, symbol_from, dso_to, symbol_to, mispredict)
 - build histograms on branches

Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-12-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoperf: Add callback to flush branch_stack on context switch
Stephane Eranian [Thu, 9 Feb 2012 22:21:00 +0000 (23:21 +0100)]
perf: Add callback to flush branch_stack on context switch

With branch stack sampling, it is possible to filter by priv levels.

In system-wide mode, that means it is possible to capture only user
level branches. The builtin SW LBR filter needs to disassemble code
based on LBR captured addresses. For that, it needs to know the task
the addresses are associated with. Because of context switches, the
content of the branch stack buffer may contain addresses from
different tasks.

We need a callback on context switch to either flush the branch stack
or save it. This patch adds a new callback in struct pmu which is called
during context switches. The callback is called only when necessary.
That is when a system-wide context has, at least, one event which
uses PERF_SAMPLE_BRANCH_STACK. The callback is never called for
per-thread context.

In this version, the Intel x86 code simply flushes (resets) the LBR
on context switches (fills it with zeroes). Those zeroed branches are
then filtered out by the SW filter.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-11-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoperf: Disable PERF_SAMPLE_BRANCH_* when not supported
Stephane Eranian [Thu, 9 Feb 2012 22:20:59 +0000 (23:20 +0100)]
perf: Disable PERF_SAMPLE_BRANCH_* when not supported

PERF_SAMPLE_BRANCH_* is disabled for:

 - SW events (sw counters, tracepoints)
 - HW breakpoints
 - ALL but Intel x86 architecture
 - AMD64 processors

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-10-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoperf/x86: Add LBR software filter support for Intel CPUs
Stephane Eranian [Thu, 9 Feb 2012 22:20:58 +0000 (23:20 +0100)]
perf/x86: Add LBR software filter support for Intel CPUs

This patch adds an internal sofware filter to complement
the (optional) LBR hardware filter.

The software filter is necessary:

 - as a substitute when there is no HW LBR filter (e.g., Atom, Core)
 - to complement HW LBR filter in case of errata (e.g., Nehalem/Westmere)
 - to provide finer grain filtering (e.g., all processors)

Sometimes the LBR HW filter cannot distinguish between two types
of branches. For instance, to capture syscall as CALLS, it is necessary
to enable the LBR_FAR filter which will also capture JMP instructions.
Thus, a second pass is necessary to filter those out, this is what the
SW filter can do.

The SW filter is built on top of the internal x86 disassembler. It
is a best effort filter especially for user level code. It is subject
to the availability of the text page of the program.

The SW filter is enabled on all Intel processors. It is bypassed
when the user is capturing all branches at all priv levels.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-9-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoperf/x86: Implement PERF_SAMPLE_BRANCH for Intel CPUs
Stephane Eranian [Thu, 9 Feb 2012 22:20:57 +0000 (23:20 +0100)]
perf/x86: Implement PERF_SAMPLE_BRANCH for Intel CPUs

This patch implements PERF_SAMPLE_BRANCH support for Intel
x86processors. It connects PERF_SAMPLE_BRANCH to the actual LBR.

The patch adds the hooks in the PMU irq handler to save the LBR
on counter overflow for both regular and PEBS modes.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-8-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoperf/x86: Disable LBR support for older Intel Atom processors
Stephane Eranian [Thu, 9 Feb 2012 22:20:56 +0000 (23:20 +0100)]
perf/x86: Disable LBR support for older Intel Atom processors

The patch adds a restriction for Intel Atom LBR support. Only
steppings 10 (PineView) and more recent are supported. Older models
do not have a functional LBR. Their LBR does not freeze on PMU
interrupt which makes LBR unusable in the context of perf_events.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-7-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoperf/x86: Add Intel LBR mappings for PERF_SAMPLE_BRANCH filters
Stephane Eranian [Thu, 9 Feb 2012 22:20:55 +0000 (23:20 +0100)]
perf/x86: Add Intel LBR mappings for PERF_SAMPLE_BRANCH filters

This patch adds the mappings from the generic PERF_SAMPLE_BRANCH_*
filters to the actual Intel x86LBR filters, whenever they exist.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-6-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoperf/x86: Sync branch stack sampling with precise_sampling
Stephane Eranian [Thu, 9 Feb 2012 22:20:54 +0000 (23:20 +0100)]
perf/x86: Sync branch stack sampling with precise_sampling

If precise sampling is enabled on Intel x86 then perf_event uses PEBS.
To correct for the off-by-one error of PEBS, perf_event uses LBR when
precise_sample > 1.

On Intel x86 PERF_SAMPLE_BRANCH_STACK is implemented using LBR,
therefore both features must be coordinated as they may not
configure LBR the same way.

For PEBS, LBR needs to capture all branches at the priv level of
the associated event.

This patch checks that the branch type and priv level of BRANCH_STACK
is compatible with that of the PEBS LBR requirement, thereby allowing:

   $ perf record -b any,u -e instructions:upp ....

But:

   $ perf record -b any_call,u -e instructions:upp

Is not possible.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-5-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoperf/x86: Add Intel LBR sharing logic
Stephane Eranian [Thu, 9 Feb 2012 22:20:53 +0000 (23:20 +0100)]
perf/x86: Add Intel LBR sharing logic

The Intel LBR on some recent processor is capable
of filtering branches by type. The filter is configurable
via the LBR_SELECT MSR register.

There are limitation on how this register can be used.

On Nehalem/Westmere, the LBR_SELECT is shared by the two HT threads
when HT is on. It is private to each core when HT is off.

On SandyBridge, the LBR_SELECT register is private to each thread
when HT is on. It is private to each core when HT is off.

The kernel must manage the sharing of LBR_SELECT. It allows
multiple users on the same logical CPU to use LBR_SELECT as
long as they program it with the same value. Across sibling
CPUs (HT threads), the same restriction applies on NHM/WSM.

This patch implements this sharing logic by leveraging the
mechanism put in place for managing the offcore_response
shared MSR.

We modify __intel_shared_reg_get_constraints() to cause
x86_get_event_constraint() to be called because LBR may
be associated with events that may be counter constrained.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-4-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoperf/x86: Add Intel LBR MSR definitions
Stephane Eranian [Thu, 9 Feb 2012 22:20:52 +0000 (23:20 +0100)]
perf/x86: Add Intel LBR MSR definitions

This patch adds the LBR definitions for NHM/WSM/SNB and Core.
It also adds the definitions for the architected LBR MSR:
LBR_SELECT, LBRT_TOS.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoperf: Add generic taken branch sampling support
Stephane Eranian [Thu, 9 Feb 2012 22:20:51 +0000 (23:20 +0100)]
perf: Add generic taken branch sampling support

This patch adds the ability to sample taken branches to the
perf_event interface.

The ability to capture taken branches is very useful for all
sorts of analysis. For instance, basic block profiling, call
counts, statistical call graph.

This new capability requires hardware assist and as such may
not be available on all HW platforms. On Intel x86 it is
implemented on top of the Last Branch Record (LBR) facility.

To enable taken branches sampling, the PERF_SAMPLE_BRANCH_STACK
bit must be set in attr->sample_type.

Sampled taken branches may be filtered by type and/or priv
levels.

The patch adds a new field, called branch_sample_type, to the
perf_event_attr structure. It contains a bitmask of filters
to apply to the sampled taken branches.

Filters may be implemented in HW. If the HW filter does not exist
or is not good enough, some arch may also implement a SW filter.

The following generic filters are currently defined:
- PERF_SAMPLE_USER
  only branches whose targets are at the user level

- PERF_SAMPLE_KERNEL
  only branches whose targets are at the kernel level

- PERF_SAMPLE_HV
  only branches whose targets are at the hypervisor level

- PERF_SAMPLE_ANY
  any type of branches (subject to priv levels filters)

- PERF_SAMPLE_ANY_CALL
  any call branches (may incl. syscall on some arch)

- PERF_SAMPLE_ANY_RET
  any return branches (may incl. syscall returns on some arch)

- PERF_SAMPLE_IND_CALL
  indirect call branches

Obviously filter may be combined. The priv level bits are optional.
If not provided, the priv level of the associated event are used. It
is possible to collect branches at a priv level different from the
associated event. Use of kernel, hv priv levels is subject to permissions
and availability (hv).

The number of taken branch records present in each sample may vary based
on HW, the type of sampled branches, the executed code. Therefore
each sample contains the number of taken branches it contains.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328826068-11713-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoMerge branch 'perf/urgent' into perf/core
Ingo Molnar [Mon, 5 Mar 2012 08:20:08 +0000 (09:20 +0100)]
Merge branch 'perf/urgent' into perf/core

Conflicts:
tools/perf/builtin-record.c
tools/perf/builtin-top.c
tools/perf/perf.h
tools/perf/util/top.h

Merge reason: resolve these cherry-picking conflicts.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoMerge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Mon, 5 Mar 2012 08:05:44 +0000 (09:05 +0100)]
Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Cherry picked fixes from perf/core, together with the kernel fix (1018faa),
the sampling tools (top, record) are back working on AMD systems.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoperf tools: Handle kernels that don't support attr.exclude_{guest,host}
Arnaldo Carvalho de Melo [Tue, 14 Feb 2012 16:05:30 +0000 (14:05 -0200)]
perf tools: Handle kernels that don't support attr.exclude_{guest,host}

Just fall back to resetting those fields, if set, warning the user that
that feature is not available.

If guest samples appear they will just be discarded because no struct
machine will be found and thus the event will be accounted as not
handled and dropped, see 0c09571.

Reported-by: Namhyung Kim <namhyung@gmail.com>
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-vuwxig36mzprl5n7nzvnxxsh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf tools: Change perf_guest default back to false
Joerg Roedel [Fri, 10 Feb 2012 17:05:05 +0000 (18:05 +0100)]
perf tools: Change perf_guest default back to false

Setting perf_guest to true by default makes no sense because the perf
subcommands can not setup guest symbol information and thus not process
and guest samples. The only exception is perf-kvm which changes the
perf_guest value on its own.  So change the default for perf_guest back
to false.

Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328893505-4115-3-git-send-email-joerg.roedel@amd.com
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf record: No build id option fails
David Ahern [Mon, 6 Feb 2012 22:27:52 +0000 (15:27 -0700)]
perf record: No build id option fails

A recent refactoring of perf-record introduced the following:

perf record -a -B
Couldn't generating buildids. Use --no-buildid to profile anyway.
sleep: Terminated

I believe the triple negative was meant to be only a double negative.
:-) While I'm there, fixed the grammar on the error message.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1328567272-13190-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf/x86/kvm: Fix Host-Only/Guest-Only counting with SVM disabled
Joerg Roedel [Wed, 29 Feb 2012 13:57:32 +0000 (14:57 +0100)]
perf/x86/kvm: Fix Host-Only/Guest-Only counting with SVM disabled

It turned out that a performance counter on AMD does not
count at all when the GO or HO bit is set in the control
register and SVM is disabled in EFER.

This patch works around this issue by masking out the HO bit
in the performance counter control register when SVM is not
enabled.

The GO bit is not touched because it is only set when the
user wants to count in guest-mode only. So when SVM is
disabled the counter should not run at all and the
not-counting is the intended behaviour.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Avi Kivity <avi@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: stable@vger.kernel.org # v3.2
Link: http://lkml.kernel.org/r/1330523852-19566-1-git-send-email-joerg.roedel@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoMerge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Fri, 2 Mar 2012 07:40:45 +0000 (08:40 +0100)]
Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Various smaller perf/urgent fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoperf probe: Ensure offset provided is not greater than function length without DWARF...
Prashanth Nageshappa [Tue, 28 Feb 2012 04:13:01 +0000 (09:43 +0530)]
perf probe: Ensure offset provided is not greater than function length without DWARF info too

The 'perf probe' command allows kprobe to be inserted at any offset from
a function start, which results in adding kprobes to unintended
location.  (example: perf probe do_fork+10000 is allowed even though
size of do_fork is ~904).

My previous patch https://lkml.org/lkml/2012/2/24/42 addressed the case
where DWARF info was available for the kernel. This patch fixes the
case where perf probe is used on a kernel without debuginfo available.

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/4F4C544D.1010909@linux.vnet.ibm.com
Signed-off-by: Prashanth Nageshappa <prashanth@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf tools: Ensure comm string is properly terminated
David Ahern [Fri, 24 Feb 2012 19:31:38 +0000 (12:31 -0700)]
perf tools: Ensure comm string is properly terminated

If threads in a multi-threaded process have names shorter than the main
thread the comm for the named threads is not properly terminated.

E.g., for the process 'namedthreads' where each thread is named noploop%d
where %d is the thread number:

Before:
    perf script -f comm,tid,ip,sym,dso
    noploop:4ads 21616  400a49 noploop (/tmp/namedthreads)
The 'ads' in the thread comm bleeds over from the process name.

After:
    perf script -f comm,tid,ip,sym,dso
       noploop:4 21616  400a49 noploop (/tmp/namedthreads)

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1330111898-68071-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf probe: Ensure offset provided is not greater than function length
Prashanth Nageshappa [Fri, 24 Feb 2012 07:41:39 +0000 (13:11 +0530)]
perf probe: Ensure offset provided is not greater than function length

The perf probe command allows kprobe to be inserted at any offset from a
function start, which results in adding kprobes to unintended location.

Example: perf probe do_fork+10000 is allowed even though size of do_fork
is ~904.

This patch will ensure probe addition fails when the offset specified is
greater than size of the function.

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Link: http://lkml.kernel.org/r/4F473F33.4060409@linux.vnet.ibm.com
Signed-off-by: Prashanth Nageshappa <prashanth@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf evlist: Return first evsel for non-sample event on old kernel
Namhyung Kim [Mon, 20 Feb 2012 01:47:26 +0000 (10:47 +0900)]
perf evlist: Return first evsel for non-sample event on old kernel

On old kernels that don't support sample_id_all feature,
perf_evlist__id2evsel() returns NULL for non-sampling events.

This breaks perf top when multiple events are given on command line. Fix
it by using first evsel in the evlist. This will also prevent getting
the same (potential) problem in such new tool/ old kernel combo.

Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1329702447-25045-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agostatic keys: Inline the static_key_enabled() function
Jason Baron [Tue, 28 Feb 2012 18:49:01 +0000 (13:49 -0500)]
static keys: Inline the static_key_enabled() function

In the jump label enabled case, calling static_key_enabled()
results in a function call. The function returns the results of
a compare, so it really doesn't need the overhead of a full
function call. Let's make it 'static inline' for both the jump
label enabled and disabled cases.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: a.p.zijlstra@chello.nl
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/201202281849.q1SIn1p2023270@int-mx10.intmail.prod.int.phx2.redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoMerge branch 'perf/jump-labels' into perf/core
Ingo Molnar [Tue, 28 Feb 2012 18:59:45 +0000 (19:59 +0100)]
Merge branch 'perf/jump-labels' into perf/core

Merge reason: After much naming discussion, there seems to be consensus
              now - queue it up for v3.4.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoperf/hwbp: Fix a possible memory leak
Namhyung Kim [Tue, 28 Feb 2012 01:19:38 +0000 (10:19 +0900)]
perf/hwbp: Fix a possible memory leak

If kzalloc() for TYPE_DATA failed on a given cpu, previous chunk
of TYPE_INST will be leaked. Fix it.

Thanks to Peter Zijlstra for suggesting this better solution. It
should work as long as the initial value of the region is all
0's and that's the case of static (per-cpu) memory allocation.

Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/1330391978-28070-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/aia21/ntfs
Linus Torvalds [Mon, 27 Feb 2012 15:59:33 +0000 (07:59 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/aia21/ntfs

Here are some trivial NTFS changes (a spelling fix and two use before
NULL check cases found by Coverity as well as an update in MAINTAINERS
for the path to the ntfs git repo) together with a simple LDM fix for
parsing fragmented VBLKs.

* git://git.kernel.org/pub/scm/linux/kernel/git/aia21/ntfs:
  NTFS: Update git repo path in MAINTAINERS file.
  LDM: Fix reassembly of extended VBLKs.
  NTFS: Correct two spelling errors "dealocate" to "deallocate" in mft.c.
  NTFS: Do not dereference pointer before checking for NULL.
  NTFS: Remove unused variable.

12 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 27 Feb 2012 15:55:51 +0000 (07:55 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce/AMD: Fix UP build error
  x86: Specify a size for the cmp in the NMI handler
  x86/nmi: Test saved %cs in NMI to determine nested NMI case
  x86/amd: Fix L1i and L2 cache sharing information for AMD family 15h processors
  x86/microcode: Remove noisy AMD microcode warning

12 years agoMerge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 27 Feb 2012 15:55:39 +0000 (07:55 -0800)]
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/events: Revert trace_sched_stat_sleeptime()

12 years agoMerge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 27 Feb 2012 15:54:57 +0000 (07:54 -0800)]
Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Handle pending irqs in irq_startup()
  genirq: Unmask oneshot irqs when thread was not woken

12 years agocompat: fix compile breakage on s390
Heiko Carstens [Mon, 27 Feb 2012 09:01:52 +0000 (10:01 +0100)]
compat: fix compile breakage on s390

The new is_compat_task() define for the !COMPAT case in
include/linux/compat.h conflicts with a similar define in
arch/s390/include/asm/compat.h.

This is the minimal patch which fixes the build issues.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoNTFS: Update git repo path in MAINTAINERS file.
Anton Altaparmakov [Mon, 27 Feb 2012 09:08:33 +0000 (09:08 +0000)]
NTFS: Update git repo path in MAINTAINERS file.

Signed-off-by: Anton Altaparmakov <anton@tuxera.com>
12 years agoMerge branch 'master' of /Volumes/CaseSensitiveDisk/linux
Anton Altaparmakov [Mon, 27 Feb 2012 09:01:22 +0000 (09:01 +0000)]
Merge branch 'master' of /Volumes/CaseSensitiveDisk/linux

12 years agoMerge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Ingo Molnar [Mon, 27 Feb 2012 07:44:48 +0000 (08:44 +0100)]
Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core

12 years agoMerge tag 'stable/for-linus-fixes-3.3-rc5' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Mon, 27 Feb 2012 05:03:16 +0000 (21:03 -0800)]
Merge tag 'stable/for-linus-fixes-3.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

Two fixes to fix a memory corruption bug when WC pages never get
converted back to WB but end up being recycled in the general memory
pool as WC.

There is a better way of fixing this, but there is not enough time to do
the full benchmarking to pick one of the right options - so picking the
one that favors stability for right now.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* tag 'stable/for-linus-fixes-3.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/pat: Disable PAT support for now.
  xen/setup: Remove redundant filtering of PTE masks.

12 years agoMerge tag 'for-linus' of git://github.com/rustyrussell/linux
Linus Torvalds [Mon, 27 Feb 2012 05:02:07 +0000 (21:02 -0800)]
Merge tag 'for-linus' of git://github.com/rustyrussell/linux

* tag 'for-linus' of git://github.com/rustyrussell/linux:
  mod/file2alias: make modpost compile on darwin again

12 years agoautofs4 - update MAINTAINERS mailing list entry
Ian Kent [Mon, 27 Feb 2012 00:03:38 +0000 (08:03 +0800)]
autofs4 - update MAINTAINERS mailing list entry

The autofs mailing list has moved to vger.kernel.org.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agomod/file2alias: make modpost compile on darwin again
Andreas BieĂźmann [Fri, 24 Feb 2012 07:23:53 +0000 (08:23 +0100)]
mod/file2alias: make modpost compile on darwin again

commit e49ce14150c64b29a8dd211df785576fa19a9858 breaks cross compiling
the linux kernel on darwin hosts.
This fix introduce some minimal glue to adopt linker section handling
for darwin hosts.

Signed-off-by: Andreas BieĂźmann <andreas@biessmann.de>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: Jochen Friedrich <jochen@scram.de>
CC: Samuel Ortiz <sameo@linux.intel.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Bernhard Walle <bernhard@bwalle.de>
12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Sun, 26 Feb 2012 20:47:17 +0000 (12:47 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

1) ICMP sockets leave err uninitialized but we try to return it for the
   unsupported MSG_OOB case, reported by Dave Jones.

2) Add new Zaurus device ID entries, from Dave Jones.

3) Pointer calculation in hso driver memset is wrong, from Dan
   Carpenter.

4) ks8851_probe() checks unsigned value as negative, fix also from Dan
   Carpenter.

5) Fix crashes in atl1c driver due to TX queue handling, from Eric
   Dumazet.  I anticipate some TX side locking fixes coming in the near
   future for this driver as well.

6) The inline directive fix in Bluetooth which was breaking the build
   only with very new versions of GCC, from Johan Hedberg.

7) Fix crashes in the ATP CLIP code due to ARP cleanups this merge
   window, reported by Meelis Roos and fixed by Eric Dumazet.

8) JME driver doesn't flush RX FIFO correctly, from Guo-Fu Tseng.

9) Some ip6_route_output() callers test the return value for NULL, but
   this never happens as the convention is to return a dst entry with
   dst->error set.  Fixes from RonQing Li.

10) Logitech Harmony 900 should be handled by zaurus driver not
   cdc_ether, update white lists and black lists accordingly.  From
   Scott Talbert.

11) Receiving from certain kinds of devices there won't be a MAC header,
   so there is no MAC header to fixup in the IPSEC code, and if we try
   to do it we'll crash.  Fix from Eric Dumazet.

12) Port type array indexing off-by-one in mlx4 driver, fix from Yevgeny
   Petrilin.

13) Fix regression in link-down handling in davinci_emac which causes
   all RX descriptors to be freed up and therefore RX to wedge
   completely, from Christian Riesch.

14) It took two attempts, but ctnetlink soft lockups seem to be
   cured now, from Pablo Neira Ayuso.

15) Endianness bug fix in ENIC driver, from Santosh Nayak.

16) The long ago conversion of the PPP fragmentation code over to
   abstracted SKB list handling wasn't perfect, once we get an
   out of sequence SKB we don't flush the rest of them like we
   should.  From Ben McKeegan.

17) Fix regression of ->ip_summed initialization in sfc driver.
   From Ben Hutchings.

18) Bluetooth timeout mistakenly using msecs instead of jiffies,
   from Andrzej Kaczmarek.

19) Using _sync variant of work cancellation results in deadlocks,
   use the non _sync variants instead.  From Andre Guedes.

20) Bluetooth rfcomm code had reference counting problems leading
   to crashes, fix from Octavian Purdila.

21) The conversion of netem over to classful qdisc handling added
   two bugs to netem_dequeue(), fixes from Eric Dumazet.

22) Missing pci_iounmap() in ATM Solos driver.  Fix from Julia Lawall.

23) b44_pci_exit() should not have __exit tag since it's invoked from
   non-__exit code.  From Nikola Pajkovsky.

24) The conversion of the neighbour hash tables over to RCU added a
   race, fixed here by adding the necessary reread of tbl->nht, fix
   from Michel Machado.

25) When we added VF (virtual function) attributes for network device
   dumps, this potentially bloats up the size of the dump of one
   network device such that the dump size is too large for the buffer
   allocated by properly written netlink applications.

   In particular, if you add 255 VFs to a network device, parts of
   GLIBC stop working.

   To fix this, we add an attribute that is used to turn on these
   extended portions of the network device dump.  Sophisticaed
   applications like 'ip' that want to see this stuff  will be changed
   to set the attribute, whereas things like GLIBC that don't care
   about VFs simply will not, and therefore won't be busted by the
   mere presence of VFs on a network device.

   Thanks to the tireless work of Greg Rose on this fix.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (53 commits)
  sfc: Fix assignment of ip_summed for pre-allocated skbs
  ppp: fix 'ppp_mp_reconstruct bad seq' errors
  enic: Fix endianness bug.
  gre: fix spelling in comments
  netfilter: ctnetlink: fix soft lockup when netlink adds new entries (v2)
  Revert "netfilter: ctnetlink: fix soft lockup when netlink adds new entries"
  davinci_emac: Do not free all rx dma descriptors during init
  mlx4_core: Fixing array indexes when setting port types
  phy: IC+101G and PHY_HAS_INTERRUPT flag
  netdev/phy/icplus: Correct broken phy_init code
  ipsec: be careful of non existing mac headers
  Move Logitech Harmony 900 from cdc_ether to zaurus
  hso: memsetting wrong data in hso_get_count()
  netfilter: ip6_route_output() never returns NULL.
  ethernet/broadcom: ip6_route_output() never returns NULL.
  ipv6: ip6_route_output() never returns NULL.
  jme: Fix FIFO flush issue
  atm: clip: remove clip_tbl
  ipv4: ping: Fix recvmsg MSG_OOB error handling.
  rtnetlink: Fix problem with buffer allocation
  ...

12 years agoFix autofs compile without CONFIG_COMPAT
Linus Torvalds [Sun, 26 Feb 2012 17:44:55 +0000 (09:44 -0800)]
Fix autofs compile without CONFIG_COMPAT

The autofs compat handling fix caused a compile failure when
CONFIG_COMPAT isn't defined.

Instead of adding random #ifdef'fery in autofs, let's just make the
compat helpers earlier to use: without CONFIG_COMPAT, is_compat_task()
just hardcodes to zero.

We could probably do something similar for a number of other cases where
we have #ifdef's in code, but this is the low-hanging fruit.

Reported-and-tested-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoLinux 3.3-rc5 v3.3-rc5
Linus Torvalds [Sat, 25 Feb 2012 20:18:16 +0000 (12:18 -0800)]
Linux 3.3-rc5

12 years agoMerge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck...
Linus Torvalds [Sat, 25 Feb 2012 20:12:08 +0000 (12:12 -0800)]
Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Couple of minor driver fixes.

* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (max34440) Fix resetting temperature history
  hwmon: (f75375s) Fix register write order when setting fans to full speed
  hwmon: (ads1015) Fix file leak in probe function
  hwmon: (max6639) Fix PPR register initialization to set both channels
  hwmon: (max6639) Fix FAN_FROM_REG calculation

12 years agoMerge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Linus Torvalds [Sat, 25 Feb 2012 20:11:25 +0000 (12:11 -0800)]
Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild

three kbuild fixes for 3.3:
 - make deb-pkg symlink race fix.
 - make coccicheck fix.
 - Dropping the check for modutils.  This is not a regression, but
   allows the module-init-tools replacement kmod work with the 3.3
   kernel.

* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  coccicheck: change handling of C={1,2} when M= is set
  builddeb: Don't create files in /tmp with predictable names
  kbuild: do not check for ancient modutils tools

12 years agoautofs: work around unhappy compat problem on x86-64
Ian Kent [Wed, 22 Feb 2012 12:45:44 +0000 (20:45 +0800)]
autofs: work around unhappy compat problem on x86-64

When the autofs protocol version 5 packet type was added in commit
5c0a32fc2cd0 ("autofs4: add new packet type for v5 communications"), it
obvously tried quite hard to be word-size agnostic, and uses explicitly
sized fields that are all correctly aligned.

However, with the final "char name[NAME_MAX+1]" array at the end, the
actual size of the structure ends up being not very well defined:
because the struct isn't marked 'packed', doing a "sizeof()" on it will
align the size of the struct up to the biggest alignment of the members
it has.

And despite all the members being the same, the alignment of them is
different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte
alignment on x86-64.  And while 'NAME_MAX+1' ends up being a nice round
number (256), the name[] array starts out a 4-byte aligned.

End result: the "packed" size of the structure is 300 bytes: 4-byte, but
not 8-byte aligned.

As a result, despite all the fields being in the same place on all
architectures, sizeof() will round up that size to 304 bytes on
architectures that have 8-byte alignment for u64.

Note that this is *not* a problem for 32-bit compat mode on POWER, since
there __u64 is 8-byte aligned even in 32-bit mode.  But on x86, 32-bit
and 64-bit alignment is different for 64-bit entities, and as a result
the structure that has exactly the same layout has different sizes.

So on x86-64, but no other architecture, we will just subtract 4 from
the size of the structure when running in a compat task.  That way we
will write the properly sized packet that user mode expects.

Not pretty.  Sadly, this very subtle, and unnecessary, size difference
has been encoded in user space that wants to read packets of *exactly*
the right size, and will refuse to touch anything else.

Reported-and-tested-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland...
Linus Torvalds [Sat, 25 Feb 2012 04:03:14 +0000 (20:03 -0800)]
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

One InfiniBand/RDMA regression fix for 3.3:

 - mlx4 SR-IOV changes added static exported functions, which doesn't
   build on powerpc at least.  Fix from Doug Ledford for this.

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  mlx4_core: Exported functions can't be static

12 years agoMerge branch 'sfc-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc
David S. Miller [Sat, 25 Feb 2012 03:12:44 +0000 (22:12 -0500)]
Merge branch 'sfc-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc

12 years agosfc: Fix assignment of ip_summed for pre-allocated skbs
Ben Hutchings [Sat, 25 Feb 2012 00:03:10 +0000 (00:03 +0000)]
sfc: Fix assignment of ip_summed for pre-allocated skbs

When pre-allocating skbs for received packets, we set ip_summed =
CHECKSUM_UNNCESSARY.  We used to change it back to CHECKSUM_NONE when
the received packet had an incorrect checksum or unhandled protocol.

Commit bc8acf2c8c3e43fcc192762a9f964b3e9a17748b ('drivers/net: avoid
some skb->ip_summed initializations') mistakenly replaced the latter
assignment with a DEBUG-only assertion that ip_summed ==
CHECKSUM_NONE.  This assertion is always false, but it seems no-one
has exercised this code path in a DEBUG build.

Fix this by moving our assignment of CHECKSUM_UNNECESSARY into
efx_rx_packet_gro().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
12 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi...
Linus Torvalds [Sat, 25 Feb 2012 00:08:51 +0000 (16:08 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6

SCSI fixes on 20120224:
 "This is a set of assorted bug fixes for power management, mpt2sas,
  ipr, the rdac device handler and quite a big chunk for qla2xxx (plus a
  use after free of scsi_host in scsi_scan.c). "

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
  [SCSI] scsi_dh_rdac: Fix for unbalanced reference count
  [SCSI] scsi_pm: Fix bug in the SCSI power management handler
  [SCSI] scsi_scan: Fix 'Poison overwritten' warning caused by using freed 'shost'
  [SCSI] qla2xxx: Update version number to 8.03.07.13-k.
  [SCSI] qla2xxx: Proper detection of firmware abort error code for ISP82xx.
  [SCSI] qla2xxx: Remove resetting memory during device initialization for ISP82xx.
  [SCSI] qla2xxx: Complete mailbox command timedout to avoid initialization failures during next reset cycle.
  [SCSI] qla2xxx: Remove check for null fcport from host reset handler.
  [SCSI] qla2xxx: Correct out of bounds read of ISP2200 mailbox registers.
  [SCSI] qla2xxx: Remove errant clearing of MBX_INTERRUPT flag during CT-IOCB processing.
  [SCSI] qla2xxx: Clear options-flags while issuing stop-firmware mbx command.
  [SCSI] qla2xxx: Add an "is reset active" helper.
  [SCSI] qla2xxx: Add check for null fcport references in qla2xxx_queuecommand.
  [SCSI] qla2xxx: Propagate up abort failures.
  [SCSI] isci: Fix NULL ptr dereference when no firmware is being loaded
  [SCSI] ipr: fix eeh recovery for 64-bit adapters
  [SCSI] mpt2sas: Fix mismatch in mpt2sas_base_hard_reset_handler() mutex lock-unlock

12 years agoppp: fix 'ppp_mp_reconstruct bad seq' errors
Ben McKeegan [Fri, 24 Feb 2012 06:33:56 +0000 (06:33 +0000)]
ppp: fix 'ppp_mp_reconstruct bad seq' errors

This patch fixes a (mostly cosmetic) bug introduced by the patch
'ppp: Use SKB queue abstraction interfaces in fragment processing'
found here: http://www.spinics.net/lists/netdev/msg153312.html

The above patch rewrote and moved the code responsible for cleaning
up discarded fragments but the new code does not catch every case
where this is necessary.  This results in some discarded fragments
remaining in the queue, and triggering a 'bad seq' error on the
subsequent call to ppp_mp_reconstruct.  Fragments are discarded
whenever other fragments of the same frame have been lost.
This can generate a lot of unwanted and misleading log messages.

This patch also adds additional detail to the debug logging to
make it clearer which fragments were lost and which other fragments
were discarded as a result of losses. (Run pppd with 'kdebug 1'
option to enable debug logging.)

Signed-off-by: Ben McKeegan <ben@netservers.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoenic: Fix endianness bug.
Santosh Nayak [Fri, 24 Feb 2012 06:56:39 +0000 (06:56 +0000)]
enic: Fix endianness bug.

Sparse complaints the endian bug.

Signed-off-by: Santosh Nayak <santoshprasadnayak@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agococcicheck: change handling of C={1,2} when M= is set
Greg Dietsche [Fri, 20 Jan 2012 23:10:35 +0000 (17:10 -0600)]
coccicheck: change handling of C={1,2} when M= is set

This patch reverts a portion of d0bc1fb4 so that coccicheck will
work properly when C=1 or C=2.

Reported-and-tested-by: Brice Goglin <Brice.Goglin@inria.fr>
Signed-off-by: Greg Dietsche <Gregory.Dietsche@cuw.edu>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Michal Marek <mmarek@suse.cz>
12 years agoMerge branch 'master' of git://1984.lsi.us.es/net
David S. Miller [Fri, 24 Feb 2012 22:41:57 +0000 (17:41 -0500)]
Merge branch 'master' of git://1984.lsi.us.es/net

12 years agogre: fix spelling in comments
stephen hemminger [Fri, 24 Feb 2012 08:08:20 +0000 (08:08 +0000)]
gre: fix spelling in comments

The original spelling and bad word choice makes these comments hard to read.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Fri, 24 Feb 2012 20:32:51 +0000 (12:32 -0800)]
Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] hdpvr: update picture controls to support firmware versions > 0.15
  [media] wl128x: fix build errors when GPIOLIB is not enabled
  [media] hdpvr: fix race conditon during start of streaming
  [media] omap3isp: Fix crash caused by subdevs now having a pointer to devnodes
  [media] imon: don't wedge hardware after early callbacks

12 years agoepoll: ep_unregister_pollwait() can use the freed pwq->whead
Oleg Nesterov [Fri, 24 Feb 2012 19:07:29 +0000 (20:07 +0100)]
epoll: ep_unregister_pollwait() can use the freed pwq->whead

signalfd_cleanup() ensures that ->signalfd_wqh is not used, but
this is not enough. eppoll_entry->whead still points to the memory
we are going to free, ep_unregister_pollwait()->remove_wait_queue()
is obviously unsafe.

Change ep_poll_callback(POLLFREE) to set eppoll_entry->whead = NULL,
change ep_unregister_pollwait() to check pwq->whead != NULL under
rcu_read_lock() before remove_wait_queue(). We add the new helper,
ep_remove_wait_queue(), for this.

This works because sighand_cachep is SLAB_DESTROY_BY_RCU and because
->signalfd_wqh is initialized in sighand_ctor(), not in copy_sighand.
ep_unregister_pollwait()->remove_wait_queue() can play with already
freed and potentially reused ->sighand, but this is fine. This memory
must have the valid ->signalfd_wqh until rcu_read_unlock().

Reported-by: Maxime Bizon <mbizon@freebox.fr>
Cc: <stable@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoepoll: introduce POLLFREE to flush ->signalfd_wqh before kfree()
Oleg Nesterov [Fri, 24 Feb 2012 19:07:11 +0000 (20:07 +0100)]
epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree()

This patch is intentionally incomplete to simplify the review.
It ignores ep_unregister_pollwait() which plays with the same wqh.
See the next change.

epoll assumes that the EPOLL_CTL_ADD'ed file controls everything
f_op->poll() needs. In particular it assumes that the wait queue
can't go away until eventpoll_release(). This is not true in case
of signalfd, the task which does EPOLL_CTL_ADD uses its ->sighand
which is not connected to the file.

This patch adds the special event, POLLFREE, currently only for
epoll. It expects that init_poll_funcptr()'ed hook should do the
necessary cleanup. Perhaps it should be defined as EPOLLFREE in
eventpoll.

__cleanup_sighand() is changed to do wake_up_poll(POLLFREE) if
->signalfd_wqh is not empty, we add the new signalfd_cleanup()
helper.

ep_poll_callback(POLLFREE) simply does list_del_init(task_list).
This make this poll entry inconsistent, but we don't care. If you
share epoll fd which contains our sigfd with another process you
should blame yourself. signalfd is "really special". I simply do
not know how we can define the "right" semantics if it used with
epoll.

The main problem is, epoll calls signalfd_poll() once to establish
the connection with the wait queue, after that signalfd_poll(NULL)
returns the different/inconsistent results depending on who does
EPOLL_CTL_MOD/signalfd_read/etc. IOW: apart from sigmask, signalfd
has nothing to do with the file, it works with the current thread.

In short: this patch is the hack which tries to fix the symptoms.
It also assumes that nobody can take tasklist_lock under epoll
locks, this seems to be true.

Note:

- we do not have wake_up_all_poll() but wake_up_poll()
  is fine, poll/epoll doesn't use WQ_FLAG_EXCLUSIVE.

- signalfd_cleanup() uses POLLHUP along with POLLFREE,
  we need a couple of simple changes in eventpoll.c to
  make sure it can't be "lost".

Reported-by: Maxime Bizon <mbizon@freebox.fr>
Cc: <stable@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux...
Linus Torvalds [Fri, 24 Feb 2012 17:02:53 +0000 (09:02 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

Quoth Chris:
 "This is later than I wanted because I got backed up running through
  btrfs bugs from the Oracle QA teams.  But they are all bug fixes that
  we've queued and tested since rc1.

  Nothing in particular stands out, this just reflects bug fixing and QA
  done in parallel by all the btrfs developers.  The most user visible
  of these is:

    Btrfs: clear the extent uptodate bits during parent transid failures

  Because that helps deal with out of date drives (say an iscsi disk
  that has gone away and come back).  The old code wasn't always
  properly retrying the other mirror for this type of failure."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (24 commits)
  Btrfs: fix compiler warnings on 32 bit systems
  Btrfs: increase the global block reserve estimates
  Btrfs: clear the extent uptodate bits during parent transid failures
  Btrfs: add extra sanity checks on the path names in btrfs_mksubvol
  Btrfs: make sure we update latest_bdev
  Btrfs: improve error handling for btrfs_insert_dir_item callers
  Btrfs: be less strict on finding next node in clear_extent_bit
  Btrfs: fix a bug on overcommit stuff
  Btrfs: kick out redundant stuff in convert_extent_bit
  Btrfs: skip states when they does not contain bits to clear
  Btrfs: check return value of lookup_extent_mapping() correctly
  Btrfs: fix deadlock on page lock when doing auto-defragment
  Btrfs: fix return value check of extent_io_ops
  btrfs: honor umask when creating subvol root
  btrfs: silence warning in raid array setup
  btrfs: fix structs where bitfields and spinlock/atomic share 8B word
  btrfs: delalloc for page dirtied out-of-band in fixup worker
  Btrfs: fix memory leak in load_free_space_cache()
  btrfs: don't check DUP chunks twice
  Btrfs: fix trim 0 bytes after a device delete
  ...

12 years agoMerge tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming
Linus Torvalds [Fri, 24 Feb 2012 17:01:46 +0000 (09:01 -0800)]
Merge tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming

This is the arch/c6x part of commit 7c43185138cf ("Kbuild: Use dtc's -d
(dependency) option") which was dropped because c6x had not yet been
merged at the time.

* tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming:
  Kbuild: Use dtc's -d (dependency) option

12 years agoMAINTAINERS: drop me from PA-RISC maintenance
Kyle McMartin [Fri, 24 Feb 2012 15:36:16 +0000 (10:36 -0500)]
MAINTAINERS: drop me from PA-RISC maintenance

I don't even live in the same country as any of my PA-RISC hardware
these days, so the odds of me touching the code are pretty low.
(Also re-order things to ensure jejb gets CC'd since he's been the
primary maintainer for the last few years.)

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoNOMMU: Don't need to clear vm_mm when deleting a VMA
David Howells [Thu, 23 Feb 2012 13:51:00 +0000 (13:51 +0000)]
NOMMU: Don't need to clear vm_mm when deleting a VMA

Don't clear vm_mm in a deleted VMA as it's unnecessary and might
conceivably break the filesystem or driver VMA close routine.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoNOMMU: Lock i_mmap_mutex for access to the VMA prio list
David Howells [Thu, 23 Feb 2012 13:50:35 +0000 (13:50 +0000)]
NOMMU: Lock i_mmap_mutex for access to the VMA prio list

Lock i_mmap_mutex for access to the VMA prio list to prevent concurrent
access.  Currently, certain parts of the mmap handling are protected by
the region mutex, but not all.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge tag 'rmobile-for-linus' of git://github.com/pmundt/linux-sh
Linus Torvalds [Fri, 24 Feb 2012 16:57:22 +0000 (08:57 -0800)]
Merge tag 'rmobile-for-linus' of git://github.com/pmundt/linux-sh

SH/R-Mobile fixes for 3.3-rc5

* tag 'rmobile-for-linus' of git://github.com/pmundt/linux-sh:
  arch/arm/mach-shmobile/board-ag5evm.c: included linux/dma-mapping.h twice
  ARM: mach-shmobile: r8a7779 PFC IPSR4 fix
  ARM: mach-shmobile: sh73a0 PSTR 32-bit access fix
  ARM: mach-shmobile: add GPIO-to-IRQ translation to sh7372
  ARM: mach-shmobile: clock-sh73a0: add DSIxPHY clock support
  arm: fix compile failure in mach-shmobile/board-ag5evm.c
  ARM: mach-shmobile: mackerel: add ak4642 amixer settings on comment
  ARM: mach-shmobile: mackerel: use renesas_usbhs instead of r8a66597_hcd
  ARM: mach-shmobile: simplify MMCIF DMA configuration
  ARM: mach-shmobile: IRQ driven GPIO key support for Kota2
  ARM: mach-shmobile: sh73a0 IRQ sparse alloc fix
  ARM: mach-shmobile: sh73a0 PINT IRQ base fix

12 years agoMerge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh
Linus Torvalds [Fri, 24 Feb 2012 16:56:51 +0000 (08:56 -0800)]
Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh

SuperH fixes for 3.3-rc5

* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh:
  sh: Fix sh2a build error for CONFIG_CACHE_WRITETHROUGH
  sh: modify a resource of sh_eth_giga1_resources in board-sh7757lcr
  arch/sh: remove references to cpu_*_map.
  sh: Fix typo in pci-sh7780.c
  sh: add platform_device for SPI1 in setup-sh7757
  sh: modify resource for SPI0 in setup-sh7757
  sh: se7724: fix compile breakage
  sh: clkfwk: bugfix: use clk_reparent() for div6 clocks
  sh: clock-sh7724: fixup sh_fsi clock settings
  sh: sh7757lcr: update to the new MMCIF DMA configuration
  sh: fix the sh_mmcif_plat_data in board-sh7757lcr
  video: pvr2fb: Fix up spurious section mismatch warnings.
  sh: Defer to asm-generic/device.h.

12 years agomm: memcg: Correct unregistring of events attached to the same eventfd
Anton Vorontsov [Fri, 24 Feb 2012 01:14:46 +0000 (05:14 +0400)]
mm: memcg: Correct unregistring of events attached to the same eventfd

There is an issue when memcg unregisters events that were attached to
the same eventfd:

- On the first call mem_cgroup_usage_unregister_event() removes all
  events attached to a given eventfd, and if there were no events left,
  thresholds->primary would become NULL;

- Since there were several events registered, cgroups core will call
  mem_cgroup_usage_unregister_event() again, but now kernel will oops,
  as the function doesn't expect that threshold->primary may be NULL.

That's a good question whether mem_cgroup_usage_unregister_event()
should actually remove all events in one go, but nowadays it can't
do any better as cftype->unregister_event callback doesn't pass
any private event-associated cookie. So, let's fix the issue by
simply checking for threshold->primary.

FWIW, w/o the patch the following oops may be observed:

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
 IP: [<ffffffff810be32c>] mem_cgroup_usage_unregister_event+0x9c/0x1f0
 Pid: 574, comm: kworker/0:2 Not tainted 3.3.0-rc4+ #9 Bochs Bochs
 RIP: 0010:[<ffffffff810be32c>]  [<ffffffff810be32c>] mem_cgroup_usage_unregister_event+0x9c/0x1f0
 RSP: 0018:ffff88001d0b9d60  EFLAGS: 00010246
 Process kworker/0:2 (pid: 574, threadinfo ffff88001d0b8000, task ffff88001de91cc0)
 Call Trace:
  [<ffffffff8107092b>] cgroup_event_remove+0x2b/0x60
  [<ffffffff8103db94>] process_one_work+0x174/0x450
  [<ffffffff8103e413>] worker_thread+0x123/0x2d0

Cc: stable <stable@vger.kernel.org>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agohwmon: (max34440) Fix resetting temperature history
Guenter Roeck [Fri, 24 Feb 2012 11:44:34 +0000 (03:44 -0800)]
hwmon: (max34440) Fix resetting temperature history

Temperature history is reset by writing 0x8000 into the peak temperature
register, not 0xffff.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Jean Delvare <khali@linux-fr.org>
12 years agoBtrfs: fix compiler warnings on 32 bit systems
Chris Mason [Fri, 24 Feb 2012 15:39:05 +0000 (10:39 -0500)]
Btrfs: fix compiler warnings on 32 bit systems

The enospc tracing code added some interesting uses of
u64 pointer casts.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
12 years agonetfilter: ctnetlink: fix soft lockup when netlink adds new entries (v2)
Jozsef Kadlecsik [Fri, 24 Feb 2012 10:45:49 +0000 (11:45 +0100)]
netfilter: ctnetlink: fix soft lockup when netlink adds new entries (v2)

Marcell Zambo and Janos Farago noticed and reported that when
new conntrack entries are added via netlink and the conntrack table
gets full, soft lockup happens. This is because the nf_conntrack_lock
is held while nf_conntrack_alloc is called, which is in turn wants
to lock nf_conntrack_lock while evicting entries from the full table.

The patch fixes the soft lockup with limiting the holding of the
nf_conntrack_lock to the minimum, where it's absolutely required.
It required to extend (and thus change) nf_conntrack_hash_insert
so that it makes sure conntrack and ctnetlink do not add the same entry
twice to the conntrack table.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agoRevert "netfilter: ctnetlink: fix soft lockup when netlink adds new entries"
Pablo Neira Ayuso [Fri, 24 Feb 2012 11:18:38 +0000 (12:18 +0100)]
Revert "netfilter: ctnetlink: fix soft lockup when netlink adds new entries"

This reverts commit af14cca162ddcdea017b648c21b9b091e4bf1fa4.

This patch contains a race condition between packets and ctnetlink
in the conntrack addition. A new patch to fix this issue follows up.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agoMerge branch 'master' of /Volumes/CaseSensitiveDisk/linux
Anton Altaparmakov [Fri, 24 Feb 2012 09:39:16 +0000 (09:39 +0000)]
Merge branch 'master' of /Volumes/CaseSensitiveDisk/linux

12 years agoLDM: Fix reassembly of extended VBLKs.
Anton Altaparmakov [Fri, 24 Feb 2012 09:37:42 +0000 (09:37 +0000)]
LDM: Fix reassembly of extended VBLKs.

From: Ben Hutchings <ben@decadent.org.uk>

Extended VBLKs (those larger than the preset VBLK size) are divided
into fragments, each with its own VBLK header.  Our LDM implementation
generally assumes that each VBLK is contiguous in memory, so these
fragments must be assembled before further processing.

Currently the reassembly seems to be done quite wrongly - no VBLK
header is copied into the contiguous buffer, and the length of the
header is subtracted twice from each fragment.  Also the total
length of the reassembled VBLK is calculated incorrectly.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Anton Altaparmakov <anton@tuxera.com>
12 years agoNTFS: Correct two spelling errors "dealocate" to "deallocate" in mft.c.
Anton Altaparmakov [Fri, 24 Feb 2012 09:17:09 +0000 (09:17 +0000)]
NTFS: Correct two spelling errors "dealocate" to "deallocate" in mft.c.

From: Masanari Iida <standby24x7@gmail.com>

Signed-off-by: Anton Altaparmakov <anton@tuxera.com>
12 years agostatic keys: Introduce 'struct static_key', static_key_true()/false() and static_key_...
Ingo Molnar [Fri, 24 Feb 2012 07:31:31 +0000 (08:31 +0100)]
static keys: Introduce 'struct static_key', static_key_true()/false() and static_key_slow_[inc|dec]()

So here's a boot tested patch on top of Jason's series that does
all the cleanups I talked about and turns jump labels into a
more intuitive to use facility. It should also address the
various misconceptions and confusions that surround jump labels.

Typical usage scenarios:

        #include <linux/static_key.h>

        struct static_key key = STATIC_KEY_INIT_TRUE;

        if (static_key_false(&key))
                do unlikely code
        else
                do likely code

Or:

        if (static_key_true(&key))
                do likely code
        else
                do unlikely code

The static key is modified via:

        static_key_slow_inc(&key);
        ...
        static_key_slow_dec(&key);

The 'slow' prefix makes it abundantly clear that this is an
expensive operation.

I've updated all in-kernel code to use this everywhere. Note
that I (intentionally) have not pushed through the rename
blindly through to the lowest levels: the actual jump-label
patching arch facility should be named like that, so we want to
decouple jump labels from the static-key facility a bit.

On non-jump-label enabled architectures static keys default to
likely()/unlikely() branches.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jason Baron <jbaron@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: a.p.zijlstra@chello.nl
Cc: mathieu.desnoyers@efficios.com
Cc: davem@davemloft.net
Cc: ddaney.cavm@gmail.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20120222085809.GA26397@elte.hu
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agodavinci_emac: Do not free all rx dma descriptors during init
Christian Riesch [Thu, 23 Feb 2012 01:14:17 +0000 (01:14 +0000)]
davinci_emac: Do not free all rx dma descriptors during init

This patch fixes a regression that was introduced by

commit 0a5f38467765ee15478db90d81e40c269c8dda20
davinci_emac: Add Carrier Link OK check in Davinci RX Handler

Said commit adds a check whether the carrier link is ok. If the link is
not ok, the skb is freed and no new dma descriptor added to the rx dma
channel. This causes trouble during initialization when the carrier
status has not yet been updated. If a lot of packets are received while
netif_carrier_ok returns false, all dma descriptors are freed and the
rx dma transfer is stopped.

The bug occurs when the board is connected to a network with lots of
traffic and the ifconfig down/up is done, e.g., when reconfiguring
the interface with DHCP.

The bug can be reproduced by flood pinging the davinci board while doing
ifconfig eth0 down
ifconfig eth0 up
on the board.

After that, the rx path stops working and the overrun value reported
by ifconfig is counting up.

This patch reverts commit 0a5f38467765ee15478db90d81e40c269c8dda20
and instead issues warnings only if cpdma_chan_submit returns -ENOMEM.

Signed-off-by: Christian Riesch <christian.riesch@omicron.at>
Cc: <stable@vger.kernel.org>
Cc: Hegde, Vinay <vinay.hegde@ti.com>
Cc: Cyril Chemparathy <cyril@ti.com>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Tested-by: Rajashekhara, Sudhakar <sudhakar.raj@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agostatic keys: Add docs better explaining the whole 'struct static_key' mechanism
Jason Baron [Tue, 21 Feb 2012 20:03:30 +0000 (15:03 -0500)]
static keys: Add docs better explaining the whole 'struct static_key' mechanism

Add better documentation for static keys.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: davem@davemloft.net
Cc: ddaney.cavm@gmail.com
Cc: a.p.zijlstra@chello.nl
Link: http://lkml.kernel.org/r/52570e566e5f1914f27b67e4eafb5781b8f9f9db.1329851692.git.jbaron@redhat.com
[ Added a 'Summary' section and rewrote it to explain static keys ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agomlx4_core: Fixing array indexes when setting port types
Yevgeny Petrilin [Thu, 23 Feb 2012 07:04:35 +0000 (07:04 +0000)]
mlx4_core: Fixing array indexes when setting port types

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoarch/arm/mach-shmobile/board-ag5evm.c: included linux/dma-mapping.h twice
Danny Kukawka [Thu, 16 Feb 2012 14:46:38 +0000 (15:46 +0100)]
arch/arm/mach-shmobile/board-ag5evm.c: included linux/dma-mapping.h twice

arch/arm/mach-shmobile/board-ag5evm.c: included 'linux/dma-mapping.h'
twice, remove the duplicate.

Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
12 years agoARM: mach-shmobile: r8a7779 PFC IPSR4 fix
Magnus Damm [Mon, 30 Jan 2012 02:25:07 +0000 (11:25 +0900)]
ARM: mach-shmobile: r8a7779 PFC IPSR4 fix

Fix the bit field width information for the IPSR4 register
in the r8a7779 pin function controller (PFC).

Without this fix the Marzen board fails to receive data
over the serial console due to misconfigured pin function
for the RX pin.

Signed-off-by: Magnus Damm <damm@opensource.se>
Tested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Tested-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
12 years agoARM: mach-shmobile: sh73a0 PSTR 32-bit access fix
Magnus Damm [Mon, 30 Jan 2012 02:03:49 +0000 (11:03 +0900)]
ARM: mach-shmobile: sh73a0 PSTR 32-bit access fix

Convert the sh73a0 SMP code to use 32-bit PSTR access.

This fixes wakeup from deep sleep for sh73a0 secondary CPUs.

Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into rmobile-fixes...
Paul Mundt [Fri, 24 Feb 2012 04:23:23 +0000 (13:23 +0900)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into rmobile-fixes-for-linus

12 years agosh: Fix sh2a build error for CONFIG_CACHE_WRITETHROUGH
Phil Edworthy [Tue, 21 Feb 2012 08:29:57 +0000 (08:29 +0000)]
sh: Fix sh2a build error for CONFIG_CACHE_WRITETHROUGH

Signed-off-by: Phil Edworthy <phil.edworthy@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
12 years agosh: modify a resource of sh_eth_giga1_resources in board-sh7757lcr
Shimoda, Yoshihiro [Mon, 20 Feb 2012 08:26:50 +0000 (17:26 +0900)]
sh: modify a resource of sh_eth_giga1_resources in board-sh7757lcr

The latest sh_eth driver needs a resource of TSU in the channel 1,
if the controller has TSU registers. So, this patch adds the resource.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
12 years agoarch/sh: remove references to cpu_*_map.
Rusty Russell [Wed, 15 Feb 2012 04:58:04 +0000 (15:28 +1030)]
arch/sh: remove references to cpu_*_map.

This has been obsolescent for a while; time for the final push.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: linux-sh@vger.kernel.org
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
12 years agosh: Fix typo in pci-sh7780.c
Masanari Iida [Sat, 4 Feb 2012 12:40:24 +0000 (21:40 +0900)]
sh: Fix typo in pci-sh7780.c

Correct spelling "erorr" to "error" in
arch/sh/drivers/pci/pci-sh7780.c

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
12 years agoRestore direct_io / truncate locking API
Anton Altaparmakov [Thu, 23 Feb 2012 23:40:05 +0000 (23:40 +0000)]
Restore direct_io / truncate locking API

With kernel 3.1, Christoph removed i_alloc_sem and replaced it with
calls (namely inode_dio_wait() and inode_dio_done()) which are
EXPORT_SYMBOL_GPL() thus they cannot be used by non-GPL file systems and
further inode_dio_wait() was pushed from notify_change() into the file
system ->setattr() method but no non-GPL file system can make this call.

That means non-GPL file systems cannot exist any more unless they do not
use any VFS functionality related to reading/writing as far as I can
tell or at least as long as they want to implement direct i/o.

Both Linus and Al (and others) have said on LKML that this breakage of
the VFS API should not have happened and that the change was simply
missed as it was not documented in the change logs of the patches that
did those changes.

This patch changes the two function exports in question to be
EXPORT_SYMBOL() thus restoring the VFS API as it used to be - accessible
for all modules.

Christoph, who introduced the two functions and exported them GPL-only
is CC-ed on this patch to give him the opportunity to object to the
symbols being changed in this manner if he did indeed intend them to be
GPL-only and does not want them to become available to all modules.

Signed-off-by: Anton Altaparmakov <anton@tuxera.com>
CC: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
Linus Torvalds [Thu, 23 Feb 2012 23:38:57 +0000 (15:38 -0800)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs

A fix from Jesper Juhl removes an assignment in an ASSERT when a compare
is intended.  Two fixes from Mitsuo Hayasaka address off-by-ones in XFS
quota enforcement.

* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: make inode quota check more general
  xfs: change available ranges of softlimit and hardlimit in quota check
  XFS: xfs_trans_add_item() - don't assign in ASSERT() when compare is intended

12 years agophy: IC+101G and PHY_HAS_INTERRUPT flag
Giuseppe CAVALLARO [Tue, 21 Feb 2012 21:26:28 +0000 (21:26 +0000)]
phy: IC+101G and PHY_HAS_INTERRUPT flag

This patch adds the PHY_HAS_INTERRUPT flag for IC+101 device series.
Also the patch does a simple dity-up to signal that
the driver actually is for IP101A LF and IP101G devices.
In fact, these are two similar PHYs that have the same IDs
and mainly differ for the EEE capability supported in the
G series.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetdev/phy/icplus: Correct broken phy_init code
David McKay [Tue, 21 Feb 2012 21:24:57 +0000 (21:24 +0000)]
netdev/phy/icplus: Correct broken phy_init code

The code for ip1001_config_init() was totally broken if you were not
using RGMII. Instead of returning an error code or zero it actually
returned the value in the IP1001_SPEC_CTRL_STATUS_2 register. It was
also trying to set the IP1001_APS_ON bit , but never actually wrote
back the register.

The error checking was also incorrect in both this function and the
reset function, so this patch fixes that up in a consistent fashion.

Signed-off-by: David McKay <david.mckay@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipsec: be careful of non existing mac headers
Eric Dumazet [Thu, 23 Feb 2012 10:55:02 +0000 (10:55 +0000)]
ipsec: be careful of non existing mac headers

Niccolo Belli reported ipsec crashes in case we handle a frame without
mac header (atm in his case)

Before copying mac header, better make sure it is present.

Bugzilla reference:  https://bugzilla.kernel.org/show_bug.cgi?id=42809

Reported-by: Niccolò Belli <darkbasic@linuxsystems.it>
Tested-by: Niccolò Belli <darkbasic@linuxsystems.it>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Linus Torvalds [Thu, 23 Feb 2012 19:48:36 +0000 (11:48 -0800)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

BenH says:
 'Here are a few more powerpc bits for you.  A stupid regression I
  introduced with my previous commit to "fix" program check exceptions
  (brown paper bag for me), fix the cpuidle default, a bug fix for
  something that isn't strictly speaking a regression but some upstream
  changes causes it to show in lockdep now while it didn't before, and
  finally a trivial one for rusty to make his life easier later on
  removing the old cpumask cruft. '

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc: Fix various issues with return to userspace
  cpuidle: Default y on powerpc pSeries
  powerpc: Fix program check handling when lockdep is enabled
  powerpc: Remove references to cpu_*_map

12 years agoMerge tag 'sound-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Linus Torvalds [Thu, 23 Feb 2012 19:28:05 +0000 (11:28 -0800)]
Merge tag 'sound-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

sound fixes for 3.3-rc5

Just a collection of boring small fixes for ASoC, HD-audio Realtek
and USB-audio drivers.

* tag 'sound-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: snd-usb-caiaq: Fix the return of XRUN
  ASoC: ak4642: fixup HeadPhone L/R dapm settings
  ALSA: hda/realtek - Fix surround output regression on Acer Aspire 5935
  ALSA: hda/realtek - Fix overflow of vol/sw check bitmap
  ALSA: usb-audio: avoid integer overflow in create_fixed_stream_quirk()
  ASoC: wm8962: Fix sidetone enumeration texts

12 years agoBtrfs: increase the global block reserve estimates
Liu Bo [Thu, 23 Feb 2012 15:49:04 +0000 (10:49 -0500)]
Btrfs: increase the global block reserve estimates

When doing IO with large amounts of data fragmentation, the global block
reserve calulations are too low.  This increases them to avoid
ENOSPC crashes.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
12 years agoBtrfs: clear the extent uptodate bits during parent transid failures
Chris Mason [Wed, 22 Feb 2012 17:36:24 +0000 (12:36 -0500)]
Btrfs: clear the extent uptodate bits during parent transid failures

If btrfs reads a block and finds a parent transid mismatch, it clears
the uptodate flags on the extent buffer, and the pages inside it.  But
we only clear the uptodate bits in the state tree if the block straddles
more than one page.

This is from an old optimization from to reduce contention on the extent
state tree.  But it is buggy because the code that retries a read from
a different copy of the block is going to find the uptodate state bits
set and skip the IO.

The end result of the bug is that we'll never actually read the good
copy (if there is one).

The fix here is to always clear the uptodate state bits, which is safe
because this code is only called when the parent transid fails.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
12 years agoBtrfs: add extra sanity checks on the path names in btrfs_mksubvol
Chris Mason [Tue, 21 Feb 2012 03:14:55 +0000 (22:14 -0500)]
Btrfs: add extra sanity checks on the path names in btrfs_mksubvol

Signed-off-by: Chris Mason <chris.mason@oracle.com>
12 years agoBtrfs: make sure we update latest_bdev
Chris Mason [Tue, 21 Feb 2012 01:53:43 +0000 (20:53 -0500)]
Btrfs: make sure we update latest_bdev

When we are setting up the mount, we close all the
devices that were not actually part of the metadata we found.

But, we don't make sure that one of those devices wasn't
fs_devices->latest_bdev, which means we can do a use after free
on the one we closed.

This updates latest_bdev as it goes.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
12 years agoBtrfs: improve error handling for btrfs_insert_dir_item callers
Chris Mason [Mon, 20 Feb 2012 13:40:56 +0000 (08:40 -0500)]
Btrfs: improve error handling for btrfs_insert_dir_item callers

This allows us to gracefully continue if we aren't able to insert
directory items, both for normal files/dirs and snapshots.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
12 years agohwmon: (f75375s) Fix register write order when setting fans to full speed
Nikolaus Schulz [Wed, 22 Feb 2012 22:18:44 +0000 (23:18 +0100)]
hwmon: (f75375s) Fix register write order when setting fans to full speed

By hwmon sysfs interface convention, setting pwm_enable to zero sets a fan
to full speed.  In the f75375s driver, this need be done by enabling
manual fan control, plus duty mode for the F875387 chip, and then setting
the maximum duty cycle.  Fix a bug where the two necessary register writes
were swapped, effectively discarding the setting to full-speed.

Signed-off-by: Nikolaus Schulz <mail@microschulz.de>
Cc: Riku Voipio <riku.voipio@iki.fi>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>