rtnetlink: rtnl_bridge_getlink: Call nlmsg_find_attr() with ifinfomsg header
Fix the iproute2 command `bridge vlan show`, after switching from
rtgenmsg to ifinfomsg.
Let's start with a little history:
Feb 20: Vlad Yasevich got his VLAN-aware bridge patchset included in
the 3.9 merge window.
In the kernel commit 6cbdceeb, he added attribute support to
bridge GETLINK requests sent with rtgenmsg.
Mar 6th: Vlad got this iproute2 reference implementation of the bridge
vlan netlink interface accepted (iproute2 9eff0e5c)
Apr 25th: iproute2 switched from using rtgenmsg to ifinfomsg (63338dca)
http://patchwork.ozlabs.org/patch/239602/
http://marc.info/?t=136680900700007
Apr 28th: Linus released 3.9
Apr 30th: Stephen released iproute2 3.9.0
The `bridge vlan show` command haven't been working since the switch to
ifinfomsg, or in a released version of iproute2. Since the kernel side
only supports rtgenmsg, which iproute2 switched away from just prior to
the iproute2 3.9.0 release.
I haven't been able to find any documentation, about neither rtgenmsg
nor ifinfomsg, and in which situation to use which, but kernel commit 88c5b5ce seams to suggest that ifinfomsg should be used.
Fixing this in kernel will break compatibility, but I doubt that anybody
have been using it due to this bug in the user space reference
implementation, at least not without noticing this bug. That said the
functionality is still fully functional in 3.9, when reversing iproute2
commit 63338dca.
This could also be fixed in iproute2, but thats an ugly patch that would
reintroduce rtgenmsg in iproute2, and from searching in netdev it seams
like rtgenmsg usage is discouraged. I'm assuming that the only reason
that Vlad implemented the kernel side to use rtgenmsg, was because
iproute2 was using it at the time.
Signed-off-by: Asbjoern Sloth Toennesen <ast@fiberby.net> Reviewed-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
yonghua zheng [Tue, 13 Aug 2013 23:01:03 +0000 (16:01 -0700)]
fs/proc/task_mmu.c: fix buffer overflow in add_page_map()
Recently we met quite a lot of random kernel panic issues after enabling
CONFIG_PROC_PAGE_MONITOR. After debuggind we found this has something
to do with following bug in pagemap:
pos is number of PM_ENTRY_BYTES in buffer, but len is the size of
buffer, it is a mistake to compare pos and len in add_page_map() for
checking buffer is full or not, and this can lead to buffer overflow and
random kernel panic issue.
Correct len to be total number of PM_ENTRY_BYTES in buffer.
Chen Gang [Tue, 13 Aug 2013 23:01:02 +0000 (16:01 -0700)]
arch: *: Kconfig: add "kernel/Kconfig.freezer" to "arch/*/Kconfig"
All architectures include "kernel/Kconfig.freezer" except three left, so
let them include it too, or 'allmodconfig' will report error.
The related errors: (with allmodconfig for openrisc):
CC kernel/cgroup_freezer.o
kernel/cgroup_freezer.c: In function 'freezer_css_online':
kernel/cgroup_freezer.c:133:15: error: 'system_freezing_cnt' undeclared (first use in this function)
kernel/cgroup_freezer.c:133:15: note: each undeclared identifier is reported only once for each function it appears in
kernel/cgroup_freezer.c: In function 'freezer_css_offline':
kernel/cgroup_freezer.c:157:15: error: 'system_freezing_cnt' undeclared (first use in this function)
kernel/cgroup_freezer.c: In function 'freezer_attach':
kernel/cgroup_freezer.c:200:4: error: implicit declaration of function 'freeze_task'
kernel/cgroup_freezer.c: In function 'freezer_apply_state':
kernel/cgroup_freezer.c:371:16: error: 'system_freezing_cnt' undeclared (first use in this function)
Signed-off-by: Chen Gang <gang.chen@asianux.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chen Liqin <liqin.chen@sunplusct.com> Cc: Lennox Wu <lennox.wu@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[dan.carpenter@oracle.com: fix pointer math] Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reported-by: David Weber <wb@munzinger.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Radu Caragea [Tue, 13 Aug 2013 23:00:59 +0000 (16:00 -0700)]
x86 get_unmapped_area(): use proper mmap base for bottom-up direction
When the stack is set to unlimited, the bottomup direction is used for
mmap-ings but the mmap_base is not used and thus effectively renders
ASLR for mmapings along with PIE useless.
Cc: Michel Lespinasse <walken@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Sendroiu <molecula2788@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tiger Yang [Tue, 13 Aug 2013 23:00:58 +0000 (16:00 -0700)]
ocfs2: fix NULL pointer dereference in ocfs2_duplicate_clusters_by_page
Since ocfs2_cow_file_pos will invoke ocfs2_refcount_icow with a NULL as
the struct file pointer, it finally result in a null pointer dereference
in ocfs2_duplicate_clusters_by_page.
This patch replace file pointer with inode pointer in
cow_duplicate_clusters to fix this issue.
[jeff.liu@oracle.com: rebased patch against linux-next tree] Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Jie Liu <jeff.liu@oracle.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Acked-by: Tao Ma <tm@tao.ma> Tested-by: David Weber <wb@munzinger.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jie Liu [Tue, 13 Aug 2013 23:00:57 +0000 (16:00 -0700)]
ocfs2: Revert 40bd62e to avoid regression in extended allocation
Revert commit 40bd62eb7fb8 ("fs/ocfs2/journal.h: add bits_wanted while
calculating credits in ocfs2_calc_extend_credits").
Unfortunately this change broke fallocate even if there is insufficient
disk space for the preallocation, which is a serious problem.
# df -h
/dev/sda8 22G 1.2G 21G 6% /ocfs2
# fallocate -o 0 -l 200M /ocfs2/testfile
fallocate: /ocfs2/test: fallocate failed: No space left on device
and a kernel warning:
CPU: 3 PID: 3656 Comm: fallocate Tainted: G W O 3.11.0-rc3 #2
Call Trace:
dump_stack+0x77/0x9e
warn_slowpath_common+0xc4/0x110
warn_slowpath_null+0x2a/0x40
start_this_handle+0x6c/0x640 [jbd2]
jbd2__journal_start+0x138/0x300 [jbd2]
jbd2_journal_start+0x23/0x30 [jbd2]
ocfs2_start_trans+0x166/0x300 [ocfs2]
__ocfs2_extend_allocation+0x38f/0xdb0 [ocfs2]
ocfs2_allocate_unwritten_extents+0x3c9/0x520
__ocfs2_change_file_space+0x5e0/0xa60 [ocfs2]
ocfs2_fallocate+0xb1/0xe0 [ocfs2]
do_fallocate+0x1cb/0x220
SyS_fallocate+0x6f/0xb0
system_call_fastpath+0x16/0x1b
JBD2: fallocate wants too many credits (51216 > 4381)
Signed-off-by: Jie Liu <jeff.liu@oracle.com> Cc: Goldwyn Rodrigues <rgoldwyn@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lothar Waßmann [Tue, 13 Aug 2013 23:00:56 +0000 (16:00 -0700)]
drivers/rtc/rtc-stmp3xxx.c: provide timeout for potentially endless loop polling a HW bit
It's always a bad idea to poll on HW bits without a timeout.
The i.MX28 RTC can be easily brought into a state in which the RTC is
not running (until after a power-on-reset) and thus the status bits
which are polled in the driver won't ever change.
This patch prevents the kernel from getting stuck in this case.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de> Acked-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Tue, 13 Aug 2013 23:00:55 +0000 (16:00 -0700)]
hugetlb: fix lockdep splat caused by pmd sharing
Dave has reported the following lockdep splat:
=================================
[ INFO: inconsistent lock state ]
3.11.0-rc1+ #9 Not tainted
---------------------------------
inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
kswapd0/49 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&mapping->i_mmap_mutex){+.+.?.}, at: [<c114971b>] page_referenced+0x87/0x5e3
{RECLAIM_FS-ON-W} state was registered at:
mark_held_locks+0x81/0xe7
lockdep_trace_alloc+0x5e/0xbc
__alloc_pages_nodemask+0x8b/0x9b6
__get_free_pages+0x20/0x31
get_zeroed_page+0x12/0x14
__pmd_alloc+0x1c/0x6b
huge_pmd_share+0x265/0x283
huge_pte_alloc+0x5d/0x71
hugetlb_fault+0x7c/0x64a
handle_mm_fault+0x255/0x299
__do_page_fault+0x142/0x55c
do_page_fault+0xd/0x16
error_code+0x6c/0x74
irq event stamp: 3136917
hardirqs last enabled at (3136917): _raw_spin_unlock_irq+0x27/0x50
hardirqs last disabled at (3136916): _raw_spin_lock_irq+0x15/0x78
softirqs last enabled at (3136180): __do_softirq+0x137/0x30f
softirqs last disabled at (3136175): irq_exit+0xa8/0xaa
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&mapping->i_mmap_mutex);
<Interrupt>
lock(&mapping->i_mmap_mutex);
which is a false positive caused by hugetlb pmd sharing code which
allocates a new pmd from withing mapping->i_mmap_mutex. If this
allocation causes reclaim then the lockdep detector complains that we
might self-deadlock.
This is not correct though, because hugetlb pages are not reclaimable so
their mapping will be never touched from the reclaim path.
The patch tells lockup detector that hugetlb i_mmap_mutex is special by
assigning it a separate lockdep class so it won't report possible
deadlocks on unrelated mappings.
[peterz@infradead.org: comment for annotation] Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Michal Hocko <mhocko@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ed Cashin [Tue, 13 Aug 2013 23:00:53 +0000 (16:00 -0700)]
aoe: adjust ref of head for compound page tails
Fix a BUG which can trigger when direct-IO is used with AOE.
As discussed previously, the fact that some users of the block layer
provide bios that point to pages with a zero _count means that it is not
OK for the network layer to do a put_page on the skb frags during an
skb_linearize, so the aoe driver gets a reference to pages in bios and
puts the reference before ending the bio. And because it cannot use
get_page on a page with a zero _count, it manipulates the value
directly.
It is not OK to increment the _count of a compound page tail, though,
since the VM layer will VM_BUG_ON a non-zero _count. Block users that
do direct I/O can result in the aoe driver seeing compound page tails in
bios. In that case, the same logic works as long as the head of the
compound page is used instead of the tails. This patch handles compound
pages and does not BUG.
It relies on the block layer user leaving the relationship between the
page tail and its head alone for the duration between the submission of
the bio and its completion, whether successful or not.
Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Simek [Tue, 13 Aug 2013 23:00:53 +0000 (16:00 -0700)]
microblaze: fix clone syscall
Fix inadvertent breakage in the clone syscall ABI for Microblaze that
was introduced in commit f3268edbe6fe ("microblaze: switch to generic
fork/vfork/clone").
The Microblaze syscall ABI for clone takes the parent tid address in the
4th argument; the third argument slot is used for the stack size. The
incorrectly-used CLONE_BACKWARDS type assigned parent tid to the 3rd
slot.
This commit restores the original ABI so that existing userspace libc
code will work correctly.
All kernel versions from v3.8-rc1 were affected.
Signed-off-by: Michal Simek <michal.simek@xilinx.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cyrill Gorcunov [Tue, 13 Aug 2013 23:00:51 +0000 (16:00 -0700)]
mm: save soft-dirty bits on file pages
Andy reported that if file page get reclaimed we lose the soft-dirty bit
if it was there, so save _PAGE_BIT_SOFT_DIRTY bit when page address get
encoded into pte entry. Thus when #pf happens on such non-present pte
we can restore it back.
Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Pavel Emelyanov <xemul@parallels.com> Cc: Matt Mackall <mpm@selenic.com> Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cyrill Gorcunov [Tue, 13 Aug 2013 23:00:49 +0000 (16:00 -0700)]
mm: save soft-dirty bits on swapped pages
Andy Lutomirski reported that if a page with _PAGE_SOFT_DIRTY bit set
get swapped out, the bit is getting lost and no longer available when
pte read back.
To resolve this we introduce _PTE_SWP_SOFT_DIRTY bit which is saved in
pte entry for the page being swapped out. When such page is to be read
back from a swap cache we check for bit presence and if it's there we
clear it and restore the former _PAGE_SOFT_DIRTY bit back.
One of the problem was to find a place in pte entry where we can save
the _PTE_SWP_SOFT_DIRTY bit while page is in swap. The _PAGE_PSE was
chosen for that, it doesn't intersect with swap entry format stored in
pte.
Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Pavel Emelyanov <xemul@parallels.com> Cc: Matt Mackall <mpm@selenic.com> Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: Minchan Kim <minchan@kernel.org> Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrey Vagin [Tue, 13 Aug 2013 23:00:47 +0000 (16:00 -0700)]
memcg: don't initialize kmem-cache destroying work for root caches
struct memcg_cache_params has a union. Different parts of this union
are used for root and non-root caches. A part with destroying work is
used only for non-root caches.
I fixed the same problem in another place v3.9-rc1-16204-gf101a94, but
didn't notice this one.
ipv6: make unsolicited report intervals configurable for mld
Commit cab70040dfd95ee32144f02fade64f0cb94f31a0 ("net: igmp:
Reduce Unsolicited report interval to 1s when using IGMPv3") and 2690048c01f32bf45d1c1e1ab3079bc10ad2aea7 ("net: igmp: Allow user-space
configuration of igmp unsolicited report interval") by William Manley made
igmp unsolicited report intervals configurable per interface and corrected
the interval of unsolicited igmpv3 report messages resendings to 1s.
Same needs to be done for IPv6:
MLDv1 (RFC2710 7.10.): 10 seconds
MLDv2 (RFC3810 9.11.): 1 second
Both intervals are configurable via new procfs knobs
mldv1_unsolicited_report_interval and mldv2_unsolicited_report_interval.
(also added .force_mld_version to ipv6_devconf_dflt to bring structs in
line without semantic changes)
v2:
a) Joined documentation update for IPv4 and IPv6 MLD/IGMP
unsolicited_report_interval procfs knobs.
b) incorporate stylistic feedback from William Manley
v3:
a) add new DEVCONF_* values to the end of the enum (thanks to David
Miller)
Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: William Manley <william.manley@youview.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Brodkin [Tue, 13 Aug 2013 13:04:36 +0000 (17:04 +0400)]
ethernet/arc/arc_emac - fix NAPI "work > weight" warning
Initially I improperly set a boundary for maximum number of input
packets to process on NAPI poll ("work") so it might be more than
expected amount ("weight").
This was really harmless but seeing WARN_ON_ONCE on every device boot is
not nice. So trivial fix ("<" instead of "<=") is here.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Mischa Jonker <mjonker@synopsys.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Grant Likely <grant.likely@linaro.org> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 13 Aug 2013 23:58:17 +0000 (16:58 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Docbook fixes that make 99% of the diffstat, plus a oneliner fix"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Ensure update_cfs_shares() is called for parents of continuously-running tasks
sched: Fix some kernel-doc warnings
Linus Torvalds [Tue, 13 Aug 2013 23:57:40 +0000 (16:57 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Two small fixlets"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Add Haswell ULT model number used in Macbook Air and other systems
perf/x86: Fix intel QPI uncore event definitions
Pravin B Shelar [Tue, 13 Aug 2013 08:41:06 +0000 (01:41 -0700)]
ip_tunnel: Do not use inner ip-header-id for tunnel ip-header-id.
Using inner-id for tunnel id is not safe in some rare cases.
E.g. packets coming from multiple sources entering same tunnel
can have same id. Therefore on tunnel packet receive we
could have packets from two different stream but with same
source and dst IP with same ip-id which could confuse ip packet
reassembly.
Following patch reverts optimization from commit 490ab08127 (IP_GRE: Fix IP-Identification.)
CC: Jarno Rajahalme <jrajahalme@nicira.com> CC: Ansis Atteka <aatteka@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yuchung Cheng [Mon, 12 Aug 2013 23:41:25 +0000 (16:41 -0700)]
tcp: reset reordering est. selectively on timeout
On timeout the TCP sender unconditionally resets the estimated degree
of network reordering (tp->reordering). The idea behind this is that
the estimate is too large to trigger fast recovery (e.g., due to a IP
path change).
But for example if the sender only had 2 packets outstanding, then a
timeout doesn't tell much about reordering. A sender that learns about
reordering on big writes and loses packets on small writes will end up
falsely retransmitting again and again, especially when reordering is
more likely on big writes.
Therefore the sender should only suspect that tp->reordering is too
high if it could have gone into fast recovery with the (lower) default
estimate.
Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 13 Aug 2013 23:04:38 +0000 (16:04 -0700)]
Merge branch 'bnx2x'
Dmitry Kravkov says:
====================
Please consider applying the series of bnx2x fixes to net:
* statistics may cause FW assert
* missing fairness configuration in DCB flow
* memory leak in sriov related part
* Illegal PTE access
* Pagefault crash in shutdown flow with cnic
v1->v2
* fixed sparse error pointed by Joe Perches
* added missing signed-off from Sergei Shtylyov
v2->v3
* added missing signed-off from Sergei Shtylyov
* fixed formatting from Sergei Shtylyov
v3->v4
* patch 1/6: fixed declaration order
* patch 2/6 replaced with: protect flows using set_bit constraints
v4->v5
* patch 2/6: replace proprietary locking with semaphore
* droped 1/6: since adds redundant code from Benjamin Poirier
The following patchset contains four netfilter fixes, they are:
* Fix possible invalid access and mangling of the TCPMSS option in
xt_TCPMSS. This was spotted by Julian Anastasov.
* Fix possible off by one access and mangling of the TCP packet in
xt_TCPOPTSTRIP, also spotted by Julian Anastasov.
* Fix possible information leak due to missing initialization of one
padding field of several structures that are included in nfqueue and
nflog netlink messages, from Dan Carpenter.
* Fix TCP window tracking with Fast Open, from Yuchung Cheng.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 12 Aug 2013 23:25:03 +0000 (02:25 +0300)]
bnx2x: prevent crash in shutdown flow with CNIC
There might be a crash as during shutdown flow CNIC might try
to access resources already freed by bnx2x.
Change bnx2x_close() into dev_close() in __bnx2x_remove (shutdown flow)
to guarantee CNIC is notified of the device's change of status.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Barak Witkowsky [Mon, 12 Aug 2013 23:25:02 +0000 (02:25 +0300)]
bnx2x: fix PTE write access error
PTE write access error might occur in MF_ALLOWED mode when IOMMU
is active. The patch adds rmmod HSI indicating to MFW to stop
running queries which might trigger this failure.
Signed-off-by: Barak Witkowsky <barak@broadcom.com> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Mon, 12 Aug 2013 23:24:59 +0000 (02:24 +0300)]
bnx2x: protect different statistics flows
Add locking to protect different statistics flows from
running simultaneously.
This in order to serialize statistics requests sent to FW,
otherwise two outstanding queries may cause FW assert.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 13 Aug 2013 22:58:59 +0000 (15:58 -0700)]
Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:
====================
This is a batch of updates intended for 3.12. It is mostly driver
stuff, although Johannes Berg and Simon Wunderlich make a good
showing with mac80211 bits (particularly some work on 5/10 MHz
channel support).
The usual suspects are mostly represented. There are lots of updates
to iwlwifi, ath9k, ath10k, mwifiex, rt2x00, wil6210, as usual.
The bcma bus gets some love this time, as do cw1200, iwl4965, and a
few other bits here and there. I don't think there is much unusual
here, FWIW.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Mugunthan V N [Mon, 12 Aug 2013 11:41:15 +0000 (17:11 +0530)]
drivers: net: cpsw: Add support for new CPSW IP version present in AM43xx SoC
The new IP version which is present in AM43xx SoC has a minor changes and the
offsets are same as the previous version, so adding new IP version support in
the driver.
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Boyd [Tue, 13 Aug 2013 21:12:40 +0000 (14:12 -0700)]
PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()
pm_qos_update_request_timeout() updates a qos and then schedules
a delayed work item to bring the qos back down to the default
after the timeout. When the work item runs, pm_qos_work_fn() will
call pm_qos_update_request() and deadlock because it tries to
cancel itself via cancel_delayed_work_sync(). Future callers of
that qos will also hang waiting to cancel the work that is
canceling itself. Let's extract the little bit of code that does
the real work of pm_qos_update_request() and call it from the
work function so that we don't deadlock.
Before ed1ac6e (PM: don't use [delayed_]work_pending()) this didn't
happen because the work function wouldn't try to cancel itself.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Reviewed-by: Tejun Heo <tj@kernel.org> Cc: 3.9+ <stable@vger.kernel.org> # 3.9+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Claudiu Manoil [Mon, 12 Aug 2013 10:53:26 +0000 (13:53 +0300)]
gianfar: Add flow control support
eTSEC has Rx and Tx flow control capabilities that may be enabled
through MACCFG1[Rx_Flow, Tx_Flow] bits. These bits must not be set
however when eTSEC is operated in Half-Duplex mode. Unfortunately,
the driver currently sets these bits unconditionally.
This patch adds the proper handling of the PAUSE frame capability
register bits by implementing the ethtool -A interface. When pause
autoneg is enabled, the controller uses the phy's capability to
negotiate PAUSE frame settings with the link partner and reconfigures
its Rx_Flow and Tx_Flow settings to match the capabilities of the
link partner. If pause autoneg is off, the PAUSE frame generation
may be forced manually (ethtool -A). Flow control is disabled by
default now.
This implementation is inspired by the tg3 driver.
Signed-off-by: Lutz Jaenicke <ljaenicke@innominate.com> Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Pptp driver has lots of byte order warnings from sparse.
This was because the on-the-wire header is in network byte order (obviously)
but the definition did not reflect that.
Also, the address structure to user space actually put the call id
in host order. Rather than break ABI compatibility, just acknowledge
the existing design.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
The approach taken here is to split data and control chunks
creation a bit. Data chunks already have memory accounting
so noting needs to happen. For control chunks, add stubs handlers.
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Mark Brown [Fri, 9 Aug 2013 17:31:22 +0000 (18:31 +0100)]
net: asix: Move declaration of ax88172a_info to shared header
Ensure that the definition of ax88172a_info matches the declaration seen
by users and silence sparse warnings about symbols without declarations
in the global namespace by moving the declaration into the shared header
asix.h.
Signed-off-by: Mark Brown <broonie@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
clk: exynos4: Add CLK_GET_RATE_NOCACHE flag for the Exynos4x12 ISP clocks
The ISP clock registers belong to the ISP power domain and may change
their values if this power domain is switched off/on. Add
CLK_GET_RATE_NOCACHE flags to ensure we do not rely on invalid cached
data when setting or getting frequency of those clocks.
Without this fix the FIMC-IS Cortex-A5 core and AXI bus clocks have
incorrect frequencies, which breaks the ISP operation and starting the
video pipeline fails with timeouts reported by the FIMC-IS firmware.
Soren Brinkmann [Mon, 17 Jun 2013 22:47:40 +0000 (15:47 -0700)]
clk/zynq/clkc: Add CLK_SET_RATE_PARENT flag to ethernet muxes
Zynq's Ethernet clocks are created by the following hierarchy:
mux0 ---> div0 ---> div1 ---> mux1 ---> gate
Rate change requests on the gate have to propagate all the way up to
div0 to properly leverage all dividers. Mux1 was missing the
CLK_SET_RATE_PARENT flag, which is required to achieve this.
This does not fix a specific regression but the clock driver was merged
for 3.11-rc1, so best to fix the known bugs before the release.
Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com> Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Mike Turquette <mturquette@linaro.org>
[mturquette@linaro.org: added to changelog]
Soren Brinkmann [Mon, 17 Jun 2013 22:03:46 +0000 (15:03 -0700)]
clk/zynq/clkc: Add dedicated spinlock for the SWDT
The clk_mux for the system watchdog timer reused the register lock
dedicated to the Ethernet module - for no apparent reason.
Add a lock dedicated to the SWDT's clock register to remove this
wrong dependency.
This does not fix a specific regression but the clock driver was merged
for 3.11-rc1, so best to fix the known bugs before the release.
Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com> Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Mike Turquette <mturquette@linaro.org>
[mturquette@linaro.org: added to changelog]
can miss a signal. This is the special case of wait-for-condition,
it relies on try_to_wake_up/schedule interaction and thus it does
not need mb() between __set_current_state() and if(signal_pending).
However, this __set_current_state() can move into the critical
section protected by rq->lock, now that try_to_wake_up() takes
another lock we need to ensure that it can't be reordered with
"if (signal_pending(current))" check inside that section.
The patch is actually one-liner, it simply adds smp_wmb() before
spin_lock_irq(rq->lock). This is what try_to_wake_up() already
does by the same reason.
We turn this wmb() into the new helper, smp_mb__before_spinlock(),
for better documentation and to allow the architectures to change
the default implementation.
While at it, kill smp_mb__after_lock(), it has no callers.
Perhaps we can also add smp_mb__before/after_spinunlock() for
prepare_to_wait().
netfilter: nfnetlink_queue: allow to attach expectations to conntracks
This patch adds the capability to attach expectations via nfnetlink_queue.
This is required by conntrack helpers that trigger expectations based on
the first packet seen like the TFTP and the DHCPv6 user-space helpers.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch refactors ctnetlink_create_expect by spliting it in two
chunks. As a result, we have a new function ctnetlink_alloc_expect
to allocate and to setup the expectation from ctnetlink.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Johannes Berg [Tue, 13 Aug 2013 07:04:05 +0000 (09:04 +0200)]
genetlink: fix family dump race
When dumping generic netlink families, only the first dump call
is locked with genl_lock(), which protects the list of families,
and thus subsequent calls can access the data without locking,
racing against family addition/removal. This can cause a crash.
Fix it - the locking needs to be conditional because the first
time around it's already locked.
A similar bug was reported to me on an old kernel (3.4.47) but
the exact scenario that happened there is no longer possible,
on those kernels the first round wasn't locked either. Looking
at the current code I found the race described above, which had
also existed on the old kernel.
Cc: stable@vger.kernel.org Reported-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Probably this one is quite unlikely to be triggered, but it's more safe
to do the call_rcu() at the end after we have dropped the reference on
the asoc and freed sctp packet chunks. The reason why is because in
sctp_transport_destroy_rcu() the transport is being kfree()'d, and if
we're unlucky enough we could run into corrupted pointers. Probably
that's more of theoretical nature, but it's safer to have this simple fix.
Introduced by commit 8c98653f ("sctp: sctp_close: fix release of bindings
for deferred call_rcu's"). I also did the 8c98653f regression test and
it's fine that way.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Karl Heiss <kheiss@gmail.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
stmmac: fix init_dma_desc_rings() to handle errors
In stmmac_init_rx_buffers():
* add missing handling of dma_map_single() error
* remove superfluous unlikely() optimization while at it
Add stmmac_free_rx_buffers() helper and use it in dma_free_rx_skbufs().
In init_dma_desc_rings():
* add missing handling of kmalloc_array() errors
* fix handling of dma_alloc_coherent() and stmmac_init_rx_buffers() errors
* make function return an error value on error and 0 on success
In stmmac_open():
* add handling of init_dma_desc_rings() return value
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 12 Aug 2013 22:02:53 +0000 (15:02 -0700)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS fixes from Steve French:
"A set of small cifs fixes, including 3 relating to symlink handling"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
cifs: don't instantiate new dentries in readdir for inodes that need to be revalidated immediately
cifs: set sb->s_d_op before calling d_make_root()
cifs: fix bad error handling in crypto code
cifs: file: initialize oparms.reconnect before using it
Do not attempt to do cifs operations reading symlinks with SMB2
cifs: extend the buffer length enought for sprintf() using
Linus Torvalds [Mon, 12 Aug 2013 22:00:40 +0000 (15:00 -0700)]
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull more ext4 bugfixes from Ted Ts'o:
"A number of miscellaneous ext4 bugs fixes for v3.11, including a fix
so that if ext4 is built as a module, to allow it to be unloaded"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: flush the extent status cache during EXT4_IOC_SWAP_BOOT
ext4: fix mount/remount error messages for incompatible mount options
ext4: allow the mount options nodelalloc and data=journal
Fix endianess bugs in firmware handling introduced by commits cb7a7c6a
("ti_usb_3410_5052: add Multi-Tech modem support") and 05a3d905
("ti_usb_3410_5052: support alternate firmware") which made the driver
use the wrong firmware for certain devices on big-endian machines.
Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johan Hovold [Sun, 11 Aug 2013 14:49:21 +0000 (16:49 +0200)]
USB: usbtmc: fix big-endian probe of Rigol devices
Fix probe of Rigol devices on big-endian machines. A quirk for these
devices was introduced by commit c2e314835 ("USB: usbtmc: Set
rigol_quirk if device is listed") but was only enabled on little-endian
machines.
Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johan Hovold [Sun, 11 Aug 2013 14:49:20 +0000 (16:49 +0200)]
USB: mos7840: fix big-endian probe
Fix bug in device-type detection on big-endian machines originally
introduced by commit 0eafe4de ("USB: serial: mos7840: add support for
MCS7810 devices") which always matched on little-endian product ids.
Reported-by: kbuild test robot <fengguang.wu@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Matt Burtch [Mon, 12 Aug 2013 17:11:39 +0000 (10:11 -0700)]
USB-Serial: Fix error handling of usb_wwan
This fixes an issue where the bulk-in urb used for incoming data transfer
is not resubmitted if the packet recieved contains an error status. This
results in the driver locking until the port is closed and re-opened.
Tested on a custom board with a Cinterion GSM module.
Thomas Pugliese [Fri, 9 Aug 2013 14:52:13 +0000 (09:52 -0500)]
wusbcore: fix kernel panic when disconnecting a wireless USB->serial device
This patch fixes a kernel panic that can occur when disconnecting a
wireless USB->serial device. When the serial device disconnects, the
device cleanup procedure ends up calling usb_hcd_disable_endpoint on the
serial device's endpoints. The wusbcore uses the ABORT_RPIPE command to
abort all transfers on the given endpoint but it does not properly give
back the URBs when the transfer results return from the HWA. This patch
prevents the transfer result processing code from bailing out when it sees
a WA_XFER_STATUS_ABORTED result code so that these urbs are flushed
properly by usb_hcd_disable_endpoint. It also updates wa_urb_dequeue to
handle the case where the endpoint has already been cleaned up when
usb_kill_urb is called which is where the panic originally occurred.
Signed-off-by: Thomas Pugliese <thomas.pugliese@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alan Stern [Wed, 7 Aug 2013 14:58:05 +0000 (10:58 -0400)]
USB: EHCI: accept very late isochronous URBs
Since commits 4005ad4390bf (EHCI: implement new semantics for
URB_ISO_ASAP) and c75c5ab575af (ALSA: USB: adjust for changed 3.8 USB
API) became widely distributed, people have been experiencing problems
with audio transfers. The slightest underrun causes complete failure,
requiring the audio stream to be restarted.
It turns out that the current isochronous API doesn't handle underruns
in the best way. The ALSA developers would much rather have transfers
that are submitted too late be accepted and complete in the normal
fashion, rather than being refused outright.
This patch implements the requested approach. When an isochronous URB
submission is so late that all its scheduled slots have already
expired, a debugging message will be printed in the log and the URB
will be accepted as usual. Assuming it was submitted by a completion
handler (which is normally the case), it will complete shortly
thereafter with all the usb_iso_packet_descriptor status fields marked
-EXDEV.
Bing Zhao [Sat, 10 Aug 2013 04:09:06 +0000 (21:09 -0700)]
mwifiex: fix build error when CONFIG_PM is not set
config: make ARCH=m68k allmodconfig
All error/warnings:
drivers/net/wireless/mwifiex/cfg80211.c: In function
'mwifiex_fill_coalesce_rule_info':
>> drivers/net/wireless/mwifiex/cfg80211.c:2493:3: error: implicit
declaration of function 'mwifiex_is_pattern_supported'
[-Werror=implicit-function-declaration]
drivers/net/wireless/mwifiex/cfg80211.c: At top level:
drivers/net/wireless/mwifiex/cfg80211.c:2537:12: warning:
'mwifiex_cfg80211_set_coalesce' defined but not used
[-Wunused-function]
cc1: some warnings being treated as errors
Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Bing Zhao <bzhao@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Theodore Ts'o [Mon, 12 Aug 2013 13:29:30 +0000 (09:29 -0400)]
ext4: flush the extent status cache during EXT4_IOC_SWAP_BOOT
Previously we weren't swapping only some of the extent_status LRU
fields during the processing of the EXT4_IOC_SWAP_BOOT ioctl. The
much safer thing to do is to just completely flush the extent status
tree when doing the swap.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Zheng Liu <gnehzuil.liu@gmail.com> Cc: stable@vger.kernel.org
Maksim A. Boyko [Sat, 10 Aug 2013 08:20:02 +0000 (12:20 +0400)]
ALSA: usb-audio: Fix invalid volume resolution for Logitech HD Webcam C525
Add the volume control quirk for avoiding the kernel warning
for the Logitech HD Webcam C525
as in the similar commit 36691e1be6ec551eef4a5225f126a281f8c051c2
for the Logitech HD Webcam C310.
Reported-by: Maksim Boyko <maksim.a.boyko@gmail.com> Tested-by: Maksim Boyko <maksim.a.boyko@gmail.com> Cc: <stable@vger.kernel.org> # 3.10.5+ Signed-off-by: Maksim Boyko <maksim.a.boyko@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Takashi Iwai [Thu, 8 Aug 2013 07:32:37 +0000 (09:32 +0200)]
ALSA: hda - Fix missing mute controls for CX5051
We've added a fake mute control (setting the amp volume to zero) for
CX5051 at commit [3868137e: ALSA: hda - Add a fake mute feature], but
this feature was overlooked in the generic parser implementation. Now
the driver lacks of mute controls on these codecs.
The fix is just to check both AC_AMPCAP_MUTE and AC_AMPCAP_MIN_MUTE
bits in each place checking the amp capabilities.
Commit aafe77cc45a5 (ALSA: usb-audio: add support for many Roland/Yamaha
devices) had several logic errors that prevented create_auto_midi_quirk
from enumerating any MIDI ports.
Reported-by: Keith A. Milner <maillist@superlative.org> Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Eric Dumazet [Mon, 12 Aug 2013 04:54:48 +0000 (21:54 -0700)]
af_unix: fix bug on large send()
commit e370a723632 ("af_unix: improve STREAM behavior with fragmented
memory") added a bug on large send() because the
skb_copy_datagram_from_iovec() call always start from the beginning
of iovec.
We must instead use the @sent variable to properly skip the
already processed part.
Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The problem is that the tipc_link_delete() will cancel the timer disc_timeout() when
the b_ptr->lock is hold, but the disc_timeout() still call b_ptr->lock to finish the
work, so the dead lock occurs.
We should unlock the b_ptr->lock when del the disc_timeout().
Remove link_timeout() still met the same problem, the patch:
fix the problem, so no need to send patch for fix link_timeout() deadlock warming.
Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Fix possibly wrong memcpy() bytes length since some CAN records received from
PCAN-USB could define a DLC field in range [9..15].
In that case, the real DLC value MUST be used to move forward the record pointer
but, only 8 bytes max. MUST be copied into the data field of the struct
can_frame object of the skb given to the network core.
Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
In the first case, macvtap_put_user() calls macvlan_count_rx()
in a preempt-able context, and this is not allowed.
In the second case, macvtap_get_user() calls
macvlan_start_xmit() with BH enabled, and this is not allowed.
Reported-by: Thomas Huth <thuth@linux.vnet.ibm.com> Bisected-by: Thomas Huth <thuth@linux.vnet.ibm.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Thomas Huth <thuth@linux.vnet.ibm.com> Cc: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 11 Aug 2013 23:32:26 +0000 (16:32 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is three bug fixes: An fnic warning caused by sleeping under a
lock, a major regression with our updated WRITE SAME/UNMAP logic which
caused tons of USB devices (and one RAID card) to cease to function
and a megaraid_sas firmware initialisation problem which causes kdump
failures"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
[SCSI] Don't attempt to send extended INQUIRY command if skip_vpd_pages is set
[SCSI] fnic: BUG: sleeping function called from invalid context during probe
[SCSI] megaraid_sas: megaraid_sas driver init fails in kdump kernel
Linus Torvalds [Sun, 11 Aug 2013 19:12:39 +0000 (12:12 -0700)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc fixes from Ben Herrenschmidt:
"This includes small series from Michael Neuling to fix a couple of
nasty remaining problems with the new Power8 support, also targeted at
stable 3.10, without which some new userspace accessible registers
aren't properly context switched, and in some case, can be clobbered
by the user of transactional memory.
Along with that, a few slightly more minor things, such as a missing
Kconfig option to enable handling of denorm exceptions when not
running under a hypervisor (or userspace will randomly crash when
hitting denorms with the vector unit), some nasty bugs in the new
pstore oops code, and other simple bug fixes worth having in now.
Note: I picked up the two powerpc KVM fixes as Alex Graf asked me to
handle KVM bits while he is on vacation. However I'll let him decide
whether they should go to -stable or not when he is back"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/tm: Fix context switching TAR, PPR and DSCR SPRs
powerpc: Save the TAR register earlier
powerpc: Fix context switch DSCR on POWER8
powerpc: Rework setting up H/FSCR bit definitions
powerpc: Fix hypervisor facility unavaliable vector number
powerpc/kvm/book3s_pr: Return appropriate error when allocation fails
powerpc/kvm: Add signed type cast for comparation
powerpc/eeh: Add missing procfs entry for PowerNV
powerpc/pseries: Add backward compatibilty to read old kernel oops-log
powerpc/pseries: Fix buffer overflow when reading from pstore
powerpc: On POWERNV enable PPC_DENORMALISATION by default
Linus Torvalds [Sun, 11 Aug 2013 19:11:33 +0000 (12:11 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull s390 kvm fixes from Paolo Bonzini:
"Two fixes for s390"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: s390: fix pfmf non-quiescing control handling
KVM: s390: move kvm_guest_enter,exit closer to sie
Linus Torvalds [Sun, 11 Aug 2013 19:10:47 +0000 (12:10 -0700)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Some driver bugfixes for the I2C subsystem"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: mv64xxx: Document the newly introduced allwinner compatible
i2c: Fix Kontron PLD prescaler calculation
i2c: i2c-mxs: Use DMA mode even for small transfers
Linus Torvalds [Sat, 10 Aug 2013 22:21:47 +0000 (15:21 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"These are assorted fixes, mostly from Josef nailing down xfstests
runs. Zach also has a long standing fix for problems with readdir
wrapping f_pos (or ctx->pos)
These patches were spread out over different bases, so I rebased
things on top of rc4 and retested overnight"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: don't loop on large offsets in readdir
Btrfs: check to see if root_list is empty before adding it to dead roots
Btrfs: release both paths before logging dir/changed extents
Btrfs: allow splitting of hole em's when dropping extent cache
Btrfs: make sure the backref walker catches all refs to our extent
Btrfs: fix backref walking when we hit a compressed extent
Btrfs: do not offset physical if we're compressed
Btrfs: fix extent buffer leak after backref walking
Btrfs: fix a bug of snapshot-aware defrag to make it work on partial extents
btrfs: fix file truncation if FALLOC_FL_KEEP_SIZE is specified
Linus Torvalds [Sat, 10 Aug 2013 22:20:37 +0000 (15:20 -0700)]
Merge tag 'nfs-for-3.11-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
- Stable patch for lockd to fix Oopses due to inappropriate calls to
utsname()->nodename
- Stable patches for sunrpc to fix Oopses on shutdown when using
AF_LOCAL sockets with rpcbind
- Fix memory leak and error checking issues in nfs4_proc_lookup_mountpoint
- Fix a regression with the sync mount option failing to work for nfs4
mounts
- Fix a writeback performance issue when doing cache invalidation
- Remove an incorrect call to nfs_setsecurity in nfs_fhget
* tag 'nfs-for-3.11-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFSv4: Fix up nfs4_proc_lookup_mountpoint
NFS: Remove unnecessary call to nfs_setsecurity in nfs_fhget()
NFSv4: Fix the sync mount option for nfs4 mounts
NFS: Fix writeback performance issue on cache invalidation
SUNRPC: If the rpcbind channel is disconnected, fail the call to unregister
SUNRPC: Don't auto-disconnect from the local rpcbind socket
LOCKD: Don't call utsname()->nodename from nlmclnt_setlockargs
Linus Torvalds [Sat, 10 Aug 2013 22:19:58 +0000 (15:19 -0700)]
Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
"Some fixes for a 4.1 feature that in retrospect probably should have
waited for 3.12.... But it appears to be working now"
* 'for-3.11' of git://linux-nfs.org/~bfields/linux:
nfsd: Fix SP4_MACH_CRED negotiation in EXCHANGE_ID
nfsd4: Fix MACH_CRED NULL dereference
Linus Lüssing [Tue, 6 Aug 2013 18:21:15 +0000 (20:21 +0200)]
batman-adv: fix potential kernel paging errors for unicast transmissions
There are several functions which might reallocate skb data. Currently
some places keep reusing their old ethhdr pointer regardless of whether
they became invalid after such a reallocation or not. This potentially
leads to kernel paging errors.
This patch fixes these by refetching the ethdr pointer after the
potential reallocations.
Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
David S. Miller [Sat, 10 Aug 2013 20:44:22 +0000 (13:44 -0700)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
The following patchset contains four netfilter fixes, they are:
* Fix possible invalid access and mangling of the TCPMSS option in
xt_TCPMSS. This was spotted by Julian Anastasov.
* Fix possible off by one access and mangling of the TCP packet in
xt_TCPOPTSTRIP, also spotted by Julian Anastasov.
* Fix possible information leak due to missing initialization of one
padding field of several structures that are included in nfqueue and
nflog netlink messages, from Dan Carpenter.
* Fix TCP window tracking with Fast Open, from Yuchung Cheng.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 10 Aug 2013 20:00:56 +0000 (13:00 -0700)]
Merge tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A couple of USB-audio fixes that should also go to stable kernels"
* tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: usb-audio: do not trust too-big wMaxPacketSize values
ALSA: 6fire: fix DMA issues with URB transfer_buffer usage
Yuchung Cheng [Sat, 10 Aug 2013 00:21:27 +0000 (17:21 -0700)]
netfilter: nf_conntrack: fix tcp_in_window for Fast Open
Currently the conntrack checks if the ending sequence of a packet
falls within the observed receive window. However it does so even
if it has not observe any packet from the remote yet and uses an
uninitialized receive window (td_maxwin).
If a connection uses Fast Open to send a SYN-data packet which is
dropped afterward in the network. The subsequent SYNs retransmits
will all fail this check and be discarded, leading to a connection
timeout. This is because the SYN retransmit does not contain data
payload so
end == initial sequence number (isn) + 1
sender->td_end == isn + syn_data_len
receiver->td_maxwin == 0
The fix is to only apply this check after td_maxwin is initialized.
Reported-by: Michael Chan <mcfchan@stanford.edu> Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Linus Torvalds [Sat, 10 Aug 2013 16:00:51 +0000 (09:00 -0700)]
Merge tag 'staging-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are 3 small fixes for staging/IIO drivers for 3.11-rc5. Nothing
huge, two IIO driver fixes, and a zcache fix. All of these have been
in linux-next for a while"
* tag 'staging-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: zcache: fix "zcache=" kernel parameter
iio: ti_am335x_adc: Fix wrong samples received on 1st read
iio:trigger: Fix use_count race condition
Linus Torvalds [Sat, 10 Aug 2013 16:00:21 +0000 (09:00 -0700)]
Merge tag 'usb-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are 3 small USB fixes for 3.11-rc5.
One is a fix that the ChromeOS developers ran into on some Intel
hardware, one is a build fix, and the last is a MAINTAINERS update to
help people figure out where to send USB network driver patches.
All of these have been in linux-next for a while"
* tag 'usb-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
MAINTAINERS: Add separate section for USB NETWORKING DRIVERS
usb: xhci: add missing dma-mapping.h includes
usb: core: don't try to reset_device() a port that got just disconnected
Eric Dumazet [Thu, 8 Aug 2013 21:38:47 +0000 (14:38 -0700)]
net: attempt high order allocations in sock_alloc_send_pskb()
Adding paged frags skbs to af_unix sockets introduced a performance
regression on large sends because of additional page allocations, even
if each skb could carry at least 100% more payload than before.
We can instruct sock_alloc_send_pskb() to attempt high order
allocations.
Most of the time, it does a single page allocation instead of 8.
I added an additional parameter to sock_alloc_send_pskb() to
let other users to opt-in for this new feature on followup patches.
Eric Dumazet [Thu, 8 Aug 2013 21:37:32 +0000 (14:37 -0700)]
af_unix: improve STREAM behavior with fragmented memory
unix_stream_sendmsg() currently uses order-2 allocations,
and we had numerous reports this can fail.
The __GFP_REPEAT flag present in sock_alloc_send_pskb() is
not helping.
This patch extends the work done in commit eb6a24816b247c
("af_unix: reduce high order page allocations) for
datagram sockets.
This opens the possibility of zero copy IO (splice() and
friends)
The trick is to not use skb_pull() anymore in recvmsg() path,
and instead add a @consumed field in UNIXCB() to track amount
of already read payload in the skb.
There is a performance regression for large sends
because of extra page allocations that will be addressed
in a follow-up patch, allowing sock_alloc_send_pskb()
to attempt high order page allocations.
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yuchung Cheng [Thu, 8 Aug 2013 21:06:22 +0000 (14:06 -0700)]
tcp: add server ip to encrypt cookie in fast open
Encrypt the cookie with both server and client IPv4 addresses,
such that multi-homed server will grant different cookies
based on both the source and destination IPs. No client change
is needed since cookie is opaque to the client.
Signed-off-by: Yuchung Cheng <ycheng@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Zach Brown [Thu, 11 Jul 2013 23:19:42 +0000 (16:19 -0700)]
btrfs: don't loop on large offsets in readdir
When btrfs readdir() hits the last entry it sets the readdir offset to a
huge value to stop buggy apps from breaking when the same name is
returned by readdir() with concurrent rename()s.
But unconditionally setting the offset to INT_MAX causes readdir() to
loop returning any entries with offsets past INT_MAX. It only takes a
few hours of constant file creation and removal to create entries past
INT_MAX.
So let's set the huge offset to LLONG_MAX if the last entry has already
overflowed 32bit loff_t. Without large offsets behaviour is identical.
With large offsets 64bit apps will work and 32bit apps will be no more
broken than they currently are if they see large offsets.
Signed-off-by: Zach Brown <zab@redhat.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Thu, 25 Jul 2013 19:11:47 +0000 (15:11 -0400)]
Btrfs: check to see if root_list is empty before adding it to dead roots
A user reported a panic when running with autodefrag and deleting snapshots.
This is because we could end up trying to add the root to the dead roots list
twice. To fix this check to see if we are empty before adding ourselves to the
dead roots list. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Mon, 22 Jul 2013 16:54:30 +0000 (12:54 -0400)]
Btrfs: release both paths before logging dir/changed extents
The ceph guys tripped over this bug where we were still holding onto the
original path that we used to copy the inode with when logging. This is based
on Chris's fix which was reported to fix the problem. We need to drop the paths
in two cases anyway so just move the drop up so that we don't have duplicate
code. Thanks,
Cc: stable@vger.kernel.org Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Thu, 11 Jul 2013 14:34:59 +0000 (10:34 -0400)]
Btrfs: allow splitting of hole em's when dropping extent cache
I noticed while running multi-threaded fsync tests that sometimes fsck would
complain about an improper gap. This happens because we fail to add a hole
extent to the file, which was happening when we'd split a hole EM because
btrfs_drop_extent_cache was just discarding the whole em instead of splitting
it. So this patch fixes this by allowing us to split a hole em properly, which
means that added holes actually get logged properly and we no longer see this
fsck error. Thankfully we're tolerant of these sort of problems so a user would
not see any adverse effects of this bug, other than fsck complaining. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Fri, 5 Jul 2013 18:03:47 +0000 (14:03 -0400)]
Btrfs: make sure the backref walker catches all refs to our extent
Because we don't mess with the offset into the extent for compressed we will
properly find both extents for this case
[extent a][extent b][rest of extent a]
but because we already added a ref for the front half we won't add the inode
information for the second half. This causes us to leak that memory and not
print out the other offset when we do logical-resolve. So fix this by calling
ulist_add_merge and then add our eie to the existing entry if there is one.
With this patch we get both offsets out of logical-resolve. With this and the
other 2 patches I've sent we now pass btrfs/276 on my vm with compress-force=lzo
set. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Fri, 5 Jul 2013 17:58:19 +0000 (13:58 -0400)]
Btrfs: fix backref walking when we hit a compressed extent
If you do btrfs inspect-internal logical-resolve on a compressed extent that has
been partly overwritten it won't find anything. This is because we try and
match the extent offset we've searched for based on the extent offset in the
data extent entry. However this doesn't work for compressed extents because the
offsets are for the uncompressed size, not the compressed size. So instead only
do this check if we are not compressed, that way we can get an actual entry for
the physical offset rather than nothing for compressed. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Fri, 5 Jul 2013 17:52:51 +0000 (13:52 -0400)]
Btrfs: do not offset physical if we're compressed
xfstest btrfs/276 was freaking out on slower boxes partly because fiemap was
offsetting the physical based on the extent offset. This is perfectly fine with
uncompressed extents, however the extent offset is into the uncompressed area,
not the compressed. So we can return a physical value that isn't at all within
the area we have allocated on disk. Fix this by returning the start of the
extent if it is compressed no matter what the offset. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Liu Bo [Wed, 3 Jul 2013 06:40:44 +0000 (14:40 +0800)]
Btrfs: fix extent buffer leak after backref walking
commit 47fb091fb787420cd195e66f162737401cce023f(Btrfs: fix unlock after free on rewinded tree blocks)
takes an extra increment on the reference of allocated dummy extent buffer, so now we
cannot free this dummy one, and end up with extent buffer leak.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Liu Bo [Mon, 1 Jul 2013 14:13:26 +0000 (22:13 +0800)]
Btrfs: fix a bug of snapshot-aware defrag to make it work on partial extents
For partial extents, snapshot-aware defrag does not work as expected,
since
a) we use the wrong logical offset to search for parents, which should be
disk_bytenr + extent_offset, not just disk_bytenr,
b) 'offset' returned by the backref walking just refers to key.offset, not
the 'offset' stored in btrfs_extent_data_ref which is
(key.offset - extent_offset).
The reproducer:
$ mkfs.btrfs sda
$ mount sda /mnt
$ btrfs sub create /mnt/sub
$ for i in `seq 5 -1 1`; do dd if=/dev/zero of=/mnt/sub/foo bs=5k count=1 seek=$i conv=notrunc oflag=sync; done
$ btrfs sub snap /mnt/sub /mnt/snap1
$ btrfs sub snap /mnt/sub /mnt/snap2
$ sync; btrfs filesystem defrag /mnt/sub/foo;
$ umount /mnt
$ btrfs-debug-tree sda (Here we can check whether the defrag operation is snapshot-awared.
This addresses the above two problems.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>