Pileus Git - ~andy/linux/commit
dlm: fix deadlock between dlm_send and dlm_controld
author David Teigland <teigland@redhat.com>
Thu, 26 Jul 2012 17:44:30 +0000 (12:44 -0500)
committer David Teigland <teigland@redhat.com>
Wed, 8 Aug 2012 16:33:35 +0000 (11:33 -0500)
commit 36b71a8bfbc92e1ba164e9aec840c0180ee933b5
tree 6cda578ba4e7e4c9de16e333ffdc032717348b81
parent 42a579a0f960081cd16fc945036e4780c3ad3202

A deadlock sometimes occurs between dlm_controld closing
a lowcomms connection through configfs and dlm_send looking
up the address for a new connection in configfs.

dlm_controld does a configfs rmdir, which calls
dlm_lowcomms_close, which in turn waits in cancel_work_sync
for in-flight dlm_send work on the workqueues to finish.

The dlm_send workqueue thread has called
tcp_connect_to_sock which calls dlm_nodeid_to_addr
which does a configfs lookup and blocks on a lock
held by dlm_controld in the rmdir path.

The solution here is to save the node addresses within
the lowcomms code so that the lowcomms workqueue does
not need to step through configfs to get a node address.
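
The idea can be illustrated with a minimal userspace sketch. It is not
the code from this patch: the names (node_addrs, addr_cache_add,
addr_cache_lookup, addr_cache_remove), the fixed-size table, and the
pthread mutex standing in for a kernel lock are all assumptions made for
illustration. The point it shows is that the lookup takes only a small
private lock, never a lock held across the configfs rmdir path.

```c
#include <pthread.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>

#define MAX_NODES 16 /* hypothetical table size */

/* One cached address per node id (hypothetical layout). */
struct node_addr {
	bool in_use;
	int nodeid;
	struct sockaddr_storage addr;
};

static struct node_addr node_addrs[MAX_NODES];
/* Private lock: held only for short copies, never while sleeping. */
static pthread_mutex_t node_addrs_lock = PTHREAD_MUTEX_INITIALIZER;

/* Configuration path: record (or overwrite) the address of a node. */
int addr_cache_add(int nodeid, const struct sockaddr_storage *addr)
{
	struct node_addr *slot = NULL;
	int rv = -1;

	pthread_mutex_lock(&node_addrs_lock);
	for (int i = 0; i < MAX_NODES; i++) {
		struct node_addr *na = &node_addrs[i];

		if (na->in_use && na->nodeid == nodeid) {
			slot = na; /* existing entry wins over a free slot */
			break;
		}
		if (!na->in_use && !slot)
			slot = na; /* remember the first free slot */
	}
	if (slot) {
		slot->in_use = true;
		slot->nodeid = nodeid;
		memcpy(&slot->addr, addr, sizeof(*addr));
		rv = 0;
	}
	pthread_mutex_unlock(&node_addrs_lock);
	return rv; /* -1 if the table is full */
}

/* Send path: copy the address out without touching configfs. */
int addr_cache_lookup(int nodeid, struct sockaddr_storage *out)
{
	int rv = -1;

	pthread_mutex_lock(&node_addrs_lock);
	for (int i = 0; i < MAX_NODES; i++) {
		if (node_addrs[i].in_use && node_addrs[i].nodeid == nodeid) {
			memcpy(out, &node_addrs[i].addr, sizeof(*out));
			rv = 0;
			break;
		}
	}
	pthread_mutex_unlock(&node_addrs_lock);
	return rv; /* -1 if the node is unknown */
}

/* Close path: forget the address of a node. */
void addr_cache_remove(int nodeid)
{
	pthread_mutex_lock(&node_addrs_lock);
	for (int i = 0; i < MAX_NODES; i++) {
		if (node_addrs[i].in_use && node_addrs[i].nodeid == nodeid)
			node_addrs[i].in_use = false;
	}
	pthread_mutex_unlock(&node_addrs_lock);
}
```

In this sketch the send-side worker would call addr_cache_lookup where it
previously went through a configfs lookup, so it can no longer block
behind the thread that is waiting for that same worker to finish.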

dlm_controld:
wait_for_completion+0x1d/0x20
__cancel_work_timer+0x1b3/0x1e0
cancel_work_sync+0x10/0x20
dlm_lowcomms_close+0x4c/0xb0 [dlm]
drop_comm+0x22/0x60 [dlm]
client_drop_item+0x26/0x50 [configfs]
configfs_rmdir+0x180/0x230 [configfs]
vfs_rmdir+0xbd/0xf0
do_rmdir+0x103/0x120
sys_rmdir+0x16/0x20

dlm_send:
mutex_lock+0x2b/0x50
get_comm+0x34/0x140 [dlm]
dlm_nodeid_to_addr+0x18/0xd0 [dlm]
tcp_connect_to_sock+0xf4/0x2d0 [dlm]
process_send_sockets+0x1d2/0x260 [dlm]
worker_thread+0x170/0x2a0

Signed-off-by: David Teigland <teigland@redhat.com>
fs/dlm/config.c
fs/dlm/config.h
fs/dlm/lowcomms.c
fs/dlm/lowcomms.h
fs/dlm/main.c