    nvme-pci: fix multiple ctrl removal scheduling · 82b057ca
    Rakesh Pandit authored
    Commit c5f6ce97 tries to prevent multiple resets from being scheduled,
    but fails because work_busy does not involve any synchronization and
    can race.  This is easily reproducible, as shown by the WARNING below,
    which is triggered by the line:
    
    WARN_ON(dev->ctrl.state == NVME_CTRL_RESETTING)
    
    Allowing multiple resets can also result in multiple controller
    removals if different conditions inside nvme_reset_work fail, which
    might deadlock on device_release_driver.
    
    [  480.327007] WARNING: CPU: 3 PID: 150 at drivers/nvme/host/pci.c:1900 nvme_reset_work+0x36c/0xec0
    [  480.327008] Modules linked in: rfcomm fuse nf_conntrack_netbios_ns nf_conntrack_broadcast...
    [  480.327044]  btusb videobuf2_core ghash_clmulni_intel snd_hwdep cfg80211 acer_wmi hci_uart..
    [  480.327065] CPU: 3 PID: 150 Comm: kworker/u16:2 Not tainted 4.12.0-rc1+ #13
    [  480.327065] Hardware name: Acer Predator G9-591/Mustang_SLS, BIOS V1.10 03/03/2016
    [  480.327066] Workqueue: nvme nvme_reset_work
    [  480.327067] task: ffff880498ad8000 task.stack: ffffc90002218000
    [  480.327068] RIP: 0010:nvme_reset_work+0x36c/0xec0
    [  480.327069] RSP: 0018:ffffc9000221bdb8 EFLAGS: 00010246
    [  480.327070] RAX: 0000000000460000 RBX: ffff880498a98128 RCX: dead000000000200
    [  480.327070] RDX: 0000000000000001 RSI: ffff8804b1028020 RDI: ffff880498a98128
    [  480.327071] RBP: ffffc9000221be50 R08: 0000000000000000 R09: 0000000000000000
    [  480.327071] R10: ffffc90001963ce8 R11: 000000000000020d R12: ffff880498a98000
    [  480.327072] R13: ffff880498a53500 R14: ffff880498a98130 R15: ffff880498a98128
    [  480.327072] FS:  0000000000000000(0000) GS:ffff8804c1cc0000(0000) knlGS:0000000000000000
    [  480.327073] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  480.327074] CR2: 00007ffcf3c37f78 CR3: 0000000001e09000 CR4: 00000000003406e0
    [  480.327074] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  480.327075] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [  480.327075] Call Trace:
    [  480.327079]  ? __switch_to+0x227/0x400
    [  480.327081]  process_one_work+0x18c/0x3a0
    [  480.327082]  worker_thread+0x4e/0x3b0
    [  480.327084]  kthread+0x109/0x140
    [  480.327085]  ? process_one_work+0x3a0/0x3a0
    [  480.327087]  ? kthread_park+0x60/0x60
    [  480.327102]  ret_from_fork+0x2c/0x40
    [  480.327103] Code: e8 5a dc ff ff 85 c0 41 89 c1 0f.....
    
    This patch addresses the problem by using the state of the controller
    to decide whether a reset should be queued, as state changes are
    synchronized using the controller spinlock.  cancel_work_sync is also
    used to make sure remove cancels the reset_work and waits for it to
    finish.  This patch also changes the return value from -ENODEV to the
    more appropriate -EBUSY if nvme_reset fails to change state.
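
The idea of gating the reset on a synchronized state change can be sketched in user space roughly as follows. This is an illustrative model only, not the kernel code: a pthread mutex stands in for the controller spinlock, a counter stands in for queueing the work item, and the names `struct ctrl`, `ctrl_change_state` and `ctrl_reset` as well as the simplified transition table are hypothetical.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space model; these names are not the kernel's. */
enum ctrl_state { CTRL_LIVE, CTRL_RESETTING, CTRL_DELETING };

struct ctrl {
    pthread_mutex_t lock;   /* stands in for the controller spinlock */
    enum ctrl_state state;
    int resets_queued;      /* stands in for queueing reset_work */
};

/*
 * Attempt a state transition under the lock. Returns true only if the
 * transition is allowed from the current state (simplified rules).
 */
static bool ctrl_change_state(struct ctrl *c, enum ctrl_state new_state)
{
    bool changed = false;

    pthread_mutex_lock(&c->lock);
    switch (new_state) {
    case CTRL_RESETTING:
        changed = (c->state == CTRL_LIVE);
        break;
    case CTRL_LIVE:
        changed = (c->state == CTRL_RESETTING);
        break;
    case CTRL_DELETING:
        changed = (c->state != CTRL_DELETING);
        break;
    }
    if (changed)
        c->state = new_state;
    pthread_mutex_unlock(&c->lock);

    return changed;
}

/*
 * Queue a reset only if this caller won the transition to RESETTING.
 * Every other concurrent caller gets -EBUSY and queues nothing.
 */
static int ctrl_reset(struct ctrl *c)
{
    if (!ctrl_change_state(c, CTRL_RESETTING))
        return -EBUSY;
    c->resets_queued++;
    return 0;
}
```

Because the check and the update of the state happen inside one critical section, at most one caller can move the controller from live to resetting. A second `ctrl_reset` call made before the first reset has completed returns -EBUSY instead of queueing another reset, which is the property an unsynchronized busy check cannot guarantee.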
    
    Fixes: c5f6ce97 ("nvme: don't schedule multiple resets")
    Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
    Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
    Signed-off-by: Christoph Hellwig <hch@lst.de>