    writeback, cgroup: remove wb from offline list before releasing refcnt · b43a9e76
    Roman Gushchin authored
    Boyang reported that the commit c22d70a1 ("writeback, cgroup:
    release dying cgwbs by switching attached inodes") causes the kernel to
    crash while running xfstests generic/256 on ext4 on aarch64 and ppc64le.
    
      run fstests generic/256 at 2021-07-12 05:41:40
      EXT4-fs (vda3): mounted filesystem with ordered data mode. Opts: . Quota mode: none.
      Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
      Mem abort info:
         ESR = 0x96000005
         EC = 0x25: DABT (current EL), IL = 32 bits
         SET = 0, FnV = 0
         EA = 0, S1PTW = 0
         FSC = 0x05: level 1 translation fault
      Data abort info:
         ISV = 0, ISS = 0x00000005
         CM = 0, WnR = 0
      user pgtable: 64k pages, 48-bit VAs, pgdp=00000000b0502000
      [0000000000000000] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
      Internal error: Oops: 96000005 [#1] SMP
      Modules linked in: dm_flakey dm_snapshot dm_bufio dm_zero dm_mod loop tls rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc ext4 vfat fat mbcache jbd2 drm fuse xfs libcrc32c crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce virtio_blk virtio_net net_failover virtio_console failover virtio_mmio aes_neon_bs [last unloaded: scsi_debug]
      CPU: 0 PID: 408468 Comm: kworker/u8:5 Tainted: G X --------- ---  5.14.0-0.rc1.15.bx.el9.aarch64 #1
      Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
      Workqueue: events_unbound cleanup_offline_cgwbs_workfn
      pstate: 004000c5 (nzcv daIF +PAN -UAO -TCO BTYPE=--)
      pc : cleanup_offline_cgwbs_workfn+0x320/0x394
      lr : cleanup_offline_cgwbs_workfn+0xe0/0x394
      sp : ffff80001554fd10
      x29: ffff80001554fd10 x28: 0000000000000000 x27: 0000000000000001
      x26: 0000000000000000 x25: 00000000000000e0 x24: ffffd2a2fbe671a8
      x23: ffff80001554fd88 x22: ffffd2a2fbe67198 x21: ffffd2a2fc25a730
      x20: ffff210412bc3000 x19: ffff210412bc3280 x18: 0000000000000000
      x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
      x14: 0000000000000000 x13: 0000000000000030 x12: 0000000000000040
      x11: ffff210481572238 x10: ffff21048157223a x9 : ffffd2a2fa276c60
      x8 : ffff210484106b60 x7 : 0000000000000000 x6 : 000000000007d18a
      x5 : ffff210416a86400 x4 : ffff210412bc0280 x3 : 0000000000000000
      x2 : ffff80001554fd88 x1 : ffff210412bc0280 x0 : 0000000000000003
      Call trace:
         cleanup_offline_cgwbs_workfn+0x320/0x394
         process_one_work+0x1f4/0x4b0
         worker_thread+0x184/0x540
         kthread+0x114/0x120
         ret_from_fork+0x10/0x18
      Code: d63f0020 97f99963 17ffffa6 f8588263 (f9400061)
      ---[ end trace e250fe289272792a ]---
      Kernel panic - not syncing: Oops: Fatal exception
      SMP: stopping secondary CPUs
      SMP: failed to stop secondary CPUs 0-2
      Kernel Offset: 0x52a2e9fa0000 from 0xffff800010000000
      PHYS_OFFSET: 0xfff0defca0000000
      CPU features: 0x00200251,23200840
      Memory Limit: none
      ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---
    
    The problem happens when cgwb_release_workfn() races with
    cleanup_offline_cgwbs_workfn(): wb_tryget() in
    cleanup_offline_cgwbs_workfn() can be called after percpu_ref_exit() has
    been called in cgwb_release_workfn(), which is effectively a
    use-after-free error.
    
    Fix the problem by removing the writeback structure from the offline
    list before releasing the percpu reference counter.  This guarantees
    that cleanup_offline_cgwbs_workfn() will neither see nor access
    writeback structures which are about to be released.
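
    The ordering argument can be illustrated with a small userspace model.
    This is only a sketch, not the kernel code: struct wb_model, the
    ref_exited flag, and every function name below are hypothetical
    stand-ins, locking is omitted, and the cleanup scan is forced to run
    between the two release steps to model the worst-case interleaving.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a cgroup writeback structure. */
struct wb_model {
    struct wb_model *next;   /* link in the modelled offline list */
    bool ref_exited;         /* true once the modelled percpu_ref_exit() ran */
};

/* Modelled offline list; the kernel protects the real one with a lock. */
static struct wb_model *offline_list;

static void offline_list_add(struct wb_model *wb)
{
    wb->next = offline_list;
    offline_list = wb;
}

static void offline_list_del(struct wb_model *wb)
{
    struct wb_model **pp = &offline_list;

    while (*pp && *pp != wb)
        pp = &(*pp)->next;
    if (*pp)
        *pp = wb->next;
    wb->next = NULL;
}

/*
 * Models the cleanup worker: walks the offline list and counts entries
 * whose refcount has already been torn down. Touching such an entry is
 * the use-after-free case described in the text.
 */
static int cleanup_scan_count_stale(void)
{
    int stale = 0;

    for (struct wb_model *wb = offline_list; wb; wb = wb->next)
        if (wb->ref_exited)
            stale++;
    return stale;
}

enum release_order { EXIT_REF_THEN_UNLINK, UNLINK_THEN_EXIT_REF };

/*
 * Models the release worker as two steps with a cleanup scan between
 * them. Returns how many stale entries the scan would have touched.
 */
static int release_with_scan_between(struct wb_model *wb,
                                     enum release_order order)
{
    int stale;

    if (order == UNLINK_THEN_EXIT_REF) {
        offline_list_del(wb);      /* fixed order: hide the wb first */
        stale = cleanup_scan_count_stale();
        wb->ref_exited = true;
    } else {
        wb->ref_exited = true;     /* buggy order: refcount dies first */
        stale = cleanup_scan_count_stale();
        offline_list_del(wb);
    }
    return stale;
}
```

    In this model the scan can reach a torn-down entry only in the
    EXIT_REF_THEN_UNLINK order; once the entry is unlinked first, the scan
    has no path to it.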
    
    Link: https://lkml.kernel.org/r/20210716201039.3762203-1-guro@fb.com
    Fixes: c22d70a1 ("writeback, cgroup: release dying cgwbs by switching attached inodes")
    Signed-off-by: Roman Gushchin <guro@fb.com>
    Reported-by: Boyang Xue <bxue@redhat.com>
    Suggested-by: Jan Kara <jack@suse.cz>
    Tested-by: Darrick J. Wong <djwong@kernel.org>
    Cc: Will Deacon <will@kernel.org>
    Cc: Dave Chinner <dchinner@redhat.com>
    Cc: Murphy Zhou <jencce.kernel@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>