  • xfs: ratelimit inode flush on buffered write ENOSPC · c6425702
    Darrick J. Wong authored
    
    
    A customer reported rcu stalls and softlockup warnings on a computer
    with many CPU cores and many many more IO threads trying to write to a
    filesystem that is totally out of space.  Subsequent analysis pointed to
    the many many IO threads calling xfs_flush_inodes -> sync_inodes_sb,
    which causes a lot of wb_writeback_work to be queued.  The writeback
    worker spends so much time trying to wake the many many threads waiting
    for writeback completion that it trips the softlockup detector, and (in
    this case) the system automatically reboots.
    
    In addition, they complain that the lengthy xfs_flush_inodes scan traps
    all of those threads in uninterruptible sleep, which hampers their
    ability to kill the program or do anything else to escape the situation.
    
    If there are thousands of threads trying to write to files on a full
    filesystem, each of those threads will start separate copies of the
    inode flush scan.  This is kind of pointless since we only need one
    scan, so rate limit the inode flush.
    
    Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>