  1. 01 Nov, 2017 1 commit
  2. 09 Feb, 2016 2 commits
    • xfs: remove timestamps from incore inode · 3987848c
      Dave Chinner authored
      The struct xfs_inode has two copies of the current timestamps in it,
      one in the vfs inode and one in the struct xfs_icdinode. Now that we
      no longer log the struct xfs_icdinode directly, we don't need to
      keep the timestamps in this structure. Instead we can copy them
      straight out of the VFS inode when formatting the inode log item or
      the on-disk inode.
      
      This reduces the struct xfs_inode in size by 24 bytes.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
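A minimal sketch of the idea in the commit above, not the kernel code. All `sk_` names are hypothetical, and the layout of the logged timestamp (two 32-bit fields) is an assumption; under that assumption, three such timestamps would account for the 24 bytes quoted. The timestamps are read from the VFS inode only at format time, so no second in-core copy is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the timestamps kept in the VFS inode. */
struct sk_timespec {
	int64_t tv_sec;
	int64_t tv_nsec;
};

struct sk_vfs_inode {
	struct sk_timespec i_atime, i_mtime, i_ctime;
};

/* Assumed layout of a logged timestamp: two 32-bit fields, 8 bytes. */
struct sk_log_timestamp {
	int32_t t_sec;
	int32_t t_nsec;
};

/* Only the timestamp part of a log-format inode is modelled here. */
struct sk_log_dinode {
	struct sk_log_timestamp di_atime, di_mtime, di_ctime;
};

/* Three 8-byte timestamps: the 24 bytes no longer duplicated in-core. */
static_assert(3 * sizeof(struct sk_log_timestamp) == 24,
	      "three logged timestamps occupy 24 bytes");

/* Copy the timestamps straight out of the VFS inode at format time. */
static void sk_format_timestamps(const struct sk_vfs_inode *inode,
				 struct sk_log_dinode *to)
{
	to->di_atime.t_sec = (int32_t)inode->i_atime.tv_sec;
	to->di_atime.t_nsec = (int32_t)inode->i_atime.tv_nsec;
	to->di_mtime.t_sec = (int32_t)inode->i_mtime.tv_sec;
	to->di_mtime.t_nsec = (int32_t)inode->i_mtime.tv_nsec;
	to->di_ctime.t_sec = (int32_t)inode->i_ctime.tv_sec;
	to->di_ctime.t_nsec = (int32_t)inode->i_ctime.tv_nsec;
}
```
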
    • xfs: introduce inode log format object · f8d55aa0
      Dave Chinner authored
      We currently carry around and log an entire inode core in the
      struct xfs_inode. A lot of the information in the inode core is
      duplicated in the VFS inode, but we cannot remove this duplication
      of information because the inode core is logged directly in
      xfs_inode_item_format().
      
      Add a new function xfs_inode_item_format_core() that copies the
      inode core data into a struct xfs_icdinode that is pulled directly
      from the log vector buffer. This means we no longer directly
      copy the inode core, but copy the structures one member at a time.
      This will be slightly less efficient than copying, but will allow us
      to remove duplicate and unnecessary items from the struct xfs_inode.
      
      To enable us to do this, call the new structure a xfs_log_dinode,
      so that we know it's different to the physical xfs_dinode and the
      in-core xfs_icdinode.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
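The member-at-a-time copy described in the commit above could look roughly like the following. This is a sketch under stated assumptions, not the real xfs_inode_item_format_core(): the `sk_` structures are hypothetical and heavily reduced, and which fields come from the VFS inode versus the in-core structure is chosen only for illustration. The point is that once the log format is a separate structure filled field by field, the source of each field can change without changing what is logged.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SK_DINODE_MAGIC 0x494e	/* illustrative magic value ("IN") */

/* Hypothetical, reduced stand-ins for the three kinds of inode structure. */
struct sk_vfs_inode {		/* generic VFS state */
	uint16_t i_mode;
	uint32_t i_nlink;
};

struct sk_icdinode {		/* filesystem-specific in-core state */
	uint64_t di_size;
	uint64_t di_nblocks;
};

struct sk_log_dinode {		/* what is written into the log */
	uint16_t di_magic;
	uint16_t di_mode;
	uint32_t di_nlink;
	uint64_t di_size;
	uint64_t di_nblocks;
};

/*
 * Fill the log-format structure one member at a time. A plain memcpy()
 * of the in-core structure is no longer possible, because the fields
 * come from two different sources.
 */
static void sk_inode_item_format_core(const struct sk_vfs_inode *inode,
				      const struct sk_icdinode *from,
				      struct sk_log_dinode *to)
{
	assert(inode && from && to);

	memset(to, 0, sizeof(*to));
	to->di_magic = SK_DINODE_MAGIC;
	to->di_mode = inode->i_mode;		/* taken from the VFS inode */
	to->di_nlink = inode->i_nlink;
	to->di_size = from->di_size;		/* taken from the in-core inode */
	to->di_nblocks = from->di_nblocks;
}
```
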
  3. 03 Nov, 2015 1 commit
    • xfs: optimise away log forces on timestamp updates for fdatasync · fc0561ce
      Dave Chinner authored
      xfs: timestamp updates cause excessive fdatasync log traffic
      
      Sage Weil reported that a ceph test workload was writing to the
      log on every fdatasync during an overwrite workload. Event tracing
      showed that the only metadata modification being made was the
      timestamp updates during the write(2) syscall, but fdatasync(2)
      is supposed to ignore them. The key observation was that the
      transactions in the log all looked like this:
      
      INODE: #regs: 4   ino: 0x8b  flags: 0x45   dsize: 32
      
      They contained a flags field of 0x45 or 0x85, and had data and
      attribute forks following the inode core. This means that the
      timestamp updates were triggering dirty relogging of previously
      logged parts of the inode that hadn't yet been flushed back to
      disk.
      
      There are two parts to this problem. The first is that XFS relogs
      dirty regions in subsequent transactions, so it carries around the
      fields that have been dirtied since the last time the inode was
      written back to disk, not since the last time the inode was forced
      into the log.
      
      The second part is that on v5 filesystems, the inode change count
      update during inode dirtying also sets the XFS_ILOG_CORE flag, so
      on v5 filesystems this makes a timestamp update dirty the entire
      inode.
      
      As a result, when fdatasync is run, it looks at the dirty fields in
      the inode, and sees more than just the timestamp flag, even though
      the only metadata change since the last fdatasync was just the
      timestamps. Hence we force the log on every subsequent fdatasync
      even though it is not needed.
      
      To fix this, add a new field to the inode log item that tracks
      changes since the last time fsync/fdatasync forced the log to flush
      the changes to the journal. This field is updated when we dirty the
      inode, but we do it before updating the change count so it does not
      carry the "core dirty" flag from timestamp updates. The field is
      zeroed when the inode is marked clean (due to writeback/freeing) or
      when an fsync/fdatasync forces the log. Hence if we only dirty the
      timestamps on the inode between fsync/fdatasync calls, the fdatasync
      will not trigger another log force.
      
      Over 100 runs of the test program:
      
      Ext4 baseline:
              runtime: 1.63s +/- 0.24s
              avg lat: 1.59ms +/- 0.24ms
              iops: ~2000
      
      XFS, vanilla kernel:
              runtime: 2.45s +/- 0.18s
              avg lat: 2.39ms +/- 0.18ms
              log forces: ~400/s
              iops: ~1000
      
      XFS, patched kernel:
              runtime: 1.49s +/- 0.26s
              avg lat: 1.46ms +/- 0.25ms
              log forces: ~30/s
              iops: ~1500
      Reported-by: Sage Weil <sage@redhat.com>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
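The two-field tracking described in the commit above can be sketched as follows. This is a simplified model, not the kernel implementation: the `sk_` names and flag values are hypothetical, the change-count update is reduced to "set the core flag when `v5` is true", and a log force is modelled as a counter. It shows why recording the dirty flags before the change-count update lets fdatasync skip the log force when only timestamps have changed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values; the names only mirror the commit message. */
#define SK_ILOG_CORE      0x0001u
#define SK_ILOG_DEXT      0x0004u
#define SK_ILOG_TIMESTAMP 0x4000u

struct sk_inode_log_item {
	uint32_t fields;	/* dirtied since the last writeback to disk */
	uint32_t fsync_fields;	/* dirtied since the last fsync log force */
	unsigned int log_forces;/* stand-in for real log forces */
};

/* Dirty the inode in a transaction. */
static void sk_log_inode(struct sk_inode_log_item *iip, uint32_t flags,
			 bool v5)
{
	assert(iip);

	/* Record what the caller dirtied before the change count update. */
	iip->fsync_fields |= flags;

	/* Assumed v5 behaviour: the change count update dirties the core. */
	if (v5)
		flags |= SK_ILOG_CORE;

	iip->fields |= flags;
}

/* fdatasync: force the log only if more than timestamps changed. */
static bool sk_fdatasync(struct sk_inode_log_item *iip)
{
	if (!(iip->fsync_fields & ~SK_ILOG_TIMESTAMP))
		return false;

	iip->log_forces++;
	iip->fsync_fields = 0;
	return true;
}

/* Writeback or freeing marks the inode clean and resets both fields. */
static void sk_mark_clean(struct sk_inode_log_item *iip)
{
	iip->fields = 0;
	iip->fsync_fields = 0;
}
```

In this model an extent change followed by fdatasync forces the log once; a later timestamp-only update still leaves the core flag set in `fields` on v5, but `fsync_fields` holds only the timestamp flag, so the next fdatasync does not force the log.
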
  4. 13 Dec, 2013 2 commits
  5. 12 Aug, 2013 1 commit
  6. 17 Dec, 2012 1 commit
  7. 14 May, 2012 1 commit
  8. 13 Mar, 2012 3 commits
  9. 26 Jul, 2010 2 commits
  10. 01 Feb, 2010 1 commit
    • xfs: Don't issue buffer IO direct from AIL push V2 · d808f617
      Dave Chinner authored
      All buffers logged into the AIL are marked as delayed write.
      When the AIL needs to push the buffer out, it issues an async write of the
      buffer. This means that IO patterns are dependent on the order of
      buffers in the AIL.
      
      Instead of flushing the buffer, promote the buffer in the delayed
      write list so that the next time the xfsbufd is run the buffer will
      be flushed by the xfsbufd. Return the state to the xfsaild that the
      buffer was promoted so that the xfsaild knows that it needs to cause
      the xfsbufd to run to flush the buffers that were promoted.
      
      Using the xfsbufd for issuing the IO allows us to dispatch all
      buffer IO from the one queue. This means that we can make much more
      enlightened decisions on what order to flush buffers to disk as
      we don't have multiple places issuing IO. Optimisations to xfsbufd
      will be in a future patch.
      
      Version 2
      - kill XFS_ITEM_FLUSHING as it is now unused.
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
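The promote-instead-of-write scheme in the commit above can be modelled with a toy queue. Everything here is a hypothetical simplification: the `sk_` names are invented, the delayed write list is a small fixed array, and "IO" is just recording buffer ids in dispatch order. The sketch only shows the division of labour: the AIL pusher reorders the queue and reports that the flusher needs to run, and all IO is issued from the single flusher path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_QUEUE_MAX 8

struct sk_buf {
	int id;
};

/* Toy delayed write list: index 0 is flushed first. */
struct sk_delwri_queue {
	struct sk_buf *bufs[SK_QUEUE_MAX];
	size_t count;
};

/* Move bp to the front of the queue. No IO is issued here. */
static bool sk_delwri_promote(struct sk_delwri_queue *q, struct sk_buf *bp)
{
	size_t i, pos = q->count;

	for (i = 0; i < q->count; i++) {
		if (q->bufs[i] == bp) {
			pos = i;
			break;
		}
	}
	if (pos == q->count)
		return false;	/* not queued, nothing to promote */

	/* Shift the entries ahead of bp back by one slot. */
	for (i = pos; i > 0; i--)
		q->bufs[i] = q->bufs[i - 1];
	q->bufs[0] = bp;
	return true;
}

/* AIL pusher: promote buffers, report whether the flusher must run. */
static bool sk_aild_push(struct sk_delwri_queue *q, struct sk_buf **items,
			 size_t nitems)
{
	bool wake_bufd = false;
	size_t i;

	for (i = 0; i < nitems; i++) {
		if (sk_delwri_promote(q, items[i]))
			wake_bufd = true;
	}
	return wake_bufd;
}

/* Flusher: the only place IO is "issued", in queue order. */
static size_t sk_bufd_flush(struct sk_delwri_queue *q, int *ids_out)
{
	size_t i, n = q->count;

	assert(n <= SK_QUEUE_MAX);
	for (i = 0; i < n; i++)
		ids_out[i] = q->bufs[i]->id;
	q->count = 0;
	return n;
}
```
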
  11. 16 Dec, 2009 1 commit
  12. 01 Sep, 2009 1 commit
    • xfs: simplify xfs_trans_iget · aa72a5cf
      Christoph Hellwig authored
      xfs_trans_iget is a wrapper for xfs_iget that adds the inode to the
      transaction after it is read, except when the inode is already in the
      inode cache, in which case it returns the existing locked inode with
      incremented lock recursion counts.
      
      Now, no one in the tree ever decrements these lock recursion counts,
      so any user of this gets a potential double unlock when both the original
      owner of the inode and the xfs_trans_iget caller unlock it.  When looking
      back in a git bisect in the historic XFS tree there was only one place
      that decremented these counts, xfs_trans_iput.  It was introduced in commit
      ca25df7a840f426eb566d52667b6950b92bb84b5 by Adam Sweeney in 1993,
      and removed in commit 19f899a3ab155ff6a49c0c79b06f2f61059afaf3 by
      Steve Lord in 2003.  Unless it slipped through git bisect's cracks,
      it was never actually used in that time frame.
      
      A quick audit of the callers of xfs_trans_iget shows that, fortunately,
      no caller really relies on this behaviour - xfs_ialloc allocates this
      inode from disk so it cannot have been there before, and all the RT
      allocator routines only ever add each RT bitmap inode once.
      
      In addition to removing lots of code and reducing the size of the inode
      item, this patch also avoids the double inode cache lookup in each
      create/mkdir/mknod transaction.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Alex Elder <aelder@sgi.com>
      Signed-off-by: Felix Blyakher <felixb@sgi.com>
  13. 22 Jan, 2009 1 commit
  14. 19 Jan, 2009 1 commit
  15. 16 Jan, 2009 1 commit
  16. 30 Oct, 2008 1 commit
  17. 18 Apr, 2008 1 commit
  18. 28 Sep, 2006 1 commit
  19. 09 Jun, 2006 1 commit
  20. 02 Nov, 2005 2 commits
  21. 16 Apr, 2005 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds authored
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!