1. 27 Jun, 2016 1 commit
    • Andreas Gruenbacher's avatar
      gfs2: Get rid of gfs2_ilookup · ec5ec66b
      Andreas Gruenbacher authored
      Now that gfs2_lookup_by_inum only takes the inode glock for new inodes
      (and not for cached inodes anymore), there no longer is a need to
      optimize the cached-inode case in gfs2_get_dentry or delete_work_func,
      and gfs2_ilookup can be removed.
      In addition, gfs2_get_dentry wasn't checking the GFS2_DIF_SYSTEM flag in
      i_diskflags in the gfs2_ilookup case (see gfs2_lookup_by_inum); this
      inconsistency goes away as well.
      Signed-off-by: 's avatarAndreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: 's avatarBob Peterson <rpeterso@redhat.com>
  2. 15 Mar, 2016 1 commit
    • Bob Peterson's avatar
      GFS2: Don't filter out I_FREEING inodes anymore · ff34245d
      Bob Peterson authored
      This patch basically reverts a very old patch from 2008,
      7a9f53b3, with the title
      "Alternate gfs2_iget to avoid looking up inodes being freed".
      The original patch was designed to avoid a deadlock caused by lock
      ordering with try_rgrp_unlink. The patch forced the function to not
      find inodes that were being removed by VFS. The problem is, that
      made it impossible for nodes to delete their own unlinked dinodes
      after a certain point in time, because the inode needed was not found
      by this filtering process. There is no longer a need for the patch,
      since function try_rgrp_unlink no longer locks the inode: All it does
      is queue the glock onto the delete work_queue, so there should be no
      more deadlock.
      Signed-off-by: 's avatarBob Peterson <rpeterso@redhat.com>
      Signed-off-by: 's avatarSteven Whitehouse <swhiteho@redhat.com>
  3. 15 Apr, 2015 1 commit
  4. 31 Oct, 2014 1 commit
  5. 29 Jun, 2013 2 commits
  6. 26 Feb, 2013 1 commit
  7. 10 Oct, 2012 1 commit
    • Hugh Dickins's avatar
      tmpfs,ceph,gfs2,isofs,reiserfs,xfs: fix fh_len checking · 35c2a7f4
      Hugh Dickins authored
      Fuzzing with trinity oopsed on the 1st instruction of shmem_fh_to_dentry(),
      	u64 inum = fid->raw[2];
      which is unhelpfully reported as at the end of shmem_alloc_inode():
      BUG: unable to handle kernel paging request at ffff880061cd3000
      IP: [<ffffffff812190d0>] shmem_alloc_inode+0x40/0x40
      Call Trace:
       [<ffffffff81488649>] ? exportfs_decode_fh+0x79/0x2d0
       [<ffffffff812d77c3>] do_handle_open+0x163/0x2c0
       [<ffffffff812d792c>] sys_open_by_handle_at+0xc/0x10
       [<ffffffff83a5f3f8>] tracesys+0xe1/0xe6
      Right, tmpfs is being stupid to access fid->raw[2] before validating that
      fh_len includes it: the buffer kmalloc'ed by do_sys_name_to_handle() may
      fall at the end of a page, and the next page not be present.
      But some other filesystems (ceph, gfs2, isofs, reiserfs, xfs) are being
      careless about fh_len too, in fh_to_dentry() and/or fh_to_parent(), and
      could oops in the same way: add the missing fh_len checks to those.
      Reported-by: 's avatarSasha Levin <levinsasha928@gmail.com>
      Signed-off-by: 's avatarHugh Dickins <hughd@google.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Sage Weil <sage@inktank.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: 's avatarAl Viro <viro@zeniv.linux.org.uk>
  8. 30 May, 2012 1 commit
  9. 08 Nov, 2011 1 commit
  10. 20 Apr, 2011 1 commit
    • Steven Whitehouse's avatar
      GFS2: Make writeback more responsive to system conditions · 4667a0ec
      Steven Whitehouse authored
      This patch adds writeback_control to writing back the AIL
      list. This means that we can then take advantage of the
      information we get in ->write_inode() in order to set off
      some pre-emptive writeback.
      In addition, the AIL code is cleaned up a bit to make it
      a bit simpler to understand.
      There is still more which can usefully be done in this area,
      but this is a good start at least.
      Signed-off-by: 's avatarSteven Whitehouse <swhiteho@redhat.com>
  11. 14 Mar, 2011 1 commit
  12. 13 Jan, 2011 1 commit
  13. 07 Jan, 2011 1 commit
    • Nick Piggin's avatar
      fs: dcache reduce branches in lookup path · fb045adb
      Nick Piggin authored
      Reduce some branches and memory accesses in dcache lookup by adding dentry
      flags to indicate common d_ops are set, rather than having to check them.
      This saves a pointer memory access (dentry->d_op) in common path lookup
      situations, and saves another pointer load and branch in cases where we
      have d_op but not the particular operation.
      Patched with:
      git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i
      Signed-off-by: 's avatarNick Piggin <npiggin@kernel.dk>
  14. 15 Nov, 2010 1 commit
    • Steven Whitehouse's avatar
      GFS2: Fix inode deallocation race · 044b9414
      Steven Whitehouse authored
      This area of the code has always been a bit delicate due to the
      subtleties of lock ordering. The problem is that for "normal"
      alloc/dealloc, we always grab the inode locks first and the rgrp lock
      In order to ensure no races in looking up the unlinked, but still
      allocated inodes, we need to hold the rgrp lock when we do the lookup,
      which means that we can't take the inode glock.
      The solution is to borrow the technique already used by NFS to solve
      what is essentially the same problem (given an inode number, look up
      the inode carefully, checking that it really is in the expected
      We cannot do that directly from the allocation code (lock ordering
      again) so we give the job to the pre-existing delete workqueue and
      carry on with the allocation as normal.
      If we find there is no space, we do a journal flush (required anyway
      if space from a deallocation is to be released) which should block
      against the pending deallocations, so we should always get the space
      Signed-off-by: 's avatarSteven Whitehouse <swhiteho@redhat.com>
  15. 20 Sep, 2010 1 commit
  16. 14 Apr, 2010 1 commit
    • Bob Peterson's avatar
      GFS2: glock livelock · 1a0eae88
      Bob Peterson authored
      This patch fixes a couple gfs2 problems with the reclaiming of
      unlinked dinodes.  First, there were a couple of livelocks where
      everything would come to a halt waiting for a glock that was
      seemingly held by a process that no longer existed.  In fact, the
      process did exist, it just had the wrong pid number in the holder
      information.  Second, there was a lock ordering problem between
      inode locking and glock locking.  Third, glock/inode contention
      could sometimes cause inodes to be improperly marked invalid by
      Signed-off-by: 's avatarBob Peterson <rpeterso@redhat.com>
  17. 30 Mar, 2010 1 commit
    • Tejun Heo's avatar
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo authored
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      The script does the followings.
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
      The conversion was done in the following steps.
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
      6. percpu.h was updated not to include slab.h.
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: 's avatarTejun Heo <tj@kernel.org>
      Guess-its-ok-by: 's avatarChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
  18. 08 Sep, 2009 1 commit
    • Steven Whitehouse's avatar
      GFS2: Be extra careful about deallocating inodes · acf7e244
      Steven Whitehouse authored
      There is a potential race in the inode deallocation code if two
      nodes try to deallocate the same inode at the same time. Most of
      the issue is solved by the iopen locking. There is still a small
      window which is not covered by the iopen lock. This patches fixes
      that and also makes the deallocation code more robust in the face of
      any errors in the rgrp bitmaps, or erroneous iopen callbacks from
      other nodes.
      This does introduce one extra disk read, but that is generally not
      an issue since its the same block that must be written to later
      in the deallocation process. The total disk accesses therefore stay
      the same,
      Signed-off-by: 's avatarSteven Whitehouse <swhiteho@redhat.com>
  19. 22 May, 2009 1 commit
    • Steven Whitehouse's avatar
      GFS2: Clean up some file names · b1e71b06
      Steven Whitehouse authored
      This patch renames the ops_*.c files which have no counterpart
      without the ops_ prefix in order to shorten the name and make
      it more readable. In addition, ops_address.h (which was very
      small) is moved into inode.h and inode.h is cleaned up by
      adding extern where required.
      Signed-off-by: 's avatarSteven Whitehouse <swhiteho@redhat.com>
  20. 24 Mar, 2009 1 commit
    • Steven Whitehouse's avatar
      GFS2: Merge lock_dlm module into GFS2 · f057f6cd
      Steven Whitehouse authored
      This is the big patch that I've been working on for some time
      now. There are many reasons for wanting to make this change
      such as:
       o Reducing overhead by eliminating duplicated fields between structures
       o Simplifcation of the code (reduces the code size by a fair bit)
       o The locking interface is now the DLM interface itself as proposed
         some time ago.
       o Fewer lookups of glocks when processing replies from the DLM
       o Fewer memory allocations/deallocations for each glock
       o Scope to do further optimisations in the future (but this patch is
         more than big enough for now!)
      Please note that (a) this patch relates to the lock_dlm module and
      not the DLM itself, that is still a separate module; and (b) that
      we retain the ability to build GFS2 as a standalone single node
      filesystem with out requiring the DLM.
      This patch needs a lot of testing, hence my keeping it I restarted
      my -git tree after the last merge window. That way, this has the maximum
      exposure before its merged. This is (modulo a few minor bug fixes) the
      same patch that I've been posting on and off the the last three months
      and its passed a number of different tests so far.
      Signed-off-by: 's avatarSteven Whitehouse <swhiteho@redhat.com>
  21. 05 Jan, 2009 2 commits
  22. 23 Oct, 2008 1 commit
  23. 27 Jul, 2008 1 commit
  24. 31 Mar, 2008 1 commit
    • Julia Lawall's avatar
      [GFS2] test for IS_ERR rather than 0 · 773adff8
      Julia Lawall authored
      The function gfs2_inode_lookup always returns either a valid pointer or a
      value made with ERR_PTR, so its result should be tested with IS_ERR, not
      with a test for 0.
      The problem was found using the following semantic match.
      expression E, E1;
      statement S,S1;
      position p;
      E = gfs2_inode_lookup(...)
      ... when != E = E1
      if@p (E) S else S1
      position a.p;
      expression E,E1;
      statement S,S1;
      E = NULL
      ... when != E = E1
      if@p (E) S else S1
      @depends on !n@
      expression E;
      statement S,S1;
      position a.p;
      * if@p (E)
        S else S1
      Signed-off-by: 's avatarJulia Lawall <julia@diku.dk>
      Signed-off-by: 's avatarSteven Whitehouse <swhiteho@redhat.com>
  25. 07 Feb, 2008 1 commit
  26. 22 Oct, 2007 2 commits
  27. 10 Oct, 2007 1 commit
    • Benjamin Marzinski's avatar
      [GFS2] Alternate gfs2_iget to avoid looking up inodes being freed · 7a9f53b3
      Benjamin Marzinski authored
      There is a possible deadlock between two processes on the same node, where one
      process is deleting an inode, and another process is looking for allocated but
      unused inodes to delete in order to create more space.
      process A does an iput() on inode X, and it's i_count drops to 0. This causes
      iput_final() to be called, which puts an inode into state I_FREEING at
      generic_delete_inode(). There no point between when iput_final() is called, and
      when I_FREEING is set where GFS2 could acquire any glocks. Once I_FREEING is
      set, no other process on that node can successfully look up that inode until
      the delete finishes.
      process B locks the the resource group for the same inode in get_local_rgrp(),
      which is called by gfs2_inplace_reserve_i()
      process A tries to lock the resource group for the inode in
      gfs2_dinode_dealloc(), but it's already locked by process B
      process B waits in find_inode for the inode to have the I_FREEING state cleared.
      This patch solves the problem by adding an alternative to gfs2_iget(),
      gfs2_iget_skip(), that simply skips any inodes that are in the I_FREEING
      state.o The alternate test function is just like the original one, except that
      it fails if the inode is being freed, and sets a skipped flag. The alternate
      set function is just like the original, except that it fails if the skipped
      flag is set. Only try_rgrp_unlink() calls gfs2_iget_skip() instead of
      Signed-off-by: 's avatarBenjamin E. Marzinski <bmarzins@redhat.com>
      Signed-off-by: 's avatarSteven Whitehouse <swhiteho@redhat.com>
  28. 17 Jul, 2007 1 commit
  29. 10 Jul, 2007 1 commit
    • Steven Whitehouse's avatar
      [GFS2] Accept old format NFS filehandles · 3ebf4490
      Steven Whitehouse authored
      On Tue, 2007-07-10 at 10:06 +0100, Christoph Hellwig wrote:
      > > -#define GFS2_LARGE_FH_SIZE 10
      > > -
      > > -struct gfs2_fh_obj {
      > > -   struct gfs2_inum_host this;
      > > -   u32 imode;
      > > -};
      > > +#define GFS2_LARGE_FH_SIZE 8
      > Because gfs2_decode_fh only accepts file handles with GFS2_LARGE_FH_SIZE
      > or GFS2_LARGE_FH_SIZE you don't accept filehandles sent out by and older
      > gfs version anymore.  Stale filehandles because of a new kernel version
      > are a big no-no, so please add back code to handle the old filehandles
      > on the decode side.
      This should fix that problem I think since its only relating to end of
      the fh we can just ignore that field in order to accept the older
      Signed-off-by: 's avatarSteven Whitehouse <swhiteho@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Wendy Cheng <wcheng@redhat.com>
  30. 09 Jul, 2007 4 commits
    • Wendy Cheng's avatar
      [GFS2] Remove i_mode passing from NFS File Handle · 35dcc52e
      Wendy Cheng authored
      GFS2 has been passing i_mode within NFS File Handle. Other than the
      wrong assumption that there is always room for this extra 16 bit value,
      the current gfs2_get_dentry doesn't really need the i_mode to work
      correctly. Note that GFS2 NFS code does go thru the same lookup code
      path as direct file access route (where the mode is obtained from name
      lookup) but gfs2_get_dentry() is coded for different purpose. It is not
      used during lookup time. It is part of the file access procedure call.
      When the call is invoked, if on-disk inode is not in-memory, it has to
      be read-in. This makes i_mode passing a useless overhead.
      Signed-off-by: 's avatarS. Wendy Cheng <wcheng@redhat.com>
      Signed-off-by: 's avatarSteven Whitehouse <swhiteho@redhat.com>
    • Wendy Cheng's avatar
      [GFS2] Obtaining no_formal_ino from directory entry · bb9bcf06
      Wendy Cheng authored
      GFS2 lookup code doesn't ask for inode shared glock. This implies during
      in-memory inode creation for existing file, GFS2 will not disk-read in
      the inode contents. This leaves no_formal_ino un-initialized during
      lookup time. The un-initialized no_formal_ino is subsequently encoded
      into file handle. Clients will get ESTALE error whenever it tries to
      access these files.
      Signed-off-by: 's avatarS. Wendy Cheng <wcheng@redhat.com>
      Signed-off-by: 's avatarSteven Whitehouse <swhiteho@redhat.com>
    • Steven Whitehouse's avatar
      [GFS2] Fix sign problem in quota/statfs and cleanup _host structures · bb8d8a6f
      Steven Whitehouse authored
      This patch fixes some sign issues which were accidentally introduced
      into the quota & statfs code during the endianess annotation process.
      Also included is a general clean up which moves all of the _host
      structures out of gfs2_ondisk.h (where they should not have been to
      start with) and into the places where they are actually used (often only
      one place). Also those _host structures which are not required any more
      are removed entirely (which is the eventual plan for all of them).
      The conversion routines from ondisk.c are also moved into the places
      where they are actually used, which for almost every one, was just one
      single place, so all those are now static functions. This also cleans up
      the end of gfs2_ondisk.h which no longer needs the #ifdef __KERNEL__.
      The net result is a reduction of about 100 lines of code, many functions
      now marked static plus the bug fixes as mentioned above. For good
      measure I ran the code through sparse after making these changes to
      check that there are no warnings generated.
      This fixes Red Hat bz #239686
      Signed-off-by: 's avatarSteven Whitehouse <swhiteho@redhat.com>
    • Steven Whitehouse's avatar
      [GFS2] Clean up inode number handling · dbb7cae2
      Steven Whitehouse authored
      This patch cleans up the inode number handling code. The main difference
      is that instead of looking up the inodes using a struct gfs2_inum_host
      we now use just the no_addr member of this structure. The tests relating
      to no_formal_ino can then be done by the calling code. This has
      advantages in that we want to do different things in different code
      paths if the no_formal_ino doesn't match. In the NFS patch we want to
      return -ESTALE, but in the ->lookup() path, its a bug in the fs if the
      no_formal_ino doesn't match and thus we can withdraw in this case.
      In order to later fix bz #201012, we need to be able to look up an inode
      without knowing no_formal_ino, as the only information that is known to
      us is the on-disk location of the inode in question.
      This patch will also help us to fix bz #236099 at a later date by
      cleaning up a lot of the code in that area.
      There are no user visible changes as a result of this patch and there
      are no changes to the on-disk format either.
      Signed-off-by: 's avatarSteven Whitehouse <swhiteho@redhat.com>
  31. 07 Mar, 2007 1 commit
  32. 14 Feb, 2007 1 commit
    • Tim Schmielau's avatar
      [PATCH] remove many unneeded #includes of sched.h · cd354f1a
      Tim Schmielau authored
      After Al Viro (finally) succeeded in removing the sched.h #include in module.h
      recently, it makes sense again to remove other superfluous sched.h includes.
      There are quite a lot of files which include it but don't actually need
      anything defined in there.  Presumably these includes were once needed for
      macros that used to live in sched.h, but moved to other header files in the
      course of cleaning it up.
      To ease the pain, this time I did not fiddle with any header files and only
      removed #includes from .c-files, which tend to cause less trouble.
      Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha,
      arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig,
      allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all
      configs in arch/arm/configs on arm.  I also checked that no new warnings were
      introduced by the patch (actually, some warnings are removed that were emitted
      by unnecessarily included header files).
      Signed-off-by: 's avatarTim Schmielau <tim@physik3.uni-rostock.de>
      Acked-by: 's avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: 's avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: 's avatarLinus Torvalds <torvalds@linux-foundation.org>
  33. 05 Feb, 2007 2 commits
    • Steven Whitehouse's avatar
      [GFS2] Remove local exclusive glock mode · 1c0f4872
      Steven Whitehouse authored
      Here is a patch for GFS2 to remove the local exclusive flag. In
      the places it was used, mutex's are always held earlier in the
      call path, so it appears redundant in the LM_ST_SHARED case.
      Also, the GFS2 holders were setting local exclusive in any case where
      the requested lock was LM_ST_EXCLUSIVE. So the other places in the glock
      code where the flag was tested have been replaced with tests for the
      lock state being LM_ST_EXCLUSIVE in order to ensure the logic is the
      same as before (i.e. LM_ST_EXCLUSIVE is always locally exclusive as well
      as globally exclusive).
      Signed-off-by: 's avatarSteven Whitehouse <swhiteho@redhat.com>
    • Steven Whitehouse's avatar
      [GFS2] Clean up/speed up readdir · 3699e3a4
      Steven Whitehouse authored
      This removes the extra filldir callback which gfs2 was using to
      enclose an attempt at readahead for inodes during readdir. The
      code was too complicated and also hurts performance badly in the
      case that the getdents64/readdir call isn't being followed by
      stat() and it wasn't even getting it right all the time when it
      As a result, on my test box an "ls" of a directory containing 250000
      files fell from about 7mins (freshly mounted, so nothing cached) to
      between about 15 to 25 seconds. When the directory content was cached,
      the time taken fell from about 3mins to about 4 or 5 seconds.
      Interestingly in the cached case, running "ls -l" once reduced the time
      taken for subsequent runs of "ls" to about 6 secs even without this
      patch. Now it turns out that there was a special case of glocks being
      used for prefetching the metadata, but because of the timeouts for these
      locks (set to 10 secs) the metadata was being timed out before it was
      being used and this the prefetch code was constantly trying to prefetch
      the same data over and over.
      Calling "ls -l" meant that the inodes were brought into memory and once
      the inodes are cached, the glocks are not disposed of until the inodes
      are pushed out of the cache, thus extending the lifetime of the glocks,
      and thus bringing down the time for subsequent runs of "ls"
      Signed-off-by: 's avatarSteven Whitehouse <swhiteho@redhat.com>