1. 05 Nov, 2017 1 commit
  2. 17 Jul, 2017 1 commit
    • VFS: Differentiate mount flags (MS_*) from internal superblock flags · e462ec50
      David Howells authored
      Differentiate the MS_* flags passed to mount(2) from the internal flags set
      in the super_block's s_flags.  s_flags are now called SB_*, with the names
      and the values for the moment mirroring the MS_* flags that they're
      equivalent to.
      
      In this patch, just the headers are altered and some kernel code where
      blind automated conversion isn't necessarily correct.
      
      Note that this shows up some interesting issues:
      
       (1) Some MS_* flags get translated to MNT_* flags (such as MS_NODEV ->
           MNT_NODEV) without passing this on to the filesystem, but some
           filesystems set such flags anyway.
      
       (2) The ->remount_fs() methods of some filesystems adjust the *flags
           argument by setting MS_* flags in it, such as MS_NOATIME - but these
           flags are then scrubbed by do_remount_sb() (only the occupants of
           MS_RMT_MASK are permitted: MS_RDONLY, MS_SYNCHRONOUS, MS_MANDLOCK,
           MS_I_VERSION and MS_LAZYTIME).
      
      I'm not sure what's the best way to solve all these cases.
      Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: David Howells <dhowells@redhat.com>
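The scrubbing described in point (2) of the e462ec50 message can be sketched in userspace C. This is only an illustration: the `X_MS_*` constants are local copies whose bit values are recalled from the uapi headers of that era and should be checked against your tree, and `remount_merge_flags()` is a hypothetical stand-in for the flag merge in do_remount_sb(), not the kernel function itself.

```c
/* Illustrative copies of the mount(2) flag bits (assumed values, verify
 * against include/uapi/linux/fs.h before relying on them). */
#define X_MS_RDONLY      (1UL << 0)
#define X_MS_SYNCHRONOUS (1UL << 4)
#define X_MS_MANDLOCK    (1UL << 6)
#define X_MS_NOATIME     (1UL << 10)
#define X_MS_I_VERSION   (1UL << 23)
#define X_MS_LAZYTIME    (1UL << 25)

/* The only flags a remount is allowed to change, per the commit message. */
#define X_MS_RMT_MASK (X_MS_RDONLY | X_MS_SYNCHRONOUS | X_MS_MANDLOCK | \
                       X_MS_I_VERSION | X_MS_LAZYTIME)

/*
 * Hypothetical helper: keep every bit of s_flags outside the mask, and take
 * the bits inside the mask from what the caller (or ->remount_fs()) requested.
 * Anything else set in 'requested', such as X_MS_NOATIME, is silently dropped.
 */
static unsigned long remount_merge_flags(unsigned long s_flags,
                                         unsigned long requested)
{
    return (s_flags & ~X_MS_RMT_MASK) | (requested & X_MS_RMT_MASK);
}
```

With this merge, a filesystem that sets X_MS_NOATIME in the requested flags during remount sees no effect on the resulting s_flags, which is the issue the message points out.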
  3. 03 Apr, 2017 1 commit
  4. 09 Dec, 2016 1 commit
  5. 27 Sep, 2016 1 commit
  6. 22 Sep, 2016 1 commit
  7. 31 Jul, 2016 1 commit
  8. 28 May, 2016 1 commit
  9. 02 May, 2016 4 commits
    • introduce a parallel variant of ->iterate() · 61922694
      Al Viro authored
      New method: ->iterate_shared().  Same arguments as in ->iterate(),
      called with the directory locked only shared.  Once all filesystems
      switch, the old one will be gone.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • parallel lookups: actual switch to rwsem · 9902af79
      Al Viro authored
      ta-da!
      
      The main issue is the lack of down_write_killable(), so the places
      like readdir.c switched to plain inode_lock(); once killable
      variants of rwsem primitives appear, that'll be dealt with.
      
      lockdep side also might need more work
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • parallel lookups machinery, part 2 · 84e710da
      Al Viro authored
      We'll need to verify that there's neither a hashed nor in-lookup
      dentry with desired parent/name before adding to in-lookup set.
      
      One possible solution would be to hold the parent's ->d_lock through
      both checks, but while the in-lookup set is relatively small at any
      time, dcache is not.  And holding the parent's ->d_lock through
      something like __d_lookup_rcu() would suck too badly.
      
      So we leave the parent's ->d_lock alone, which means that we watch
      out for the following scenario:
      	* we verify that there's no hashed match
      	* existing in-lookup match gets hashed by another process
      	* we verify that there are no in-lookup matches and decide
      	  that everything's fine.
      
      Solution: per-directory kinda-sorta seqlock, bumped around the times
      we hash something that used to be in-lookup or move (and hash)
      something in place of in-lookup.  Then the above would turn into
      	* read the counter
      	* do dcache lookup
      	* if no matches found, check for in-lookup matches
      	* if there had been none of those either, check if the
      	  counter has changed; repeat if it has.
      
      The "kinda-sorta" part is due to the fact that we don't have much spare
      space in inode.  There is a spare word (shared with i_bdev/i_cdev/i_pipe),
      so the counter part is not a problem, but spinlock is a different story.
      
      We could use the parent's ->d_lock, and it would be less painful in
      terms of contention, but for __d_add() it would be rather inconvenient
      to grab; we could do that (using lock_parent()), but...
      
      Fortunately, we can get serialization on the counter itself, and it
      might be a good idea in general; we can use cmpxchg() in a loop to
      get from even to odd and smp_store_release() from odd to even.
      
      This commit adds the counter and the update logic; the readers will be
      added in the next commit.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
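The counter scheme from the 84e710da message (cmpxchg loop from even to odd, release store from odd to even, readers re-checking the counter) can be sketched with C11 atomics. This is a userspace approximation under assumptions: `struct dir_seq` and all four function names are hypothetical, and the kernel's own helpers and memory-ordering primitives differ in detail.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-directory counter: even = stable, odd = update running. */
struct dir_seq {
    atomic_uint seq;
};

/* Writer entry: loop on compare-and-swap until we move even -> odd.
 * Returns the odd value now stored, which also serializes writers. */
static unsigned dir_seq_write_begin(struct dir_seq *d)
{
    for (;;) {
        unsigned n = atomic_load_explicit(&d->seq, memory_order_relaxed);
        if (!(n & 1) &&
            atomic_compare_exchange_weak_explicit(&d->seq, &n, n + 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return n + 1;
    }
}

/* Writer exit: release store takes the counter from odd back to even. */
static void dir_seq_write_end(struct dir_seq *d, unsigned odd)
{
    atomic_store_explicit(&d->seq, odd + 1, memory_order_release);
}

/* Reader entry: wait for an even value and remember it. */
static unsigned dir_seq_read_begin(struct dir_seq *d)
{
    unsigned seq;
    do {
        seq = atomic_load_explicit(&d->seq, memory_order_acquire);
    } while (seq & 1);
    return seq;
}

/* Reader exit: true if a writer ran since read_begin, so the reader must
 * repeat both the dcache lookup and the in-lookup check. */
static bool dir_seq_read_retry(struct dir_seq *d, unsigned seq)
{
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&d->seq, memory_order_relaxed) != seq;
}
```

A reader would call dir_seq_read_begin(), do the dcache lookup, then the in-lookup check, and start over whenever dir_seq_read_retry() returns true; only a "no match in either set" result with an unchanged counter is trusted.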
  10. 11 Apr, 2016 1 commit
  11. 14 Jan, 2016 1 commit
    • Make sure that highmem pages are not added to symlink page cache · e8ecde25
      Al Viro authored
      inode_nohighmem() is sufficient to make sure that page_get_link()
      won't try to allocate a highmem page.  Moreover, it is sufficient
      to make sure that page_symlink/__page_symlink won't do the same
      thing.  However, any filesystem that manually preseeds the symlink's
      page cache upon symlink(2) needs to make sure that the page it
      inserts there won't be a highmem one.
      
      Fortunately, only nfs and shmem have run afoul of that...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  12. 30 Dec, 2015 1 commit
  13. 09 Dec, 2015 2 commits
    • replace ->follow_link() with new method that could stay in RCU mode · 6b255391
      Al Viro authored
      new method: ->get_link(); replacement of ->follow_link().  The differences
      are:
      	* inode and dentry are passed separately
      	* might be called both in RCU and non-RCU mode;
      	  the former is indicated by passing it a NULL dentry.
      	* when called that way it isn't allowed to block
      	  and should return ERR_PTR(-ECHILD) if it needs to be called
      	  in non-RCU mode.
      
      It's a flagday change - the old method is gone, all in-tree instances
      converted.  Conversion isn't hard; that said, so far very few instances
      do not immediately bail out when called in RCU mode.  That'll change
      in the next commits.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
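The calling convention described for ->get_link() in 6b255391 (NULL dentry means RCU mode, no blocking allowed, return ERR_PTR(-ECHILD) to ask for a non-RCU retry) can be mocked in userspace. Everything below is an assumption made for illustration: the `demo_*` types and function are hypothetical, the error-pointer helpers only imitate the kernel's, and the real method takes further arguments that are omitted here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace imitations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline int IS_ERR(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-4095;
}

/* Hypothetical stand-ins for struct dentry and struct inode. */
struct demo_dentry { int unused; };
struct demo_inode { const char *cached_target; };

/*
 * Hypothetical ->get_link() instance. dentry == NULL signals RCU mode:
 * the function may only use what is already cached and must not block.
 */
static const char *demo_get_link(struct demo_dentry *dentry,
                                 struct demo_inode *inode)
{
    /* Fast path that never blocks, so it is fine in both modes. */
    if (inode->cached_target)
        return inode->cached_target;

    /* RCU mode and nothing cached: ask the caller to retry in non-RCU mode. */
    if (!dentry)
        return ERR_PTR(-ECHILD);

    /* Non-RCU mode: a real filesystem could block here to read the body. */
    inode->cached_target = "target";
    return inode->cached_target;
}
```

An instance that always returns ERR_PTR(-ECHILD) for a NULL dentry corresponds to the "immediately bail out" behaviour the message mentions; the cached fast path shows what staying in RCU mode could look like.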
    • don't put symlink bodies in pagecache into highmem · 21fc61c7
      Al Viro authored
      kmap() in page_follow_link_light() needed to go - allowing an
      arbitrary number of kmaps to be held for a long time is a great way
      to deadlock the system.
      
      new helper (inode_nohighmem(inode)) needs to be used for pagecache
      symlinks inodes; done for all in-tree cases.  page_follow_link_light()
      instrumented to yell about anything missed.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  14. 01 Jul, 2015 1 commit
  15. 04 Jun, 2015 1 commit
  16. 15 May, 2015 1 commit
  17. 12 Apr, 2015 2 commits
  18. 19 Nov, 2014 2 commits
  19. 03 Apr, 2014 1 commit
    • mm + fs: store shadow entries in page cache · 91b0abe3
      Johannes Weiner authored
      Reclaim will be leaving shadow entries in the page cache radix tree upon
      evicting the real page.  As those pages are found from the LRU, an
      iput() can lead to the inode being freed concurrently.  At this point,
      reclaim must no longer install shadow pages because the inode freeing
      code needs to ensure the page tree is really empty.
      
      Add an address_space flag, AS_EXITING, that the inode freeing code sets
      under the tree lock before doing the final truncate.  Reclaim will check
      for this flag before installing shadow pages.
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Reviewed-by: Minchan Kim <minchan@kernel.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Bob Liu <bob.liu@oracle.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Metin Doslu <metin@citusdata.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Ozgun Erdogan <ozgun@citusdata.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Roman Gushchin <klamm@yandex-team.ru>
      Cc: Ryan Mallon <rmallon@gmail.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  20. 09 Nov, 2013 1 commit
  21. 10 Sep, 2013 2 commits
  22. 29 Jun, 2013 2 commits
  23. 26 Feb, 2013 1 commit
    • vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op · ecf3d1f1
      Jeff Layton authored
      The following set of operations on an NFS client and server will cause
      a stale file handle error on the client:
      
          server# mkdir a
          client# cd a
          server# mv a a.bak
          client# sleep 30  # (or whatever the dir attrcache timeout is)
          client# stat .
          stat: cannot stat `.': Stale NFS file handle
      
      Obviously, we should not be getting an ESTALE error back there since the
      inode still exists on the server. The problem is that the lookup code
      will call d_revalidate on the dentry that "." refers to, because NFS has
      FS_REVAL_DOT set.
      
      nfs_lookup_revalidate will see that the parent directory has changed and
      will try to reverify the dentry by redoing a LOOKUP. That of course
      fails, so the lookup code returns ESTALE.
      
      The problem here is that d_revalidate is really a bad fit for this case.
      What we really want to know at this point is whether the inode is still
      good or not, but we don't really care what name it goes by or whether
      the dcache is still valid.
      
      Add a new d_op->d_weak_revalidate operation and have complete_walk call
      that instead of d_revalidate. The intent there is to allow for a
      "weaker" d_revalidate that just checks to see whether the inode is still
      good. This also gives us an opportunity to kill off the FS_REVAL_DOT
      special casing.
      
      [AV: changed method name, added note in porting, fixed confusion re
      having it possibly called from RCU mode (it won't be)]
      
      Cc: NeilBrown <neilb@suse.de>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  24. 20 Dec, 2012 1 commit
  25. 03 Aug, 2012 1 commit
  26. 14 Jul, 2012 4 commits
  27. 06 May, 2012 1 commit
  28. 21 Mar, 2012 1 commit
  29. 25 Jul, 2011 1 commit