1. 23 Jan, 2020 2 commits
    • Linus Torvalds's avatar
      readdir: make user_access_begin() use the real access range · 3c2659bd
      Linus Torvalds authored
      In commit 9f79b78e ("Convert filldir[64]() from __put_user() to
      unsafe_put_user()") I changed filldir to not do individual __put_user()
      accesses, but instead use unsafe_put_user() surrounded by the proper
      user_access_begin/end() pair.
      That make them enormously faster on modern x86, where the STAC/CLAC
      games make individual user accesses fairly heavy-weight.
      However, the user_access_begin() range was not really the exact right
      one, since filldir() has the unfortunate problem that it needs to not
      only fill out the new directory entry, it also needs to fix up the
      previous one to contain the proper file offset.
      It's unfortunate, but the "d_off" field in "struct dirent" is _not_ the
      file offset of the directory entry itself - it's the offset of the next
      one.  So we end up backfilling the offset in the previous entry as we
      walk along.
      But since x86 didn't really care about the exact range, and used to be
      the only architecture that did anything fancy in user_access_begin() to
      begin with, the filldir[64]() changes did something lazy, and even
      commented on it:
      	 * Note! This range-checks 'previous' (which may be NULL).
      	 * The real range was checked in getdents
      	if (!user_access_begin(dirent, sizeof(*dirent)))
      		goto efault;
      and it all worked fine.
      But now 32-bit ppc is starting to also implement user_access_begin(),
      and the fact that we faked the range to only be the (possibly not even
      valid) previous directory entry becomes a problem, because ppc32 will
      actually be using the range that is passed in for more than just "check
      that it's user space".
      This is a complete rewrite of Christophe's original patch.
      By saving off the record length of the previous entry instead of a
      pointer to it in the filldir data structures, we can simplify the range
      check and the writing of the previous entry d_off field.  No need for
      any conditionals in the user accesses themselves, although we retain the
      conditional EINTR checking for the "was this the first directory entry"
      signal handling latency logic.
      Fixes: 9f79b78e ("Convert filldir[64]() from __put_user() to unsafe_put_user()")
      Link: https://lore.kernel.org/lkml/a02d3426f93f7eb04960a4d9140902d278cab0bb.1579697910.git.christophe.leroy@c-s.fr/
      Link: https://lore.kernel.org/lkml/408c90c4068b00ea8f1c41cca45b84ec23d4946b.1579783936.git.christophe.leroy@c-s.fr/Reported-and-tested-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    • Linus Torvalds's avatar
      readdir: be more conservative with directory entry names · 2c6b7bcd
      Linus Torvalds authored
      Commit 8a23eb80 ("Make filldir[64]() verify the directory entry
      filename is valid") added some minimal validity checks on the directory
      entries passed to filldir[64]().  But they really were pretty minimal.
      This fleshes out at least the name length check: we used to disallow
      zero-length names, but really, negative lengths or oevr-long names
      aren't ok either.  Both could happen if there is some filesystem
      corruption going on.
      Now, most filesystems tend to use just an "unsigned char" or similar for
      the length of a directory entry name, so even with a corrupt filesystem
      you should never see anything odd like that.  But since we then use the
      name length to create the directory entry record length, let's make sure
      it actually is half-way sensible.
      Note how POSIX states that the size of a path component is limited by
      NAME_MAX, but we actually use PATH_MAX for the check here.  That's
      because while NAME_MAX is generally the correct maximum name length
      (it's 255, for the same old "name length is usually just a byte on
      disk"), there's nothing in the VFS layer that really cares.
      So the real limitation at a VFS layer is the total pathname length you
      can pass as a filename: PATH_MAX.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  2. 18 Oct, 2019 1 commit
    • Linus Torvalds's avatar
      filldir[64]: remove WARN_ON_ONCE() for bad directory entries · b9959c7a
      Linus Torvalds authored
      This was always meant to be a temporary thing, just for testing and to
      see if it actually ever triggered.
      The only thing that reported it was syzbot doing disk image fuzzing, and
      then that warning is expected.  So let's just remove it before -rc4,
      because the extra sanity testing should probably go to -stable, but we
      don't want the warning to do so.
      Reported-by: syzbot+3031f712c7ad5dd4d926@syzkaller.appspotmail.com
      Fixes: 8a23eb80 ("Make filldir[64]() verify the directory entry filename is valid")
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  3. 07 Oct, 2019 1 commit
    • Linus Torvalds's avatar
      uaccess: implement a proper unsafe_copy_to_user() and switch filldir over to it · c512c691
      Linus Torvalds authored
      In commit 9f79b78e ("Convert filldir[64]() from __put_user() to
      unsafe_put_user()") I made filldir() use unsafe_put_user(), which
      improves code generation on x86 enormously.
      But because we didn't have a "unsafe_copy_to_user()", the dirent name
      copy was also done by hand with unsafe_put_user() in a loop, and it
      turns out that a lot of other architectures didn't like that, because
      unlike x86, they have various alignment issues.
      Most non-x86 architectures trap and fix it up, and some (like xtensa)
      will just fail unaligned put_user() accesses unconditionally.  Which
      makes that "copy using put_user() in a loop" not work for them at all.
      I could make that code do explicit alignment etc, but the architectures
      that don't like unaligned accesses also don't really use the fancy
      "user_access_begin/end()" model, so they might just use the regular old
      __copy_to_user() interface.
      So this commit takes that looping implementation, turns it into the x86
      version of "unsafe_copy_to_user()", and makes other architectures
      implement the unsafe copy version as __copy_to_user() (the same way they
      do for the other unsafe_xyz() accessor functions).
      Note that it only does this for the copying _to_ user space, and we
      still don't have a unsafe version of copy_from_user().
      That's partly because we have no current users of it, but also partly
      because the copy_from_user() case is slightly different and cannot
      efficiently be implemented in terms of a unsafe_get_user() loop (because
      gcc can't do asm goto with outputs).
      It would be trivial to do this using "rep movsb", which would work
      really nicely on newer x86 cores, but really badly on some older ones.
      Al Viro is looking at cleaning up all our user copy routines to make
      this all a non-issue, but for now we have this simple-but-stupid version
      for x86 that works fine for the dirent name copy case because those
      names are short strings and we simply don't need anything fancier.
      Fixes: 9f79b78e ("Convert filldir[64]() from __put_user() to unsafe_put_user()")
      Reported-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Reported-and-tested-by: default avatarTony Luck <tony.luck@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  4. 05 Oct, 2019 2 commits
    • Linus Torvalds's avatar
      Make filldir[64]() verify the directory entry filename is valid · 8a23eb80
      Linus Torvalds authored
      This has been discussed several times, and now filesystem people are
      talking about doing it individually at the filesystem layer, so head
      that off at the pass and just do it in getdents{64}().
      This is partially based on a patch by Jann Horn, but checks for NUL
      bytes as well, and somewhat simplified.
      There's also commentary about how it might be better if invalid names
      due to filesystem corruption don't cause an immediate failure, but only
      an error at the end of the readdir(), so that people can still see the
      filenames that are ok.
      There's also been discussion about just how much POSIX strictly speaking
      requires this since it's about filesystem corruption.  It's really more
      "protect user space from bad behavior" as pointed out by Jann.  But
      since Eric Biederman looked up the POSIX wording, here it is for context:
       "From readdir:
         The readdir() function shall return a pointer to a structure
         representing the directory entry at the current position in the
         directory stream specified by the argument dirp, and position the
         directory stream at the next entry. It shall return a null pointer
         upon reaching the end of the directory stream. The structure dirent
         defined in the <dirent.h> header describes a directory entry.
        From definitions:
         3.129 Directory Entry (or Link)
         An object that associates a filename with a file. Several directory
         entries can associate names with the same file.
         3.169 Filename
         A name consisting of 1 to {NAME_MAX} bytes used to name a file. The
         characters composing the name may be selected from the set of all
         character values excluding the slash character and the null byte. The
         filenames dot and dot-dot have special meaning. A filename is
         sometimes referred to as a 'pathname component'."
      Note that I didn't bother adding the checks to any legacy interfaces
      that nobody uses.
      Also note that if this ends up being noticeable as a performance
      regression, we can fix that to do a much more optimized model that
      checks for both NUL and '/' at the same time one word at a time.
      We haven't really tended to optimize 'memchr()', and it only checks for
      one pattern at a time anyway, and we really _should_ check for NUL too
      (but see the comment about "soft errors" in the code about why it
      currently only checks for '/')
      See the CONFIG_DCACHE_WORD_ACCESS case of hash_name() for how the name
      lookup code looks for pathname terminating characters in parallel.
      Link: https://lore.kernel.org/lkml/20190118161440.220134-2-jannh@google.com/
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Jann Horn <jannh@google.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    • Linus Torvalds's avatar
      Convert filldir[64]() from __put_user() to unsafe_put_user() · 9f79b78e
      Linus Torvalds authored
      We really should avoid the "__{get,put}_user()" functions entirely,
      because they can easily be mis-used and the original intent of being
      used for simple direct user accesses no longer holds in a post-SMAP/PAN
      Manually optimizing away the user access range check makes no sense any
      more, when the range check is generally much cheaper than the "enable
      user accesses" code that the __{get,put}_user() functions still need.
      So instead of __put_user(), use the unsafe_put_user() interface with
      user_access_{begin,end}() that really does generate better code these
      days, and which is generally a nicer interface.  Under some loads, the
      multiple user writes that filldir() does are actually quite noticeable.
      This also makes the dirent name copy use unsafe_put_user() with a couple
      of macros.  We do not want to make function calls with SMAP/PAN
      disabled, and the code this generates is quite good when the
      architecture uses "asm goto" for unsafe_put_user() like x86 does.
      Note that this doesn't bother with the legacy cases.  Nobody should use
      them anyway, so performance doesn't really matter there.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  5. 04 Jan, 2019 1 commit
    • Linus Torvalds's avatar
      Remove 'type' argument from access_ok() function · 96d4f267
      Linus Torvalds authored
      Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
      of the user address range verification function since we got rid of the
      old racy i386-only code to walk page tables by hand.
      It existed because the original 80386 would not honor the write protect
      bit when in kernel mode, so you had to do COW by hand before doing any
      user access.  But we haven't supported that in a long time, and these
      days the 'type' argument is a purely historical artifact.
      A discussion about extending 'user_access_begin()' to do the range
      checking resulted this patch, because there is no way we're going to
      move the old VERIFY_xyz interface to that model.  And it's best done at
      the end of the merge window when I've done most of my merges, so let's
      just get this done once and for all.
      This patch was mostly done with a sed-script, with manual fix-ups for
      the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.
      There were a couple of notable cases:
       - csky still had the old "verify_area()" name as an alias.
       - the iter_iov code had magical hardcoded knowledge of the actual
         values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
         really used it)
       - microblaze used the type argument for a debug printout
      but other than those oddities this should be a total no-op patch.
      I tried to fix up all architectures, did fairly extensive grepping for
      access_ok() uses, and the changes are trivial, but I may have missed
      something.  Any missed conversion should be trivially fixable, though.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  6. 02 Apr, 2018 1 commit
  7. 02 Nov, 2017 1 commit
    • Greg Kroah-Hartman's avatar
      License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman authored
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      How this work was done:
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information it it.
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from of the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      Criteria used to select files for SPDX license identifier tagging was:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
      All documentation files were explicitly excluded.
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
         For non */uapi/* files that summary was:
         SPDX license identifier                            # files
         GPL-2.0                                              11139
         and resulted in the first patch in this series.
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
         SPDX license identifier                            # files
         GPL-2.0 WITH Linux-syscall-note                        930
         and resulted in the second patch in this series.
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
         SPDX license identifier                            # files
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
         and that resulted in the third patch in this series.
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there was new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      In initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      Reviewed-by: default avatarKate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: default avatarPhilippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
  8. 10 Oct, 2017 1 commit
  9. 17 Apr, 2017 1 commit
  10. 24 Dec, 2016 1 commit
  11. 26 May, 2016 1 commit
  12. 02 May, 2016 3 commits
  13. 24 Apr, 2016 1 commit
    • Theodore Ts'o's avatar
      ext4: allow readdir()'s of large empty directories to be interrupted · 1f60fbe7
      Theodore Ts'o authored
      If a directory has a large number of empty blocks, iterating over all
      of them can take a long time, leading to scheduler warnings and users
      getting irritated when they can't kill a process in the middle of one
      of these long-running readdir operations.  Fix this by adding checks to
      ext4_readdir() and ext4_htree_fill_tree().
      This was reverted earlier due to a typo in the original commit where I
      experimented with using signal_pending() instead of
      fatal_signal_pending().  The test was in the wrong place if we were
      going to return signal_pending() since we would end up returning
      duplicant entries.  See 9f2394c9 for a more detailed explanation.
      Added fix as suggested by Linus to check for signal_pending() in
      in the filldir() functions.
      Reported-by: default avatarBenjamin LaHaise <bcrl@kvack.org>
      Google-Bug-Id: 27880676
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
  14. 22 Jan, 2016 1 commit
    • Al Viro's avatar
      wrappers for ->i_mutex access · 5955102c
      Al Viro authored
      parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
      inode_foo(inode) being mutex_foo(&inode->i_mutex).
      Please, use those for access to ->i_mutex; over the coming cycle
      ->i_mutex will become rwsem, with ->lookup() done with it held
      only shared.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
  15. 31 Oct, 2014 1 commit
  16. 04 Jun, 2014 1 commit
  17. 25 Oct, 2013 1 commit
  18. 29 Jun, 2013 4 commits
  19. 23 Feb, 2013 1 commit
  20. 27 Sep, 2012 1 commit
  21. 30 May, 2012 1 commit
  22. 29 Feb, 2012 1 commit
  23. 10 Aug, 2010 1 commit
    • Kevin Winchester's avatar
      vfs: fix warning: 'dirent' is used uninitialized in this function · 85c9fe8f
      Kevin Winchester authored
      	gcc (GCC) 4.5.0 20100610 (prerelease)
      The following warnings appear:
      	fs/readdir.c: In function `filldir64':
      	fs/readdir.c:240:15: warning: `dirent' is used uninitialized in this function
      	fs/readdir.c: In function `filldir':
      	fs/readdir.c:155:15: warning: `dirent' is used uninitialized in this function
      	fs/compat.c: In function `compat_filldir64':
      	fs/compat.c:1071:11: warning: `dirent' is used uninitialized in this function
      	fs/compat.c: In function `compat_filldir':
      	fs/compat.c:984:15: warning: `dirent' is used uninitialized in this function
      The warnings are related to the use of the NAME_OFFSET() macro.  Luckily,
      it appears as though the standard offsetof() macro is what is being
      implemented by NAME_OFFSET(), thus we can fix the warning and use a more
      standard code construct at the same time.
      Signed-off-by: default avatarKevin Winchester <kjwinchester@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  24. 14 Jan, 2009 3 commits
  25. 23 Oct, 2008 1 commit
  26. 25 Aug, 2008 1 commit
  27. 06 Dec, 2007 1 commit
  28. 08 May, 2007 2 commits
  29. 08 Dec, 2006 1 commit
  30. 03 Oct, 2006 1 commit
    • David Howells's avatar
      [PATCH] VFS: Make filldir_t and struct kstat deal in 64-bit inode numbers · afefdbb2
      David Howells authored
      These patches make the kernel pass 64-bit inode numbers internally when
      communicating to userspace, even on a 32-bit system.  They are required
      because some filesystems have intrinsic 64-bit inode numbers: NFS3+ and XFS
      for example.  The 64-bit inode numbers are then propagated to userspace
      automatically where the arch supports it.
      Problems have been seen with userspace (eg: ld.so) using the 64-bit inode
      number returned by stat64() or getdents64() to differentiate files, and
      failing because the 64-bit inode number space was compressed to 32-bits, and
      so overlaps occur.
      This patch:
      Make filldir_t take a 64-bit inode number and struct kstat carry a 64-bit
      inode number so that 64-bit inode numbers can be passed back to userspace.
      The stat functions then returns the full 64-bit inode number where
      available and where possible.  If it is not possible to represent the inode
      number supplied by the filesystem in the field provided by userspace, then
      error EOVERFLOW will be issued.
      Similarly, the getdents/readdir functions now pass the full 64-bit inode
      number to userspace where possible, returning EOVERFLOW instead when a
      directory entry is encountered that can't be properly represented.
      Note that this means that some inodes will not be stat'able on a 32-bit
      system with old libraries where they were before - but it does mean that
      there will be no ambiguity over what a 32-bit inode number refers to.
      Note similarly that directory scans may be cut short with an error on a
      32-bit system with old libraries where the scan would work before for the
      same reasons.
      It is judged unlikely that this situation will occur because modern glibc
      uses 64-bit capable versions of stat and getdents class functions
      exclusively, and that older systems are unlikely to encounter
      unrepresentable inode numbers anyway.
      [akpm: alpha build fix]
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>