    readdir: make user_access_begin() use the real access range · 3c2659bd
    Linus Torvalds authored
    In commit 9f79b78e ("Convert filldir[64]() from __put_user() to
    unsafe_put_user()") I changed filldir to not do individual __put_user()
    accesses, but instead use unsafe_put_user() surrounded by the proper
    user_access_begin/end() pair.
    
    That makes them enormously faster on modern x86, where the STAC/CLAC
    games make individual user accesses fairly heavy-weight.
    
    However, the user_access_begin() range was not really the exact right
    one, since filldir() has the unfortunate problem that it needs to not
    only fill out the new directory entry, it also needs to fix up the
    previous one to contain the proper file offset.
    
    It's unfortunate, but the "d_off" field in "struct dirent" is _not_ the
    file offset of the directory entry itself - it's the offset of the next
    one.  So we end up backfilling the offset in the previous entry as we
    walk along.
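
To illustrate the backfilling, here is a minimal userspace sketch. It is not the kernel code: `struct demo_dirent`, `struct demo_buf`, `demo_filldir()` and `demo_finish()` are hypothetical names, records are fixed-size for simplicity, and plain stores stand in for the user-space accessors. Each call writes the new entry and patches the offset it was handed into the previous entry's `d_off`. The last entry is fixed up by the caller once iteration ends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified record; not the kernel's directory entry layout. */
struct demo_dirent {
	uint64_t d_ino;
	int64_t  d_off;       /* offset of the NEXT entry, not of this one */
	char     d_name[16];
};

struct demo_buf {
	struct demo_dirent *current;   /* slot for the next record */
	struct demo_dirent *previous;  /* last record written, or NULL */
	int count;                     /* free slots remaining */
};

/* Emit the entry found at directory offset 'offset'. */
static int demo_filldir(struct demo_buf *buf, const char *name,
			int64_t offset, uint64_t ino)
{
	struct demo_dirent *d = buf->current;

	if (buf->count <= 0)
		return -1;

	/* Backfill: only now do we know where the entry after 'previous' starts. */
	if (buf->previous)
		buf->previous->d_off = offset;

	d->d_ino = ino;
	d->d_off = 0;	/* placeholder until the next call (or the caller) fixes it */
	strncpy(d->d_name, name, sizeof(d->d_name) - 1);
	d->d_name[sizeof(d->d_name) - 1] = '\0';

	buf->previous = d;
	buf->current = d + 1;
	buf->count--;
	return 0;
}

/* After iteration, the caller stores the final position in the last entry. */
static void demo_finish(struct demo_buf *buf, int64_t final_pos)
{
	if (buf->previous)
		buf->previous->d_off = final_pos;
}
```

Note that the write to `previous` touches memory outside the record currently being filled in, which is why a range covering only one of the two records is not the whole story.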
    
    But since x86 didn't really care about the exact range, and used to be
    the only architecture that did anything fancy in user_access_begin() to
    begin with, the filldir[64]() changes did something lazy, and even
    commented on it:
    
    	/*
    	 * Note! This range-checks 'previous' (which may be NULL).
    	 * The real range was checked in getdents
    	 */
    	if (!user_access_begin(dirent, sizeof(*dirent)))
    		goto efault;
    
    and it all worked fine.
    
    But now 32-bit ppc is starting to also implement user_access_begin(),
    and the fact that we faked the range to only be the (possibly not even
    valid) previous directory entry becomes a problem, because ppc32 will
    actually be using the range that is passed in for more than just "check
    that it's user space".
    
    This is a complete rewrite of Christophe's original patch.
    
    By saving off the record length of the previous entry instead of a
    pointer to it in the filldir data structures, we can simplify the range
    check and the writing of the previous entry d_off field.  No need for
    any conditionals in the user accesses themselves, although we retain the
    conditional EINTR checking for the "was this the first directory entry"
    signal handling latency logic.
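
The idea can be sketched in userspace as follows. This is an illustration under assumptions, not the patch itself: `struct demo_hdr`, `struct demo_ctx`, `demo_range_ok()` and `demo_emit()` are hypothetical, the bounds check merely stands in for user_access_begin(), and memcpy() stands in for unsafe_put_user(). With the previous record length stored (zero for the first entry), the previous record's address is simply "current minus prev_reclen", a single range check covers both records, and the d_off store needs no conditional: for the first entry it lands in the current record and is overwritten right away.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header; the NUL-terminated name follows it directly. */
struct demo_hdr {
	uint64_t d_ino;
	int64_t  d_off;
	uint16_t d_reclen;
};

struct demo_ctx {
	unsigned char *base;  /* start of the output buffer */
	size_t size;          /* total buffer size in bytes */
	size_t pos;           /* byte offset where the next record starts */
	size_t prev_reclen;   /* length of the previous record, 0 if none */
};

/* Stand-in for the access check: is [start, start + len) inside the buffer? */
static int demo_range_ok(const struct demo_ctx *c, size_t start, size_t len)
{
	return start <= c->size && len <= c->size - start;
}

static int demo_emit(struct demo_ctx *c, const char *name,
		     int64_t offset, uint64_t ino)
{
	size_t namlen = strlen(name);
	/* Header + name + NUL, rounded up to a multiple of 8 bytes. */
	size_t reclen = (sizeof(struct demo_hdr) + namlen + 1 + 7) & ~(size_t)7;
	size_t prev = c->pos - c->prev_reclen;  /* equals pos for the first entry */
	struct demo_hdr h;

	/* One check spanning the previous record and the new one. */
	if (!demo_range_ok(c, prev, c->prev_reclen + reclen))
		return -1;

	/* Unconditional backfill; if prev == pos it is overwritten just below. */
	memcpy(c->base + prev + offsetof(struct demo_hdr, d_off),
	       &offset, sizeof(offset));

	memset(&h, 0, sizeof(h));
	h.d_ino = ino;
	h.d_reclen = (uint16_t)reclen;
	memcpy(c->base + c->pos, &h, sizeof(h));
	memcpy(c->base + c->pos + sizeof(h), name, namlen + 1);

	c->prev_reclen = reclen;
	c->pos += reclen;
	return 0;
}
```

The checked range in this sketch is exactly the bytes that get written, which is the property an architecture needs if it uses the range for more than a "is this user space" test.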
    
    Fixes: 9f79b78e ("Convert filldir[64]() from __put_user() to unsafe_put_user()")
    Link: https://lore.kernel.org/lkml/a02d3426f93f7eb04960a4d9140902d278cab0bb.1579697910.git.christophe.leroy@c-s.fr/
    Link: https://lore.kernel.org/lkml/408c90c4068b00ea8f1c41cca45b84ec23d4946b.1579783936.git.christophe.leroy@c-s.fr/
    Reported-and-tested-by: Christophe Leroy <christophe.leroy@c-s.fr>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>