1. 07 Sep, 2017 1 commit
  2. 06 Jul, 2017 5 commits
  3. 19 Jun, 2017 1 commit
    • Hugh Dickins's avatar
      mm: larger stack guard gap, between vmas · 1be7107f
      Hugh Dickins authored
      Stack guard page is a useful feature to reduce a risk of stack smashing
      into a different mapping. We have been using a single page gap which
      is sufficient to prevent having stack adjacent to a different mapping.
      But this seems to be insufficient in the light of the stack usage in
      userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
      used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
      which is 256kB or stack strings with MAX_ARG_STRLEN.
      This will become especially dangerous for suid binaries and the default
      no limit for the stack size limit because those applications can be
      tricked to consume a large portion of the stack and a single glibc call
      could jump over the guard page. These attacks are not theoretical,
      Make those attacks less probable by increasing the stack guard gap
      to 1MB (on systems with 4k pages; but make it depend on the page size
      because systems with larger base pages might cap stack allocations in
      the PAGE_SIZE units) which should cover larger alloca() and VLA stack
      allocations. It is obviously not a full fix because the problem is
      somehow inherent, but it should reduce attack space a lot.
      One could argue that the gap size should be configurable from userspace,
      but that can be done later when somebody finds that the new 1MB is wrong
      for some special case applications.  For now, add a kernel command line
      option (stack_guard_gap) to specify the stack gap size (in page units).
      Implementation wise, first delete all the old code for stack guard page:
      because although we could get away with accounting one extra page in a
      stack vma, accounting a larger gap can break userspace - case in point,
      a program run with "ulimit -S -v 20000" failed when the 1MB gap was
      counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
      and strict non-overcommit mode.
      Instead of keeping gap inside the stack vma, maintain the stack guard
      gap as a gap between vmas: using vm_start_gap() in place of vm_start
      (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
      places which need to respect the gap - mainly arch_get_unmapped_area(),
      and and the vma tree's subtree_gap support for that.
      Original-patch-by: default avatarOleg Nesterov <oleg@redhat.com>
      Original-patch-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Tested-by: Helge Deller <deller@gmx.de> # parisc
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  4. 13 Jun, 2017 1 commit
  5. 02 Jun, 2017 1 commit
  6. 03 May, 2017 1 commit
  7. 23 Apr, 2017 1 commit
    • Ingo Molnar's avatar
      Revert "x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation" · 6dd29b3d
      Ingo Molnar authored
      This reverts commit 2947ba05.
      Dan Williams reported dax-pmem kernel warnings with the following signature:
         WARNING: CPU: 8 PID: 245 at lib/percpu-refcount.c:155 percpu_ref_switch_to_atomic_rcu+0x1f5/0x200
         percpu ref (dax_pmem_percpu_release [dax_pmem]) <= 0 (0) after switching to atomic
      ... and bisected it to this commit, which suggests possible memory corruption
      caused by the x86 fast-GUP conversion.
      He also pointed out:
        This is similar to the backtrace when we were not properly handling
        pud faults and was fixed with this commit: 220ced16
       "mm: fix
        get_user_pages() vs device-dax pud mappings"
        I've found some missing _devmap checks in the generic
        get_user_pages_fast() path, but this does not fix the regression
      So given that there are known bugs, and a pretty robust looking bisection
      points to this commit suggesting that are unknown bugs in the conversion
      as well, revert it for the time being - we'll re-try in v4.13.
      Reported-by: default avatarDan Williams <dan.j.williams@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: aneesh.kumar@linux.vnet.ibm.com
      Cc: dann.frazier@canonical.com
      Cc: dave.hansen@intel.com
      Cc: steve.capper@linaro.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
  8. 18 Mar, 2017 7 commits
  9. 13 Mar, 2017 1 commit
  10. 09 Mar, 2017 1 commit
  11. 02 Mar, 2017 1 commit
  12. 25 Feb, 2017 2 commits
  13. 23 Feb, 2017 1 commit
  14. 15 Dec, 2016 2 commits
    • Lorenzo Stoakes's avatar
      mm: unexport __get_user_pages_unlocked() · 8b7457ef
      Lorenzo Stoakes authored
      Unexport the low-level __get_user_pages_unlocked() function and replaces
      invocations with calls to more appropriate higher-level functions.
      In hva_to_pfn_slow() we are able to replace __get_user_pages_unlocked()
      with get_user_pages_unlocked() since we can now pass gup_flags.
      In async_pf_execute() and process_vm_rw_single_vec() we need to pass
      different tsk, mm arguments so get_user_pages_remote() is the sane
      replacement in these cases (having added manual acquisition and release
      of mmap_sem.)
      Additionally get_user_pages_remote() reintroduces use of the FOLL_TOUCH
      flag.  However, this flag was originally silently dropped by commit
      1e987790 ("mm/gup: Introduce get_user_pages_remote()"), so this
      appears to have been unintentional and reintroducing it is therefore not
      an issue.
      [akpm@linux-foundation.org: coding-style fixes]
      Link: http://lkml.kernel.org/r/20161027095141.2569-3-lstoakes@gmail.com
      Signed-off-by: default avatarLorenzo Stoakes <lstoakes@gmail.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krcmar <rkrcmar@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    • Lorenzo Stoakes's avatar
      mm: add locked parameter to get_user_pages_remote() · 5b56d49f
      Lorenzo Stoakes authored
      Patch series "mm: unexport __get_user_pages_unlocked()".
      This patch series continues the cleanup of get_user_pages*() functions
      taking advantage of the fact we can now pass gup_flags as we please.
      It firstly adds an additional 'locked' parameter to
      get_user_pages_remote() to allow for its callers to utilise
      VM_FAULT_RETRY functionality.  This is necessary as the invocation of
      __get_user_pages_unlocked() in process_vm_rw_single_vec() makes use of
      this and no other existing higher level function would allow it to do
      Secondly existing callers of __get_user_pages_unlocked() are replaced
      with the appropriate higher-level replacement -
      get_user_pages_unlocked() if the current task and memory descriptor are
      referenced, or get_user_pages_remote() if other task/memory descriptors
      are referenced (having acquiring mmap_sem.)
      This patch (of 2):
      Add a int *locked parameter to get_user_pages_remote() to allow
      VM_FAULT_RETRY faulting behaviour similar to get_u...
  15. 13 Dec, 2016 2 commits
  16. 25 Oct, 2016 1 commit
    • Lorenzo Stoakes's avatar
      mm: unexport __get_user_pages() · 0d731759
      Lorenzo Stoakes authored
      This patch unexports the low-level __get_user_pages() function.
      Recent refactoring of the get_user_pages* functions allow flags to be
      passed through get_user_pages() which eliminates the need for access to
      this function from its one user, kvm.
      We can see that the two calls to get_user_pages() which replace
      __get_user_pages() in kvm_main.c are equivalent by examining their call
          get_user_pages(start, 1, flags, page, NULL)
          __get_user_pages_locked(current, current->mm, start, 1, page, NULL, NULL,
      			    false, flags | FOLL_TOUCH)
          __get_user_pages(current, current->mm, start, 1,
      		     flags | FOLL_TOUCH | FOLL_GET, page, NULL, NULL)
          get_user_pages(addr, 1, flags, NULL, NULL)
          __get_user_pages_locked(current, current->mm, addr, 1, NULL, NULL, NULL,
      			    false, flags | FOLL_TOUCH)
          __get_user_pages(current, current->mm, addr, 1, flags | FOLL_TOUCH, NULL,
      		     NULL, NULL)
      Signed-off-by: default avatarLorenzo Stoakes <lstoakes@gmail.com>
      Acked-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  17. 19 Oct, 2016 3 commits
  18. 18 Oct, 2016 4 commits
  19. 26 Jul, 2016 3 commits
  20. 05 Jul, 2016 1 commit
    • Paolo Bonzini's avatar
      KVM: MMU: try to fix up page faults before giving up · add6a0cd
      Paolo Bonzini authored
      The vGPU folks would like to trap the first access to a BAR by setting
      vm_ops on the VMAs produced by mmap-ing a VFIO device.  The fault handler
      then can use remap_pfn_range to place some non-reserved pages in the VMA.
      This kind of VM_PFNMAP mapping is not handled by KVM, but follow_pfn
      and fixup_user_fault together help supporting it.  The patch also supports
      VM_MIXEDMAP vmas where the pfns are not reserved and thus subject to
      reference counting.
      Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Tested-by: default avatarNeo Jia <cjia@nvidia.com>
      Reported-by: default avatarKirti Wankhede <kwankhede@nvidia.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>