1. 12 Jul, 2019 1 commit
    • Christoph Hellwig's avatar
      mm: consolidate the get_user_pages* implementations · 050a9adc
      Christoph Hellwig authored
      Always build mm/gup.c so that we don't have to provide separate nommu
      stubs.  Also merge the get_user_pages_fast and __get_user_pages_fast stubs
      when HAVE_FAST_GUP into the main implementations, which will never call
      the fast path if HAVE_FAST_GUP is not set.
      This also ensures the new put_user_pages* helpers are available for nommu,
      as those are currently missing, which would create a problem as soon as we
      actually grew users for it.
      Link: http://lkml.kernel.org/r/20190625143715.1689-13-hch@lst.de
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: Khalid Aziz <khalid.aziz@oracle.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  2. 21 May, 2019 1 commit
  3. 14 May, 2019 1 commit
    • Souptick Joarder's avatar
      mm: introduce new vm_map_pages() and vm_map_pages_zero() API · a667d745
      Souptick Joarder authored
      Patch series "mm: Use vm_map_pages() and vm_map_pages_zero() API", v5.
      This patch (of 5):
      Previouly drivers have their own way of mapping range of kernel
      pages/memory into user vma and this was done by invoking vm_insert_page()
      within a loop.
      As this pattern is common across different drivers, it can be generalized
      by creating new functions and using them across the drivers.
      vm_map_pages() is the API which can be used to map kernel memory/pages in
      drivers which have considered vm_pgoff
      vm_map_pages_zero() is the API which can be used to map a range of kernel
      memory/pages in drivers which have not considered vm_pgoff.  vm_pgoff is
      passed as default 0 for those drivers.
      We _could_ then at a later "fix" these drivers which are using
      vm_map_pages_zero() to behave according to the normal vm_pgoff offsetting
      simply by removing the _zero suffix on the function name and if that
      causes regressions, it gives us an easy way to revert.
      Tested on Rockchip hardware and display is working, including talking to
      Lima via prime.
      Link: http://lkml.kernel.org/r/751cb8a0f4c3e67e95c58a3b072937617f338eea.1552921225.git.jrdr.linux@gmail.com
      Signed-off-by: default avatarSouptick Joarder <jrdr.linux@gmail.com>
      Suggested-by: default avatarRussell King <linux@armlinux.org.uk>
      Suggested-by: default avatarMatthew Wilcox <willy@infradead.org>
      Reviewed-by: default avatarMike Rapoport <rppt@linux.ibm.com>
      Tested-by: default avatarHeiko Stuebner <heiko@sntech.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Thierry Reding <treding@nvidia.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Cc: Sandy Huang <hjc@rock-chips.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Pawel Osciak <pawel@osciak.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  4. 26 Oct, 2018 1 commit
  5. 17 Aug, 2018 1 commit
  6. 27 Jul, 2018 1 commit
    • Kirill A. Shutemov's avatar
      mm: fix vma_is_anonymous() false-positives · bfd40eaf
      Kirill A. Shutemov authored
      vma_is_anonymous() relies on ->vm_ops being NULL to detect anonymous
      VMA.  This is unreliable as ->mmap may not set ->vm_ops.
      False-positive vma_is_anonymous() may lead to crashes:
      	next ffff8801ce5e7040 prev ffff8801d20eca50 mm ffff88019c1e13c0
      	prot 27 anon_vma ffff88019680cdd8 vm_ops 0000000000000000
      	pgoff 0 file ffff8801b2ec2d00 private_data 0000000000000000
      	flags: 0xff(read|write|exec|shared|mayread|maywrite|mayexec|mayshare)
      	------------[ cut here ]------------
      	kernel BUG at mm/memory.c:1422!
      	invalid opcode: 0000 [#1] SMP KASAN
      	CPU: 0 PID: 18486 Comm: syz-executor3 Not tainted 4.18.0-rc3+ #136
      	Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
      	RIP: 0010:zap_pmd_range mm/memory.c:1421 [inline]
      	RIP: 0010:zap_pud_range mm/memory.c:1466 [inline]
      	RIP: 0010:zap_p4d_range mm/memory.c:1487 [inline]
      	RIP: 0010:unmap_page_range+0x1c18/0x2220 mm/memory.c:1508
      	Call Trace:
      	 unmap_single_vma+0x1a0/0x310 mm/memory.c:1553
      	 zap_page_range_single+0x3cc/0x580 mm/memory.c:1644
      	 unmap_mapping_range_vma mm/memory.c:2792 [inline]
      	 unmap_mapping_range_tree mm/memory.c:2813 [inline]
      	 unmap_mapping_pages+0x3a7/0x5b0 mm/memory.c:2845
      	 unmap_mapping_range+0x48/0x60 mm/memory.c:2880
      	 truncate_pagecache+0x54/0x90 mm/truncate.c:800
      	 truncate_setsize+0x70/0xb0 mm/truncate.c:826
      	 simple_setattr+0xe9/0x110 fs/libfs.c:409
      	 notify_change+0xf13/0x10f0 fs/attr.c:335
      	 do_truncate+0x1ac/0x2b0 fs/open.c:63
      	 do_sys_ftruncate+0x492/0x560 fs/open.c:205
      	 __do_sys_ftruncate fs/open.c:215 [inline]
      	 __se_sys_ftruncate fs/open.c:213 [inline]
      	 __x64_sys_ftruncate+0x59/0x80 fs/open.c:213
      	 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
      	#include <stdio.h>
      	#include <stddef.h>
      	#include <stdint.h>
      	#include <stdlib.h>
      	#include <string.h>
      	#include <sys/types.h>
      	#include <sys/stat.h>
      	#include <sys/ioctl.h>
      	#include <sys/mman.h>
      	#include <unistd.h>
      	#include <fcntl.h>
      	#define KCOV_INIT_TRACE			_IOR('c', 1, unsigned long)
      	#define KCOV_ENABLE			_IO('c', 100)
      	#define KCOV_DISABLE			_IO('c', 101)
      	#define COVER_SIZE			(1024<<10)
      	#define KCOV_TRACE_PC  0
      	#define KCOV_TRACE_CMP 1
      	int main(int argc, char **argv)
      		int fd;
      		unsigned long *cover;
      		system("mount -t debugfs none /sys/kernel/debug");
      		fd = open("/sys/kernel/debug/kcov", O_RDWR);
      		ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE);
      		cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long),
      				PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
      		munmap(cover, COVER_SIZE * sizeof(unsigned long));
      		cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long),
      				PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
      		memset(cover, 0, COVER_SIZE * sizeof(unsigned long));
      		ftruncate(fd, 3UL << 20);
      		return 0;
      This can be fixed by assigning anonymous VMAs own vm_ops and not relying
      on it being NULL.
      If ->mmap() failed to set ->vm_ops, mmap_region() will set it to
      dummy_vm_ops.  This way we will have non-NULL ->vm_ops for all VMAs.
      Link: http://lkml.kernel.org/r/20180724121139.62570-4-kirill.shutemov@linux.intel.com
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reported-by: syzbot+3f84280d52be9b7083cc@syzkaller.appspotmail.com
      Acked-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  7. 21 Jul, 2018 3 commits
    • Linus Torvalds's avatar
      mm: make vm_area_alloc() initialize core fields · 490fc053
      Linus Torvalds authored
      Like vm_area_dup(), it initializes the anon_vma_chain head, and the
      basic mm pointer.
      The rest of the fields end up being different for different users,
      although the plan is to also initialize the 'vm_ops' field to a dummy
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    • Linus Torvalds's avatar
      mm: make vm_area_dup() actually copy the old vma data · 95faf699
      Linus Torvalds authored
      .. and re-initialize th eanon_vma_chain head.
      This removes some boiler-plate from the users, and also makes it clear
      why it didn't need use the 'zalloc()' version.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    • Linus Torvalds's avatar
      mm: use helper functions for allocating and freeing vm_area structs · 3928d4f5
      Linus Torvalds authored
      The vm_area_struct is one of the most fundamental memory management
      objects, but the management of it is entirely open-coded evertwhere,
      ranging from allocation and freeing (using kmem_cache_[z]alloc and
      kmem_cache_free) to initializing all the fields.
      We want to unify this in order to end up having some unified
      initialization of the vmas, and the first step to this is to at least
      have basic allocation functions.
      Right now those functions are literally just wrappers around the
      kmem_cache_*() calls.  This is a purely mechanical conversion:
          # new vma:
          kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL) -> vm_area_alloc()
          # copy old vma
          kmem_cache_alloc(vm_area_cachep, GFP_KERNEL) -> vm_area_dup(old)
          # free vma
          kmem_cache_free(vm_area_cachep, vma) -> vm_area_free(vma)
      to the point where the old vma passed in to the vm_area_dup() function
      isn't even used yet (because I've left all the old manual initialization
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  8. 08 Jun, 2018 1 commit
  9. 06 Apr, 2018 1 commit
  10. 02 Apr, 2018 1 commit
  11. 16 Mar, 2018 1 commit
    • Arnd Bergmann's avatar
      mm: remove blackfin MPU support · 1a842913
      Arnd Bergmann authored
      The CONFIG_MPU option was only defined on blackfin, and that architecture
      is now being removed, so the respective code can be simplified.
      A lot of other microcontrollers have an MPU, but I suspect that if we
      want to bring that support back, we'd do it differently anyway.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
  12. 07 Feb, 2018 1 commit
  13. 01 Feb, 2018 1 commit
  14. 07 Sep, 2017 1 commit
  15. 04 Sep, 2017 1 commit
  16. 09 May, 2017 2 commits
    • Michal Hocko's avatar
      mm, vmalloc: use __GFP_HIGHMEM implicitly · 19809c2d
      Michal Hocko authored
      __vmalloc* allows users to provide gfp flags for the underlying
      allocation.  This API is quite popular
        $ git grep "=[[:space:]]__vmalloc\|return[[:space:]]*__vmalloc" | wc -l
      The only problem is that many people are not aware that they really want
      to give __GFP_HIGHMEM along with other flags because there is really no
      reason to consume precious lowmemory on CONFIG_HIGHMEM systems for pages
      which are mapped to the kernel vmalloc space.  About half of users don't
      use this flag, though.  This signals that we make the API unnecessarily
      too complex.
      This patch simply uses __GFP_HIGHMEM implicitly when allocating pages to
      be mapped to the vmalloc space.  Current users which add __GFP_HIGHMEM
      are simplified and drop the flag.
      Link: http://lkml.kernel.org/r/20170307141020.29107-1-mhocko@kernel.org
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Reviewed-by: default avatarMatthew Wilcox <mawilcox@microsoft.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Cristopher Lameter <cl@linux.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    • Michal Hocko's avatar
      mm: introduce kv[mz]alloc helpers · a7c3e901
      Michal Hocko authored
      Patch series "kvmalloc", v5.
      There are many open coded kmalloc with vmalloc fallback instances in the
      tree.  Most of them are not careful enough or simply do not care about
      the underlying semantic of the kmalloc/page allocator which means that
      a) some vmalloc fallbacks are basically unreachable because the kmalloc
      part will keep retrying until it succeeds b) the page allocator can
      invoke a really disruptive steps like the OOM killer to move forward
      which doesn't sound appropriate when we consider that the vmalloc
      fallback is available.
      As it can be seen implementing kvmalloc requires quite an intimate
      knowledge if the page allocator and the memory reclaim internals which
      strongly suggests that a helper should be implemented in the memory
      subsystem proper.
      Most callers, I could find, have been converted to use the helper
      instead.  This is patch 6.  There are some more relying on __GFP_REPEAT
      in the networking stack which I have converted as well and Eric Dumazet
      was not opposed [2] to convert them as well.
      [1] http://lkml.kernel.org/r/20170130094940.13546-1-mhocko@kernel.org
      [2] http://lkml.kernel.org/r/1485273626.16328.301.camel@edumazet-glaptop3.roam.corp.google.com
      This patch (of 9):
      Using kmalloc with the vmalloc fallback for larger allocations is a
      common pattern in the kernel code.  Yet we do not have any common helper
      for that and so users have invented their own helpers.  Some of them are
      really creative when doing so.  Let's just add kv[mz]alloc and make sure
      it is implemented properly.  This implementation makes sure to not make
      a large memory pressure for > PAGE_SZE requests (__GFP_NORETRY) and also
      to not warn about allocation failures.  This also rules out the OOM
      killer as the vmalloc is a more approapriate fallback than a disruptive
      user visible action.
      This patch also changes some existing users and removes helpers which
      are specific for them.  In some cases this is not possible (e.g.
      ext4_kvmalloc, libcfs_kvzalloc) because those seems to be broken and
      require GFP_NO{FS,IO} context which is not vmalloc compatible in general
      (note that the page table allocation is GFP_KERNEL).  Those need to be
      fixed separately.
      While we are at it, document that __vmalloc{_node} about unsupported gfp
      mask because there seems to be a lot of confusion out there.
      kvmalloc_node will warn about GFP_KERNEL incompatible (which are not
      superset) flags to catch new abusers.  Existing ones would have to die
      [sfr@canb.auug.org.au: f2fs fixup]
        Link: http://lkml.kernel.org/r/20170320163735.332e64b7@canb.auug.org.au
      Link: http://lkml.kernel.org/r/20170306103032.2540-2-mhocko@kernel.org
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarStephen Rothwell <sfr@canb.auug.org.au>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>	[ext4 part]
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  17. 02 Mar, 2017 2 commits
    • Ingo Molnar's avatar
      sched/headers: Prepare for new header dependencies before moving code to <linux/sched/mm.h> · 6e84f315
      Ingo Molnar authored
      We are going to split <linux/sched/mm.h> out of <linux/sched.h>, which
      will have to be picked up from other headers and a couple of .c files.
      Create a trivial placeholder <linux/sched/mm.h> file that just
      maps to <linux/sched.h> to make this patch obviously correct and
      The APIs that are going to be moved first are:
      Include the new header in the files that are going to need it.
      Acked-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
    • Ingo Molnar's avatar
      mm/vmacache, sched/headers: Introduce 'struct vmacache' and move it from... · 314ff785
      Ingo Molnar authored
      mm/vmacache, sched/headers: Introduce 'struct vmacache' and move it from <linux/sched.h> to <linux/mm_types>
      The <linux/sched.h> header includes various vmacache related defines,
      which are arguably misplaced.
      Move them to mm_types.h and minimize the sched.h impact by putting
      all task vmacache state into a new 'struct vmacache' structure.
      No change in functionality.
      Acked-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
  18. 25 Feb, 2017 3 commits
  19. 23 Feb, 2017 1 commit
  20. 20 Feb, 2017 1 commit
  21. 24 Dec, 2016 1 commit
  22. 15 Dec, 2016 2 commits
  23. 22 Nov, 2016 1 commit
    • Eric W. Biederman's avatar
      ptrace: Don't allow accessing an undumpable mm · 84d77d3f
      Eric W. Biederman authored
      It is the reasonable expectation that if an executable file is not
      readable there will be no way for a user without special privileges to
      read the file.  This is enforced in ptrace_attach but if ptrace
      is already attached before exec there is no enforcement for read-only
      As the only way to read such an mm is through access_process_vm
      spin a variant called ptrace_access_vm that will fail if the
      target process is not being ptraced by the current process, or
      the current process did not have sufficient privileges when ptracing
      began to read the target processes mm.
      In the ptrace implementations replace access_process_vm by
      ptrace_access_vm.  There remain several ptrace sites that still use
      access_process_vm as they are reading the target executables
      instructions (for kernel consumption) or register stacks.  As such it
      does not appear necessary to add a permission check to those calls.
      This bug has always existed in Linux.
      Fixes: v1.0
      Cc: stable@vger.kernel.org
      Reported-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
  24. 10 Nov, 2016 1 commit
    • Catalin Marinas's avatar
      lkdtm: Do not use flush_icache_range() on user addresses · fcd35857
      Catalin Marinas authored
      The flush_icache_range() API is meant to be used on kernel addresses
      only as it may not have the infrastructure (exception entries) to handle
      user memory faults.
      The lkdtm execute_user_location() function tests the kernel execution of
      user space addresses by mmap'ing an anonymous page, copying some code
      together with cache maintenance and attempting to run it. However, the
      cache maintenance step may fail because of the incorrect API usage
      described above. The patch changes lkdtm to use access_process_vm() for
      copying the code into user space which would take care of the necessary
      cache maintenance.
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      [kees: export access_process_vm() for module use]
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
  25. 25 Oct, 2016 1 commit
    • Lorenzo Stoakes's avatar
      mm: unexport __get_user_pages() · 0d731759
      Lorenzo Stoakes authored
      This patch unexports the low-level __get_user_pages() function.
      Recent refactoring of the get_user_pages* functions allow flags to be
      passed through get_user_pages() which eliminates the need for access to
      this function from its one user, kvm.
      We can see that the two calls to get_user_pages() which replace
      __get_user_pages() in kvm_main.c are equivalent by examining their call
          get_user_pages(start, 1, flags, page, NULL)
          __get_user_pages_locked(current, current->mm, start, 1, page, NULL, NULL,
      			    false, flags | FOLL_TOUCH)
          __get_user_pages(current, current->mm, start, 1,
      		     flags | FOLL_TOUCH | FOLL_GET, page, NULL, NULL)
          get_user_pages(addr, 1, flags, NULL, NULL)
          __get_user_pages_locked(current, current->mm, addr, 1, NULL, NULL, NULL,
      			    false, flags | FOLL_TOUCH)
          __get_user_pages(current, current->mm, addr, 1, flags | FOLL_TOUCH, NULL,
      		     NULL, NULL)
      Signed-off-by: default avatarLorenzo Stoakes <lstoakes@gmail.com>
      Acked-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  26. 19 Oct, 2016 5 commits
  27. 18 Oct, 2016 2 commits
  28. 26 Jul, 2016 1 commit