    mm/gup: factor out duplicate code from four routines · a43e9820
    John Hubbard authored
    Patch series "mm/gup: prereqs to track dma-pinned pages: FOLL_PIN", v12.
    
    Overview:
    
    This is a prerequisite to solving the problem of proper interactions
    between file-backed pages and [R]DMA activities, as discussed in [1],
    [2], [3], and in a remarkable number of email threads since about
    2017.  :)
    
    A new internal gup flag, FOLL_PIN, is introduced and thoroughly
    documented in the last patch's Documentation/vm/pin_user_pages.rst.
    
    I believe that this will provide a good starting point for doing the
    layout lease work that Ira Weiny has been working on.  That's because
    these new wrapper functions provide a clean, constrained, systematically
    named set of functionality that, again, is required in order to even
    know if a page is "dma-pinned".
    
    In contrast to earlier approaches, the page tracking can be
    incrementally applied to the kernel call sites that, until now, have
    been simply calling get_user_pages() ("gup").  In other words, opt-in by
    changing from this:
    
        get_user_pages() (sets FOLL_GET)
        put_page()
    
    to this:
    
        pin_user_pages() (sets FOLL_PIN)
        unpin_user_page()
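    
    As a rough illustration of the pairing rule above (a userspace
    sketch, not code from this series): every name prefixed with toy_ is
    hypothetical, and the separate pin counter is only an assumption made
    so the example can answer "is this page dma-pinned?". It does not
    describe how the kernel tracks pins.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct page; not the kernel type. */
struct toy_page {
	int refcount;	/* all references, however they were acquired */
	int pincount;	/* assumption: toy bookkeeping for FOLL_PIN users */
};

/* Old style acquire: models get_user_pages(), which sets FOLL_GET. */
static void toy_get_user_page(struct toy_page *page)
{
	page->refcount++;
}

/* Old style release: models put_page(). */
static void toy_put_page(struct toy_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/* New style acquire: models pin_user_pages(), which sets FOLL_PIN. */
static void toy_pin_user_page(struct toy_page *page)
{
	page->refcount++;
	page->pincount++;
}

/* New style release: models unpin_user_page(). */
static void toy_unpin_user_page(struct toy_page *page)
{
	assert(page->pincount > 0 && page->refcount > 0);
	page->pincount--;
	page->refcount--;
}

/* Only pages acquired through the pin path can be identified here. */
static bool toy_page_is_dma_pinned(const struct toy_page *page)
{
	return page->pincount > 0;
}
```

    In this model, a page taken with toy_get_user_page() is
    indistinguishable from any other referenced page, while one taken
    with toy_pin_user_page() reports as pinned until the matching
    toy_unpin_user_page() call.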
    
    Testing:
    
    * I've done some overall kernel testing (LTP, and a few other goodies),
      and some directed testing to exercise some of the changes. And as you
      can see, gup_benchmark is enhanced to exercise this. Basically, I've
      been able to runtime test the core get_user_pages() and
      pin_user_pages() and related routines, but not so much on several of
      the call sites--but those are generally just a couple of lines
      changed, each.
    
      Not much of the kernel is actually using this, which on one hand
      reduces risk quite a lot. But on the other hand, testing coverage
      is low. So I'd love it if, in particular, the Infiniband and PowerPC
      folks could do a smoke test of this series for me.
    
      Runtime testing for the call sites so far is pretty light:
    
        * io_uring: Some directed tests from liburing exercise this, and
                    they pass.
        * process_vm_access.c: A small directed test passes.
        * gup_benchmark: the enhanced version hits the new gup.c code, and
                         passes.
        * infiniband: Ran rdma-core tests: rdma-core/build/bin/run_tests.py
        * VFIO: compiles (I'm vowing to set up a run time test soon, but it's
                          not ready just yet)
        * powerpc: it compiles...
        * drm/via: compiles...
        * goldfish: compiles...
        * net/xdp: compiles...
        * media/v4l2: compiles...
    
    [1] Some slow progress on get_user_pages() (Apr 2, 2019): https://lwn.net/Articles/784574/
    [2] DMA and get_user_pages() (LPC: Dec 12, 2018): https://lwn.net/Articles/774411/
    [3] The trouble with get_user_pages() (Apr 30, 2018): https://lwn.net/Articles/753027/
    
    This patch (of 22):
    
    There are four locations in gup.c that have a fair amount of code
    duplication.  This means that changing one requires making the same
    changes in four places, not to mention reading the same code four times,
    and wondering if there are subtle differences.
    
    Factor out the common code into static functions, thus reducing the
    overall line count and the code's complexity.
    
    Also, take the opportunity to slightly improve the efficiency of the
    error cases, by doing a mass subtraction of the refcount, surrounded by
    get_page()/put_page().
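    
    The error-path idea can be sketched in userspace as follows. This is
    a model under stated assumptions, not the code in gup.c: struct
    toy_page and the toy_ helpers are hypothetical, and the counter is a
    plain int rather than an atomic. The assumption being illustrated is
    that a bulk subtraction only adjusts the count, whereas the final
    put is what runs the release logic, so a temporary extra reference
    keeps the release on that path even when every reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct page; not the kernel type. */
struct toy_page {
	int refcount;
	bool freed;	/* set when the release path has run */
};

/* Models get_page(): take one reference. */
static void toy_get_page(struct toy_page *page)
{
	page->refcount++;
}

/* Models a bulk subtraction: adjusts the count, never releases. */
static void toy_page_ref_sub(struct toy_page *page, int refs)
{
	page->refcount -= refs;
}

/* Models put_page(): drop one reference, release on reaching zero. */
static void toy_put_page(struct toy_page *page)
{
	if (--page->refcount == 0)
		page->freed = true;
}

/* Drop 'refs' references at once instead of looping over put_page(). */
static void toy_put_refs(struct toy_page *page, int refs)
{
	toy_get_page(page);		/* guard: covers refs == refcount */
	toy_page_ref_sub(page, refs);	/* one subtraction for all refs */
	toy_put_page(page);		/* drops the guard, may release */
}
```

    With a count of 5, toy_put_refs(page, 3) leaves 2 and releases
    nothing. With a count of 3, the same call reaches zero inside
    toy_put_page(), so the release still happens in the one place that
    knows how to do it.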
    
    Also, simplify slightly further by waiting until the successful end
    of each routine to increment *nr.
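    
    A minimal userspace sketch of that deferred increment, with
    hypothetical names (toy_page, toy_record_subpages, toy_gup_range) and
    a still_valid flag standing in for whatever recheck a real routine
    performs: the page pointers are written at pages + *nr, but *nr
    itself only moves once the routine knows it has succeeded, so a
    failure needs no rollback of the count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct page; not the kernel type. */
struct toy_page {
	int id;
};

/* Record 'count' consecutive pages into out[]; return how many. */
static int toy_record_subpages(struct toy_page *first, int count,
			       struct toy_page **out)
{
	int nr;

	for (nr = 0; nr < count; nr++)
		out[nr] = first + nr;

	return nr;
}

/* Returns 1 on success, 0 on failure; *nr advances only on success. */
static int toy_gup_range(struct toy_page *first, int count, bool still_valid,
			 struct toy_page **pages, int *nr)
{
	int refs = toy_record_subpages(first, count, pages + *nr);

	if (!still_valid)
		return 0;	/* *nr untouched, so nothing to undo */

	*nr += refs;		/* commit the count at the very end */
	return 1;
}
```

    On failure the slots past *nr hold stale pointers, but callers only
    look at the first *nr entries, and the next successful call simply
    overwrites them.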
    
    Link: http://lkml.kernel.org/r/20200107224558.2362728-2-jhubbard@nvidia.com
    
    Signed-off-by: John Hubbard <jhubbard@nvidia.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Cc: Kirill A. Shutemov <kirill@shutemov.name>
    Cc: Ira Weiny <ira.weiny@intel.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
    Cc: Alex Williamson <alex.williamson@redhat.com>
    Cc: Björn Töpel <bjorn.topel@intel.com>
    Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Cc: Jason Gunthorpe <jgg@mellanox.com>
    Cc: Jason Gunthorpe <jgg@ziepe.ca>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Leon Romanovsky <leonro@mellanox.com>
    Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
    Cc: Mike Rapoport <rppt@linux.ibm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>