    mm: introduce a general RCU get_user_pages_fast() · 2667f50e
    Steve Capper authored
    
    
    This series implements general forms of get_user_pages_fast and
    __get_user_pages_fast in core code and activates them for arm and arm64.
    
    These are required for Transparent HugePages to function correctly, as a
    futex on a THP tail will otherwise result in an infinite loop (due to the
    core implementation of __get_user_pages_fast always returning 0).
    
    Unfortunately, a futex on THP tail can be quite common for certain
    workloads; thus THP is unreliable without a __get_user_pages_fast
    implementation.
    
    This series may also be beneficial for direct-IO heavy workloads and
    certain KVM workloads.
    
    This patch (of 6):
    
    get_user_pages_fast() attempts to pin user pages by walking the page
    tables directly and avoids taking locks.  Thus the walker needs to be
    protected from page table pages being freed from under it, and needs to
    block any THP splits.
    
    One way to achieve this is to have the walker disable interrupts, and rely
    on IPIs from the TLB flushing code blocking before the page table pages
    are freed.
    
    On some platforms we have hardware broadcast of TLB invalidations, thus
    the TLB flushing code doesn't necessarily need to broadcast IPIs; and
    spuriously broadcasting IPIs can hurt system performance if done too
    often.
    
    This problem has been solved on PowerPC and Sparc by batching up page
    table pages belonging to more than one mm_user, then scheduling an
    rcu_sched callback to free the pages.  This RCU page table free logic has
    been promoted to core code and is activated when one enables
    HAVE_RCU_TABLE_FREE.  Unfortunately, these architectures implement their
    own get_user_pages_fast routines.
    
    The RCU page table free logic, coupled with an IPI broadcast on THP split
    (which is a rare event), allows one to protect a page table walker by
    merely disabling interrupts during the walk.
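
    As a rough illustration of the walk described above, here is a toy
    userspace model, not the code added by this patch.  Every toy_* name is
    hypothetical, interrupt disabling is simulated with a flag, and the page
    table is reduced to two levels.  It only shows the shape of the fast
    path: disable interrupts, walk without taking locks, take a reference on
    each page found, and stop at the first entry that cannot be handled so
    the caller can fall back to the slow path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; none of these names are kernel APIs. */
#define TOY_PTRS_PER_TABLE 4

struct toy_page { int refcount; };

struct toy_pte {
    struct toy_page *page;
    bool present;
    bool writable;
};

struct toy_pmd { struct toy_pte *table; };  /* NULL when no PTE table exists */

struct toy_mm { struct toy_pmd pmd[TOY_PTRS_PER_TABLE]; };

/* Stand-in for the CPU's interrupt state. */
static bool toy_irqs_disabled;

static void toy_local_irq_disable(void) { toy_irqs_disabled = true; }
static void toy_local_irq_enable(void)  { toy_irqs_disabled = false; }

/* Find the PTE for a page index; only legal while "interrupts" are off. */
static struct toy_pte *toy_lookup_pte(struct toy_mm *mm, size_t idx)
{
    size_t pmd_idx = idx / TOY_PTRS_PER_TABLE;

    assert(toy_irqs_disabled);
    if (pmd_idx >= TOY_PTRS_PER_TABLE)
        return NULL;

    struct toy_pte *table = mm->pmd[pmd_idx].table;
    return table ? &table[idx % TOY_PTRS_PER_TABLE] : NULL;
}

/*
 * Pin up to nr pages starting at page index start without taking locks.
 * Returns the number of pages pinned, which may be fewer than nr.
 */
static int toy_gup_fast(struct toy_mm *mm, size_t start, int nr, bool write,
                        struct toy_page **pages)
{
    int pinned = 0;

    toy_local_irq_disable();
    while (pinned < nr) {
        struct toy_pte *ptep = toy_lookup_pte(mm, start + pinned);
        if (!ptep)
            break;

        struct toy_pte pte = *ptep;  /* work on a snapshot of the entry */
        if (!pte.present || (write && !pte.writable))
            break;

        pte.page->refcount++;        /* pin the page */
        pages[pinned++] = pte.page;
    }
    toy_local_irq_enable();

    return pinned;
}
```

    A short return value is not an error in this model; it simply tells the
    caller how many pages were pinned before the walk gave up.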
    
    This patch provides a general RCU implementation of get_user_pages_fast
    that can be used by architectures that perform hardware broadcast of TLB
    invalidations.
    
    It is based heavily on the PowerPC implementation by Nick Piggin.
    
    [akpm@linux-foundation.org: various comment fixes]
    Signed-off-by: Steve Capper <steve.capper@linaro.org>
    Tested-by: Dann Frazier <dann.frazier@canonical.com>
    Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
    Acked-by: Hugh Dickins <hughd@google.com>
    Cc: Russell King <rmk@arm.linux.org.uk>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Mel Gorman <mel@csn.ul.ie>
    Cc: Will Deacon <will.deacon@arm.com>
    Cc: Christoffer Dall <christoffer.dall@linaro.org>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>