    mm: support more pagesizes for MAP_HUGETLB/SHM_HUGETLB · 42d7395f
    Andi Kleen authored
    
    
    There was some desire in large applications using MAP_HUGETLB or
    SHM_HUGETLB to use 1GB huge pages on some mappings, and stay with 2MB on
    others.  This is useful together with NUMA policy: use 2MB interleaving
    on some mappings, but 1GB on local mappings.
    
    This patch extends the IPC/SHM syscall interfaces slightly to allow
    specifying the page size.
    
    It borrows some upper bits in the existing flag arguments and allows
    encoding the log of the desired page size in addition to the *_HUGETLB
    flag.  When 0 is specified the default size is used, which keeps the
    change fully backward compatible.
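
As a rough illustration of the encoding described above, the sketch below packs the log2 of the page size into the upper bits of a flags word. The shift (26) and the 6-bit mask are assumptions mirroring MAP_HUGE_SHIFT and MAP_HUGE_MASK in the Linux UAPI headers; the names `HUGE_SHIFT`, `HUGE_MASK`, `huge_encode` and `huge_decode` are hypothetical and exist only for this example.

```c
#include <assert.h>

/* Assumed values: these mirror MAP_HUGE_SHIFT / MAP_HUGE_MASK from the
 * Linux UAPI headers. Local names keep the sketch standalone. */
#define HUGE_SHIFT 26u
#define HUGE_MASK  0x3fu

/* Encode log2(page size) into the upper flag bits.
 * A value of 0 means "use the default huge page size". */
unsigned int huge_encode(unsigned int log2_size)
{
    assert(log2_size <= HUGE_MASK);   /* only 6 bits are available */
    return log2_size << HUGE_SHIFT;
}

/* Recover the log2 page size from a flags word; all other bits are ignored. */
unsigned int huge_decode(unsigned int flags)
{
    return (flags >> HUGE_SHIFT) & HUGE_MASK;
}
```

Under these assumptions a caller would OR the result into the existing flags, for example `MAP_HUGETLB | huge_encode(30)` to ask for 1GB pages, while leaving the field at 0 keeps the old behaviour.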
    
    Extending the internal hugetlb code to handle this is straightforward.
    Instead of a single mount it just keeps an array of them and selects the
    right mount based on the specified page size.  When no page size is
    specified it uses the mount of the default page size.
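
The selection logic described above could look roughly like the following userspace sketch. It is not the kernel code: `struct huge_mount`, the `mounts` table and `select_mount` are hypothetical stand-ins, and the two table entries (2MB and 1GB, with 2MB as the default) are assumed for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for one internal hugetlbfs mount. */
struct huge_mount {
    unsigned int page_shift;   /* log2 of the page size this mount serves */
    const char *label;         /* human-readable size, for illustration only */
};

/* Illustrative table: one entry per supported huge page size. */
static const struct huge_mount mounts[] = {
    { 21, "2MB" },
    { 30, "1GB" },
};

/* Assumed index of the default page size in the table above. */
#define DEFAULT_MOUNT 0

/* Pick the mount matching the requested log2 page size.
 * 0 selects the default; an unknown size yields NULL. */
const struct huge_mount *select_mount(unsigned int log2_size)
{
    if (log2_size == 0)
        return &mounts[DEFAULT_MOUNT];

    for (size_t i = 0; i < sizeof(mounts) / sizeof(mounts[0]); i++) {
        if (mounts[i].page_shift == log2_size)
            return &mounts[i];
    }
    return NULL;   /* unsupported size: the caller would report an error */
}
```

A linear scan is enough here because only a handful of huge page sizes exist on any one architecture.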
    
    The change is not visible in /proc/mounts because internal mounts don't
    appear there.  It also has very little overhead: the additional mounts
    just consume a super block, but not more memory when not used.
    
    I also exported the new flags to the user headers (they were previously
    under __KERNEL__).  Right now symbols for 1GB and 2MB are defined only
    for x86 and some other architectures.  The interface should already
    work for all other architectures though.  Only architectures that define
    multiple hugetlb sizes actually need it (that is currently x86, tile,
    powerpc).  However tile and powerpc have user configurable hugetlb
    sizes, so it's not easy to add defines.  A program on those
    architectures would need to query sysfs and use the appropriate log2.
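
A program that has to discover the sizes at run time might derive the log2 value along these lines. The sketch assumes the sysfs layout in which each supported size appears as a directory named `hugepages-<N>kB` under `/sys/kernel/mm/hugepages`; the helper name `hugepage_dir_to_log2` is hypothetical, and listing the directory itself is left out.

```c
#include <stdio.h>

/* Parse a directory name of the form "hugepages-<N>kB" and return
 * log2 of the page size in bytes, or -1 if the name does not match
 * or the size is not a power of two. */
int hugepage_dir_to_log2(const char *name)
{
    unsigned long kb = 0;
    int end = 0;

    /* %n records how much was consumed, so trailing junk is rejected. */
    if (sscanf(name, "hugepages-%lukB%n", &kb, &end) != 1)
        return -1;
    if (end == 0 || name[end] != '\0')
        return -1;
    if (kb == 0 || (kb & (kb - 1)) != 0)
        return -1;

    int log2_bytes = 10;          /* 1 kB = 2^10 bytes */
    while (kb > 1) {
        kb >>= 1;
        log2_bytes++;
    }
    return log2_bytes;
}
```

The returned value is what would be placed in the upper flag bits, so `hugepages-2048kB` corresponds to 21 and `hugepages-1048576kB` to 30.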
    
    [akpm@linux-foundation.org: cleanups]
    [rientjes@google.com: fix build]
    [akpm@linux-foundation.org: checkpatch fixes]
    Signed-off-by: Andi Kleen <ak@linux.intel.com>
    Cc: Michael Kerrisk <mtk.manpages@gmail.com>
    Acked-by: Rik van Riel <riel@redhat.com>
    Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
    Cc: Hillf Danton <dhillf@gmail.com>
    Signed-off-by: David Rientjes <rientjes@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>