Commit 72231b03 authored by Joonsoo Kim's avatar Joonsoo Kim Committed by Linus Torvalds
Browse files

mm, hugetlb: add VM_NORESERVE check in vma_has_reserves()

If we map the region with MAP_NORESERVE and MAP_SHARED, we can skip to
check reserve counting and eventually we cannot be ensured to allocate a
huge page in fault time.  With following example code, you can easily find
this situation.

Assume 2MB, nr_hugepages = 100

        fd = hugetlbfs_unlinked_fd();
        if (fd < 0)
                return 1;

        size = 200 * MB;
        flag = MAP_SHARED;
        p = mmap(NULL, size, PROT_READ|PROT_WRITE, flag, fd, 0);
        if (p == MAP_FAILED) {
                fprintf(stderr, "mmap() failed: %s\n", strerror(errno));
                return -1;

        size = 2 * MB;
        p = mmap(NULL, size, PROT_READ|PROT_WRITE, flag, -1, 0);
        if (p == MAP_FAILED) {
                fprintf(stderr, "mmap() failed: %s\n", strerror(errno));
        p[0] = '0';

During executing sleep(10), run 'cat /proc/meminfo' on another process.

HugePages_Free:       99
HugePages_Rsvd:      100

Number of free should be higher or equal than number of reserve, but this
aren't.  This represent that non reserved shared mapping steal a reserved
page.  Non reserved shared mapping should not eat into reserve space.

If we consider VM_NORESERVE in vma_has_reserve() and return 0 which mean
that we don't have reserved pages, then we check that we have enough free
pages in dequeue_huge_page_vma().  This prevent to steal a reserved page.

With this change, above test generate a SIGBUG which is correct, because
all free pages are reserved and non reserved shared mapping can't get a
free page.
Signed-off-by: default avatarJoonsoo Kim <>
Reviewed-by: default avatarWanpeng Li <>
Reviewed-by: default avatarAneesh Kumar K.V <>
Acked-by: default avatarHillf Danton <>
Acked-by: default avatarMichal Hocko <>
Cc: Naoya Horiguchi <>
Cc: Rik van Riel <>
Cc: Mel Gorman <>
Cc: "Aneesh Kumar K.V" <>
Cc: KAMEZAWA Hiroyuki <>
Cc: Hugh Dickins <>
Cc: Davidlohr Bueso <>
Cc: David Gibson <>
Signed-off-by: default avatarAndrew Morton <>
Signed-off-by: default avatarLinus Torvalds <>
parent 37a2140d
......@@ -464,6 +464,8 @@ void reset_vma_resv_huge_pages(struct vm_area_struct *vma)
/* Returns true if the VMA has associated reserve pages */
static int vma_has_reserves(struct vm_area_struct *vma)
if (vma->vm_flags & VM_NORESERVE)
return 0;
if (vma->vm_flags & VM_MAYSHARE)
return 1;
if (is_vma_resv_set(vma, HPAGE_RESV_OWNER))
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment