Skip to content
  • Joonsoo Kim's avatar
    mm, hugetlb: decrement reserve count if VM_NORESERVE alloc page cache · af0ed73e
    Joonsoo Kim authored
    
    
    If a vma with VM_NORESERVE allocate a new page for page cache, we should
    check whether this area is reserved or not.  If this address is already
    reserved by other process(in case of chg == 0), we should decrement
    reserve count, because this allocated page will go into page cache and
    currently, there is no way to know that this page comes from reserved pool
    or not when releasing inode.  This may introduce over-counting problem to
    reserved count.  With following example code, you can easily reproduce
    this situation.
    
    Assume 2MB, nr_hugepages = 100
    
            size = 20 * MB;
            flag = MAP_SHARED;
            p = mmap(NULL, size, PROT_READ|PROT_WRITE, flag, fd, 0);
            if (p == MAP_FAILED) {
                    fprintf(stderr, "mmap() failed: %s\n", strerror(errno));
                    return -1;
            }
    
            flag = MAP_SHARED | MAP_NORESERVE;
            q = mmap(NULL, size, PROT_READ|PROT_WRITE, flag, fd, 0);
            if (q == MAP_FAILED) {
                    fprintf(stderr, "mmap() failed: %s\n", strerror(errno));
            }
            q[0] = 'c';
    
    After finish the program, run 'cat /proc/meminfo'.  You can see below
    result.
    
    HugePages_Free:      100
    HugePages_Rsvd:        1
    
    To fix this, we should check our mapping type and tracked region.  If our
    mapping is VM_NORESERVE, VM_MAYSHARE and chg is 0, this imply that current
    allocated page will go into page cache which is already reserved region
    when mapping is created.  In this case, we should decrease reserve count.
    As implementing above, this patch solve the problem.
    
    [akpm@linux-foundation.org: fix spelling in comment]
    Signed-off-by: default avatarJoonsoo Kim <iamjoonsoo.kim@lge.com>
    Reviewed-by: default avatarWanpeng Li <liwanp@linux.vnet.ibm.com>
    Reviewed-by: default avatarAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    Acked-by: default avatarHillf Danton <dhillf@gmail.com>
    Acked-by: default avatarMichal Hocko <mhocko@suse.cz>
    Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
    Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
    Cc: David Gibson <david@gibson.dropbear.id.au>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    af0ed73e