    mm: include CMA pages in lowmem_reserve at boot · e08d3fdf
    Doug Berger authored
    The lowmem_reserve arrays provide a means of applying pressure against
    allocations from lower zones that were targeted at higher zones.  Their
    values are a function of the number of pages managed by higher zones and
    are assigned by a call to the setup_per_zone_lowmem_reserve() function.
    
    The function is initially called at boot time by the function
    init_per_zone_wmark_min() and may be called later by accesses of the
    /proc/sys/vm/lowmem_reserve_ratio sysctl file.
    
    The function init_per_zone_wmark_min() was moved up from a module_init to
    a core_initcall to resolve a sequencing issue with khugepaged.
    Unfortunately this created a sequencing issue with CMA page accounting.
    
    The CMA pages are added to the managed page count of a zone when
    cma_init_reserved_areas() is called at boot also as a core_initcall.  This
    makes it uncertain whether the CMA pages will be added to the managed page
    counts of their zones before or after the call to
    init_per_zone_wmark_min() as it becomes dependent on link order.  With the
    current link order the pages are added to the managed count after the
    lowmem_reserve arrays are initialized at boot.
    
    This means the lowmem_reserve values at boot may be lower than the values
    used later if /proc/sys/vm/lowmem_reserve_ratio is accessed even if the
    ratio values are unchanged.
    
    In many cases the difference is not significant, but for example
    an ARM platform with 1GB of memory and the following memory layout
    
      cma: Reserved 256 MiB at 0x0000000030000000
      Zone ranges:
        DMA      [mem 0x0000000000000000-0x000000002fffffff]
        Normal   empty
        HighMem  [mem 0x0000000030000000-0x000000003fffffff]
    
    would result in 0 lowmem_reserve for the DMA zone.  This would allow
    userspace to deplete the DMA zone easily.
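
To make the arithmetic concrete, here is a minimal userspace sketch of how a reserve follows from the managed page counts of higher zones. It is a simplified model, not the kernel source: the struct and function names are hypothetical, and the 4 KiB page size and the ratio values {256, 32, 0} for DMA, Normal and HighMem are assumptions chosen for illustration.

```c
#include <stdio.h>

#define NR_ZONES 3	/* simplified zone list: DMA, Normal, HighMem */

/* Userspace model of one node; names are illustrative, not kernel API. */
struct node_model {
	unsigned long managed_pages[NR_ZONES];
	unsigned long ratio[NR_ZONES];	/* assumed lowmem_reserve_ratio values */
	unsigned long lowmem_reserve[NR_ZONES][NR_ZONES]; /* [zone][target] */
};

/*
 * For each target zone, every lower zone holds back
 * (pages managed by all zones above it, up to the target) / ratio[lower].
 */
static void model_setup_lowmem_reserve(struct node_model *n)
{
	for (int target = 0; target < NR_ZONES; target++) {
		unsigned long higher = n->managed_pages[target];

		n->lowmem_reserve[target][target] = 0;
		for (int lower = target - 1; lower >= 0; lower--) {
			n->lowmem_reserve[lower][target] =
				n->ratio[lower] ? higher / n->ratio[lower] : 0;
			higher += n->managed_pages[lower];
		}
	}
}

/* Reserve kept in the DMA zone against HighMem-targeted allocations. */
static unsigned long dma_reserve_vs_highmem(unsigned long highmem_managed)
{
	struct node_model n = {
		/* 768 MiB DMA zone, empty Normal zone (4 KiB pages assumed) */
		.managed_pages = { 196608, 0, highmem_managed },
		.ratio = { 256, 32, 0 },
	};

	model_setup_lowmem_reserve(&n);
	return n.lowmem_reserve[0][2];
}
```

In this model, dma_reserve_vs_highmem(0) stands for the boot-time state in which the 256 MiB of CMA pages in HighMem are not yet counted and yields 0, while dma_reserve_vs_highmem(65536) stands for the state after they are counted and yields 256 pages.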
    
    Funnily enough
    
      $ cat /proc/sys/vm/lowmem_reserve_ratio
    
    would fix up the situation because as a side effect it forces a call to
    setup_per_zone_lowmem_reserve().
    
    This commit breaks the link order dependency by invoking
    init_per_zone_wmark_min() as a postcore_initcall, so that the CMA pages
    are accounted in their zone(s) first and the lowmem_reserve arrays
    receive consistent values.
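
The ordering argument can be sketched with a small userspace model. It assumes only that initcall levels run in ascending order (core before postcore) and that order within a level follows link position. All names are hypothetical, and the figures (65536 CMA pages, a ratio of 256) are illustrative assumptions rather than values taken from the kernel.

```c
#include <stdlib.h>

/* Levels run in ascending numeric order, as kernel initcall levels do. */
enum level { LEVEL_CORE = 1, LEVEL_POSTCORE = 2 };

struct initcall {
	enum level level;
	int link_pos;		/* stand-in for position chosen by link order */
	void (*fn)(void);
};

static unsigned long highmem_managed;	/* modelled managed page count */
static unsigned long dma_reserve;	/* modelled lowmem_reserve entry */

/* Stand-in for cma_init_reserved_areas(): CMA pages become managed. */
static void model_cma_init(void) { highmem_managed += 65536; }

/* Stand-in for init_per_zone_wmark_min(): snapshot the reserve. */
static void model_wmark_init(void) { dma_reserve = highmem_managed / 256; }

static int by_level_then_link(const void *a, const void *b)
{
	const struct initcall *x = a, *y = b;

	if (x->level != y->level)
		return (int)x->level - (int)y->level;
	return x->link_pos - y->link_pos;
}

/* Run both calls in boot order and return the reserve seen at boot. */
static unsigned long boot_reserve(enum level wmark_level, int wmark_link_pos)
{
	struct initcall calls[] = {
		{ LEVEL_CORE, 1, model_cma_init },
		{ wmark_level, wmark_link_pos, model_wmark_init },
	};

	highmem_managed = 0;
	dma_reserve = 0;
	qsort(calls, 2, sizeof(calls[0]), by_level_then_link);
	for (size_t i = 0; i < 2; i++)
		calls[i].fn();
	return dma_reserve;
}
```

With both calls at the core level, boot_reserve(LEVEL_CORE, 0) gives 0 and boot_reserve(LEVEL_CORE, 2) gives 256, so the result depends on link position. With the watermark setup at the postcore level, both link positions give 256.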
    
    Fixes: bc22af74 ("mm: update min_free_kbytes from khugepaged after core initialization")
    Signed-off-by: Doug Berger <opendmb@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Jason Baron <jbaron@akamai.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
    Cc: <stable@vger.kernel.org>
    Link: http://lkml.kernel.org/r/1597423766-27849-1-git-send-email-opendmb@gmail.com
    
    
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>