Skip to content
  • Kirill A. Shutemov's avatar
    mm: rework mapcount accounting to enable 4k mapping of THPs · 53f9263b
    Kirill A. Shutemov authored
    
    
    We're going to allow mapping of individual 4k pages of THP compound.  It
    means we need to track mapcount on per small page basis.
    
    Straight-forward approach is to use ->_mapcount in all subpages to track
    how many time this subpage is mapped with PMDs or PTEs combined.  But
    this is rather expensive: mapping or unmapping of a THP page with PMD
    would require HPAGE_PMD_NR atomic operations instead of single we have
    now.
    
    The idea is to store separately how many times the page was mapped as
    whole -- compound_mapcount.  This frees up ->_mapcount in subpages to
    track PTE mapcount.
    
    We use the same approach as with compound page destructor and compound
    order to store compound_mapcount: use space in first tail page,
    ->mapping this time.
    
    Any time we map/unmap whole compound page (THP or hugetlb) -- we
    increment/decrement compound_mapcount.  When we map part of compound
    page with PTE we operate on ->_mapcount of the subpage.
    
    page_mapcount() counts both: PTE and PMD mappings of the page.
    
    Basically, we have mapcount for a subpage spread over two counters.  It
    makes tricky to detect when last mapcount for a page goes away.
    
    We introduced PageDoubleMap() for this.  When we split THP PMD for the
    first time and there's other PMD mapping left we offset up ->_mapcount
    in all subpages by one and set PG_double_map on the compound page.
    These additional references go away with last compound_mapcount.
    
    This approach provides a way to detect when last mapcount goes away on
    per small page basis without introducing new overhead for most common
    cases.
    
    [akpm@linux-foundation.org: fix typo in comment]
    [mhocko@suse.com: ignore partial THP when moving task]
    Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Tested-by: default avatarAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    Acked-by: default avatarJerome Marchand <jmarchan@redhat.com>
    Cc: Sasha Levin <sasha.levin@oracle.com>
    Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    Cc: Jerome Marchand <jmarchan@redhat.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Dave Hansen <dave.hansen@intel.com>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Cc: Steve Capper <steve.capper@linaro.org>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: David Rientjes <rientjes@google.com>
    Signed-off-by: default avatarMichal Hocko <mhocko@suse.com>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    53f9263b