linux

    Dongli Zhang authored and Jakub Kicinski committed
    An Ethernet driver may allocate an skb (and skb->data) via
    napi_alloc_skb(). This ends up in page_frag_alloc(), which allocates
    skb->data from page_frag_cache->va.
    
    Under memory pressure, page_frag_cache->va may be allocated from a
    pfmemalloc page. As a result, skb->pfmemalloc is true for every skb whose
    skb->data comes from that page_frag_cache->va. Such an skb is dropped if
    the sock (receiver) does not have SOCK_MEMALLOC. This is the expected
    behaviour under memory pressure.
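    
    The drop rule described above can be modelled in a few lines of
    userspace C. This is only a sketch: struct model_skb, struct model_sock
    and model_should_drop() are hypothetical stand-ins, not kernel types or
    functions, and the memalloc field merely models the SOCK_MEMALLOC flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; not the kernel's sk_buff or sock. */
struct model_skb {
    bool pfmemalloc;   /* models skb->pfmemalloc */
};

struct model_sock {
    bool memalloc;     /* models the SOCK_MEMALLOC flag on the receiver */
};

/*
 * Returns true when the receive path would drop the skb: the data came
 * from a pfmemalloc page and the receiving sock is not SOCK_MEMALLOC.
 */
static bool model_should_drop(const struct model_skb *skb,
                              const struct model_sock *sk)
{
    return skb->pfmemalloc && !sk->memalloc;
}
```

    Only the combination of a pfmemalloc skb and a sock without
    SOCK_MEMALLOC leads to a drop; every other combination is delivered.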
    
    However, once the kernel is no longer under memory pressure (suppose a
    large number of pages have just been reclaimed), page_frag_alloc() may
    still re-use the prior pfmemalloc page_frag_cache->va to allocate
    skb->data. As a result, skb->pfmemalloc stays true until
    page_frag_cache->va is re-allocated, even though the kernel is no longer
    under memory pressure.
    
    Here is how the kernel runs into the issue.
    
    1. The kernel is under memory pressure and the PAGE_FRAG_CACHE_MAX_ORDER
    allocation in __page_frag_cache_refill() fails. Instead, a pfmemalloc
    page is allocated for page_frag_cache->va.
    
    2. Every skb whose skb->data comes from page_frag_cache->va (pfmemalloc)
    has skb->pfmemalloc=true. Such an skb is always dropped by a sock without
    SOCK_MEMALLOC. This is the expected behaviour.
    
    3. Suppose a large number of pages are reclaimed and the kernel is no
    longer under memory pressure. We expect the skb->pfmemalloc drops to stop.
    
    4. Unfortunately, page_frag_alloc() does not proactively re-allocate
    page_frag_cache->va and always re-uses the prior pfmemalloc page. So
    skb->pfmemalloc stays true even though the kernel is no longer under
    memory pressure.
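    
    The four steps above can be replayed with a small userspace model. It is
    a simplified sketch, not kernel code: struct model_frag_cache,
    model_refill(), model_alloc_recycling() and MODEL_FRAGS_PER_PAGE are
    hypothetical names, the under_pressure argument stands in for the
    allocator's state, and the model assumes every earlier fragment has been
    released by the time the page is exhausted.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_FRAGS_PER_PAGE 4  /* hypothetical; chosen only for the demo */

/* Hypothetical model of the cache; not the layout of page_frag_cache. */
struct model_frag_cache {
    bool has_page;     /* models page_frag_cache->va being set */
    bool pfmemalloc;   /* recorded once, when the page is allocated */
    int  frags_left;   /* fragments still available in the current page */
};

/* Models the refill step: the page's origin is sampled here and only here. */
static void model_refill(struct model_frag_cache *nc, bool under_pressure)
{
    nc->has_page = true;
    nc->pfmemalloc = under_pressure;
    nc->frags_left = MODEL_FRAGS_PER_PAGE;
}

/*
 * Models the behaviour before the fix: an exhausted page is recycled in
 * place, so the pfmemalloc flag is never refreshed. Returns the pfmemalloc
 * state that an skb built on this fragment would inherit.
 */
static bool model_alloc_recycling(struct model_frag_cache *nc,
                                  bool under_pressure)
{
    if (!nc->has_page)
        model_refill(nc, under_pressure);
    else if (nc->frags_left == 0)
        nc->frags_left = MODEL_FRAGS_PER_PAGE;  /* recycle, flag untouched */

    nc->frags_left--;
    return nc->pfmemalloc;
}
```

    In this model, a first call made under pressure marks the page as
    pfmemalloc, and every later call keeps returning true regardless of
    under_pressure, because the recycle path never revisits the flag.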
    
    Fix this by freeing and re-allocating the page instead of recycling it.
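    
    The idea of the fix can be sketched in the same kind of userspace model.
    This is an illustration under stated assumptions, not the patch itself:
    all model_* names and MODEL_FRAGS_PER_PAGE are hypothetical, and
    "free and re-allocate" is reduced to calling the refill helper again
    with the current pressure state.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_FRAGS_PER_PAGE 4  /* hypothetical; chosen only for the demo */

/* Hypothetical model of the cache; not the layout of page_frag_cache. */
struct model_frag_cache {
    bool has_page;     /* models page_frag_cache->va being set */
    bool pfmemalloc;   /* recorded when the page is allocated */
    int  frags_left;   /* fragments still available in the current page */
};

/* Models a fresh page allocation under the current memory conditions. */
static void model_refill(struct model_frag_cache *nc, bool under_pressure)
{
    nc->has_page = true;
    nc->pfmemalloc = under_pressure;
    nc->frags_left = MODEL_FRAGS_PER_PAGE;
}

/*
 * Models the fixed behaviour: an exhausted pfmemalloc page is dropped and
 * replaced instead of being recycled; a normal page is still recycled.
 */
static bool model_alloc_fixed(struct model_frag_cache *nc, bool under_pressure)
{
    if (!nc->has_page) {
        model_refill(nc, under_pressure);
    } else if (nc->frags_left == 0) {
        if (nc->pfmemalloc)
            model_refill(nc, under_pressure);       /* free + re-allocate */
        else
            nc->frags_left = MODEL_FRAGS_PER_PAGE;  /* recycle as before */
    }

    nc->frags_left--;
    return nc->pfmemalloc;
}
```

    In this model the flag is only re-evaluated when the current page runs
    out, so fragments still left in the old pfmemalloc page remain flagged;
    the first allocation after exhaustion picks up the new state.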
    
    References: https://lore.kernel.org/lkml/20201103193239.1807-1-dongli.zhang@oracle.com/
    References: https://lore.kernel.org/linux-mm/20201105042140.5253-1-willy@infradead.org/
    
    
    Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
    Cc: Bert Barbe <bert.barbe@oracle.com>
    Cc: Rama Nichanamatlu <rama.nichanamatlu@oracle.com>
    Cc: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
    Cc: Manjunath Patil <manjunath.b.patil@oracle.com>
    Cc: Joe Jin <joe.jin@oracle.com>
    Cc: SRINIVAS <srinivas.eeda@oracle.com>
    Fixes: 79930f58 ("net: do not deplete pfmemalloc reserve")
    Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/20201115201029.11903-1-dongli.zhang@oracle.com
    
    
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    d8c19014