Skip to content
Snippets Groups Projects
  1. Mar 06, 2019
  2. Jan 10, 2019
  3. Jan 09, 2019
    • Mike Kravetz's avatar
      hugetlbfs: revert "use i_mmap_rwsem for more pmd sharing synchronization" · ddeaab32
      Mike Kravetz authored
      This reverts b43a9990
      
      The reverted commit caused issues with migration and poisoning of anon
      huge pages.  The LTP move_pages12 test will cause an "unable to handle
      kernel NULL pointer" BUG would occur with stack similar to:
      
        RIP: 0010:down_write+0x1b/0x40
        Call Trace:
          migrate_pages+0x81f/0xb90
          __ia32_compat_sys_migrate_pages+0x190/0x190
          do_move_pages_to_node.isra.53.part.54+0x2a/0x50
          kernel_move_pages+0x566/0x7b0
          __x64_sys_move_pages+0x24/0x30
          do_syscall_64+0x5b/0x180
          entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The purpose of the reverted patch was to fix some long existing races
      with huge pmd sharing.  It used i_mmap_rwsem for this purpose with the
      idea that this could also be used to address truncate/page fault races
      with another patch.  Further analysis has determined that i_mmap_rwsem
      can not be used to address all these hugetlbfs synchronization issues.
      Therefore, revert this patch while working an another approach to the
      underlying issues.
      
      Link: http://lkml.kernel.org/r/20190103235452.29335-2-mike.kravetz@oracle.com
      
      
      Signed-off-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
      Reported-by: default avatarJan Stancek <jstancek@redhat.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Prakash Sangappa <prakash.sangappa@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ddeaab32
  4. Dec 28, 2018
  5. Nov 30, 2018
  6. Oct 05, 2018
    • Mike Kravetz's avatar
      mm: migration: fix migration of huge PMD shared pages · 017b1660
      Mike Kravetz authored
      The page migration code employs try_to_unmap() to try and unmap the source
      page.  This is accomplished by using rmap_walk to find all vmas where the
      page is mapped.  This search stops when page mapcount is zero.  For shared
      PMD huge pages, the page map count is always 1 no matter the number of
      mappings.  Shared mappings are tracked via the reference count of the PMD
      page.  Therefore, try_to_unmap stops prematurely and does not completely
      unmap all mappings of the source page.
      
      This problem can result is data corruption as writes to the original
      source page can happen after contents of the page are copied to the target
      page.  Hence, data is lost.
      
      This problem was originally seen as DB corruption of shared global areas
      after a huge page was soft offlined due to ECC memory errors.  DB
      developers noticed they could reproduce the issue by (hotplug) offlining
      memory used to back huge pages.  A simple testcase can reproduce the
      problem by creating a shared PMD mapping (note that this must be at least
      PUD_SIZE in size and PUD_SIZE aligned (1GB on x86)), and using
      migrate_pages() to migrate process pages between nodes while continually
      writing to the huge pages being migrated.
      
      To fix, have the try_to_unmap_one routine check for huge PMD sharing by
      calling huge_pmd_unshare for hugetlbfs huge pages.  If it is a shared
      mapping it will be 'unshared' which removes the page table entry and drops
      the reference on the PMD page.  After this, flush caches and TLB.
      
      mmu notifiers are called before locking page tables, but we can not be
      sure of PMD sharing until page tables are locked.  Therefore, check for
      the possibility of PMD sharing before locking so that notifiers can
      prepare for the worst possible case.
      
      Link: http://lkml.kernel.org/r/20180823205917.16297-2-mike.kravetz@oracle.com
      [mike.kravetz@oracle.com: make _range_in_vma() a static inline]
        Link: http://lkml.kernel.org/r/6063f215-a5c8-2f0c-465a-2c515ddc952d@oracle.com
      
      
      Fixes: 39dde65c ("shared page table for hugetlb page")
      Signed-off-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
      Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reviewed-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      017b1660
  7. Jul 14, 2018
  8. Apr 21, 2018
    • Naoya Horiguchi's avatar
      mm: enable thp migration for shmem thp · e71769ae
      Naoya Horiguchi authored
      My testing for the latest kernel supporting thp migration showed an
      infinite loop in offlining the memory block that is filled with shmem
      thps.  We can get out of the loop with a signal, but kernel should return
      with failure in this case.
      
      What happens in the loop is that scan_movable_pages() repeats returning
      the same pfn without any progress.  That's because page migration always
      fails for shmem thps.
      
      In memory offline code, memory blocks containing unmovable pages should be
      prevented from being offline targets by has_unmovable_pages() inside
      start_isolate_page_range().  So it's possible to change migratability for
      non-anonymous thps to avoid the issue, but it introduces more complex and
      thp-specific handling in migration code, so it might not good.
      
      So this patch is suggesting to fix the issue by enabling thp migration for
      shmem thp.  Both of anon/shmem thp are migratable so we don't need
      precheck about the type of thps.
      
      Link: http://lkml.kernel.org/r/20180406030706.GA2434@hori1.linux.bs1.fc.nec.co.jp
      
      
      Fixes: commit 72b39cfc ("mm, memory_hotplug: do not fail offlining too early")
      Signed-off-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Zi Yan <zi.yan@sent.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e71769ae
  9. Apr 16, 2018
  10. Apr 11, 2018
  11. Apr 06, 2018
  12. Mar 18, 2018
  13. Nov 16, 2017
    • Mel Gorman's avatar
      mm: remove cold parameter from free_hot_cold_page* · 2d4894b5
      Mel Gorman authored
      Most callers users of free_hot_cold_page claim the pages being released
      are cache hot.  The exception is the page reclaim paths where it is
      likely that enough pages will be freed in the near future that the
      per-cpu lists are going to be recycled and the cache hotness information
      is lost.  As no one really cares about the hotness of pages being
      released to the allocator, just ditch the parameter.
      
      The APIs are renamed to indicate that it's no longer about hot/cold
      pages.  It should also be less confusing as there are subtle differences
      between them.  __free_pages drops a reference and frees a page when the
      refcount reaches zero.  free_hot_cold_page handled pages whose refcount
      was already zero which is non-obvious from the name.  free_unref_page
      should be more obvious.
      
      No performance impact is expected as the overhead is marginal.  The
      parameter is removed simply because it is a bit stupid to have a useless
      parameter copied everywhere.
      
      [mgorman@techsingularity.net: add pages to head, not tail]
        Link: http://lkml.kernel.org/r/20171019154321.qtpzaeftoyyw4iey@techsingularity.net
      Link: http://lkml.kernel.org/r/20171018075952.10627-8-mgorman@techsingularity.net
      
      
      Signed-off-by: default avatarMel Gorman <mgorman@techsingularity.net>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      2d4894b5
    • Colin Ian King's avatar
      mm/rmap.c: remove redundant variable cend · cdb07bde
      Colin Ian King authored
      Variable cend is set but never read, hence it is redundant and can be
      removed.
      
      Cleans up clang build warning: Value stored to 'cend' is never read
      
      Link: http://lkml.kernel.org/r/20171011174942.1372-1-colin.king@canonical.com
      
      
      Fixes: 369ea824 ("mm/rmap: update to new mmu_notifier semantic v2")
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      cdb07bde
    • Jérôme Glisse's avatar
      mm/mmu_notifier: avoid double notification when it is useless · 0f10851e
      Jérôme Glisse authored
      This patch only affects users of mmu_notifier->invalidate_range callback
      which are device drivers related to ATS/PASID, CAPI, IOMMUv2, SVM ...
      and it is an optimization for those users.  Everyone else is unaffected
      by it.
      
      When clearing a pte/pmd we are given a choice to notify the event under
      the page table lock (notify version of *_clear_flush helpers do call the
      mmu_notifier_invalidate_range).  But that notification is not necessary
      in all cases.
      
      This patch removes almost all cases where it is useless to have a call
      to mmu_notifier_invalidate_range before
      mmu_notifier_invalidate_range_end.  It also adds documentation in all
      those cases explaining why.
      
      Below is a more in depth analysis of why this is fine to do this:
      
      For secondary TLB (non CPU TLB) like IOMMU TLB or device TLB (when
      device use thing like ATS/PASID to get the IOMMU to walk the CPU page
      table to access a process virtual address space).  There is only 2 cases
      when you need to notify those secondary TLB while holding page table
      lock when clearing a pte/pmd:
      
        A) page backing address is free before mmu_notifier_invalidate_range_end
        B) a page table entry is updated to point to a new page (COW, write fault
           on zero page, __replace_page(), ...)
      
      Case A is obvious you do not want to take the risk for the device to write
      to a page that might now be used by something completely different.
      
      Case B is more subtle. For correctness it requires the following sequence
      to happen:
        - take page table lock
        - clear page table entry and notify (pmd/pte_huge_clear_flush_notify())
        - set page table entry to point to new page
      
      If clearing the page table entry is not followed by a notify before setting
      the new pte/pmd value then you can break memory model like C11 or C++11 for
      the device.
      
      Consider the following scenario (device use a feature similar to ATS/
      PASID):
      
      Two address addrA and addrB such that |addrA - addrB| >= PAGE_SIZE we
      assume they are write protected for COW (other case of B apply too).
      
      [Time N] -----------------------------------------------------------------
      CPU-thread-0  {try to write to addrA}
      CPU-thread-1  {try to write to addrB}
      CPU-thread-2  {}
      CPU-thread-3  {}
      DEV-thread-0  {read addrA and populate device TLB}
      DEV-thread-2  {read addrB and populate device TLB}
      [Time N+1] ---------------------------------------------------------------
      CPU-thread-0  {COW_step0: {mmu_notifier_invalidate_range_start(addrA)}}
      CPU-thread-1  {COW_step0: {mmu_notifier_invalidate_range_start(addrB)}}
      CPU-thread-2  {}
      CPU-thread-3  {}
      DEV-thread-0  {}
      DEV-thread-2  {}
      [Time N+2] ---------------------------------------------------------------
      CPU-thread-0  {COW_step1: {update page table point to new page for addrA}}
      CPU-thread-1  {COW_step1: {update page table point to new page for addrB}}
      CPU-thread-2  {}
      CPU-thread-3  {}
      DEV-thread-0  {}
      DEV-thread-2  {}
      [Time N+3] ---------------------------------------------------------------
      CPU-thread-0  {preempted}
      CPU-thread-1  {preempted}
      CPU-thread-2  {write to addrA which is a write to new page}
      CPU-thread-3  {}
      DEV-thread-0  {}
      DEV-thread-2  {}
      [Time N+3] ---------------------------------------------------------------
      CPU-thread-0  {preempted}
      CPU-thread-1  {preempted}
      CPU-thread-2  {}
      CPU-thread-3  {write to addrB which is a write to new page}
      DEV-thread-0  {}
      DEV-thread-2  {}
      [Time N+4] ---------------------------------------------------------------
      CPU-thread-0  {preempted}
      CPU-thread-1  {COW_step3: {mmu_notifier_invalidate_range_end(addrB)}}
      CPU-thread-2  {}
      CPU-thread-3  {}
      DEV-thread-0  {}
      DEV-thread-2  {}
      [Time N+5] ---------------------------------------------------------------
      CPU-thread-0  {preempted}
      CPU-thread-1  {}
      CPU-thread-2  {}
      CPU-thread-3  {}
      DEV-thread-0  {read addrA from old page}
      DEV-thread-2  {read addrB from new page}
      
      So here because at time N+2 the clear page table entry was not pair with a
      notification to invalidate the secondary TLB, the device see the new value
      for addrB before seing the new value for addrA.  This break total memory
      ordering for the device.
      
      When changing a pte to write protect or to point to a new write protected
      page with same content (KSM) it is ok to delay invalidate_range callback
      to mmu_notifier_invalidate_range_end() outside the page table lock.  This
      is true even if the thread doing page table update is preempted right
      after releasing page table lock before calling
      mmu_notifier_invalidate_range_end
      
      Thanks to Andrea for thinking of a problematic scenario for COW.
      
      [jglisse@redhat.com: v2]
        Link: http://lkml.kernel.org/r/20171017031003.7481-2-jglisse@redhat.com
      Link: http://lkml.kernel.org/r/20170901173011.10745-1-jglisse@redhat.com
      
      
      Signed-off-by: default avatarJérôme Glisse <jglisse@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Alistair Popple <alistair@popple.id.au>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0f10851e
  14. Sep 09, 2017
  15. Aug 31, 2017
    • Jérôme Glisse's avatar
      mm/rmap: update to new mmu_notifier semantic v2 · 369ea824
      Jérôme Glisse authored
      
      Replace all mmu_notifier_invalidate_page() calls by *_invalidate_range()
      and make sure it is bracketed by calls to *_invalidate_range_start()/end().
      
      Note that because we can not presume the pmd value or pte value we have
      to assume the worst and unconditionaly report an invalidation as
      happening.
      
      Changed since v2:
        - try_to_unmap_one() only one call to mmu_notifier_invalidate_range()
        - compute end with PAGE_SIZE << compound_order(page)
        - fix PageHuge() case in try_to_unmap_one()
      
      Signed-off-by: default avatarJérôme Glisse <jglisse@redhat.com>
      Reviewed-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Bernhard Held <berny156@gmx.de>
      Cc: Adam Borowski <kilobyte@angband.pl>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Wanpeng Li <kernellwp@gmail.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Takashi Iwai <tiwai@suse.de>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: axie <axie@amd.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      369ea824
  16. Aug 29, 2017
    • Linus Torvalds's avatar
      Revert "rmap: do not call mmu_notifier_invalidate_page() under ptl" · 785373b4
      Linus Torvalds authored
      
      This reverts commit aac2fea9.
      
      It turns out that that patch was complete and utter garbage, and broke
      KVM, resulting in odd oopses.
      
      Quoting Andrea Arcangeli:
       "The aforementioned commit has 3 bugs.
      
        1) mmu_notifier_invalidate_range cannot be used in replacement of
           mmu_notifier_invalidate_range_start/end.
      
           For KVM mmu_notifier_invalidate_range is a noop and rightfully so.
      
           A MMU notifier implementation has to implement either
           ->invalidate_range method or the invalidate_range_start/end
           methods, not both. And if you implement invalidate_range_start/end
           like KVM is forced to do, calling mmu_notifier_invalidate_range in
           common code is a noop for KVM.
      
           For those MMU notifiers that can get away only implementing
           ->invalidate_range, the ->invalidate_range is implicitly called by
           mmu_notifier_invalidate_range_end(). And only those secondary MMUs
           that share the same pagetable with the primary MMU (like AMD
           iommuv2) can get away only implementing ->invalidate_range.
      
           So all cases (THP on/off) are broken right now.
      
           To fix this is enough to replace mmu_notifier_invalidate_range with
           mmu_notifier_invalidate_range_start;mmu_notifier_invalidate_range_end.
           Either that or call multiple mmu_notifier_invalidate_page like
           before.
      
        2) address + (1UL << compound_order(page) is buggy, it should be
           PAGE_SIZE << compound_order(page), it's bytes not pages, 2M not
           512.
      
        3) The whole invalidate_range thing was an attempt to call a single
           invalidate while walking multiple 4k ptes that maps the same THP
           (after a pmd virtual split without physical compound page THP
           split).
      
           It's unclear if the rmap_walk will always provide an address that
           is 2M aligned as parameter to try_to_unmap_one, in presence of THP.
           I think it needs also an address &= (PAGE_SIZE <<
           compound_order(page)) - 1 to be safe"
      
      In general, we should stop making excuses for horrible MMU notifier
      users.  It's much more important that the core VM is sane and safe, than
      letting MMU notifiers sleep.
      
      So if some MMU notifier is sleeping under a spinlock, we need to fix the
      notifier, not try to make excuses for that garbage in the core VM.
      
      Reported-and-tested-by: default avatarBernhard Held <berny156@gmx.de>
      Reported-and-tested-by: default avatarAdam Borowski <kilobyte@angband.pl>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Wanpeng Li <kernellwp@gmail.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Takashi Iwai <tiwai@suse.de>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Cc: axie <axie@amd.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      785373b4
  17. Aug 10, 2017
  18. Aug 02, 2017
    • Mel Gorman's avatar
      mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries · 3ea27719
      Mel Gorman authored
      Nadav Amit identified a theoritical race between page reclaim and
      mprotect due to TLB flushes being batched outside of the PTL being held.
      
      He described the race as follows:
      
              CPU0                            CPU1
              ----                            ----
                                              user accesses memory using RW PTE
                                              [PTE now cached in TLB]
              try_to_unmap_one()
              ==> ptep_get_and_clear()
              ==> set_tlb_ubc_flush_pending()
                                              mprotect(addr, PROT_READ)
                                              ==> change_pte_range()
                                              ==> [ PTE non-present - no flush ]
      
                                              user writes using cached RW PTE
              ...
      
              try_to_unmap_flush()
      
      The same type of race exists for reads when protecting for PROT_NONE and
      also exists for operations that can leave an old TLB entry behind such
      as munmap, mremap and madvise.
      
      For some operations like mprotect, it's not necessarily a data integrity
      issue but it is a correctness issue as there is a window where an
      mprotect that limits access still allows access.  For munmap, it's
      potentially a data integrity issue although the race is massive as an
      munmap, mmap and return to userspace must all complete between the
      window when reclaim drops the PTL and flushes the TLB.  However, it's
      theoritically possible so handle this issue by flushing the mm if
      reclaim is potentially currently batching TLB flushes.
      
      Other instances where a flush is required for a present pte should be ok
      as either the page lock is held preventing parallel reclaim or a page
      reference count is elevated preventing a parallel free leading to
      corruption.  In the case of page_mkclean there isn't an obvious path
      that userspace could take advantage of without using the operations that
      are guarded by this patch.  Other users such as gup as a race with
      reclaim looks just at PTEs.  huge page variants should be ok as they
      don't race with reclaim.  mincore only looks at PTEs.  userfault also
      should be ok as if a parallel reclaim takes place, it will either fault
      the page back in or read some of the data before the flush occurs
      triggering a fault.
      
      Note that a variant of this patch was acked by Andy Lutomirski but this
      was for the x86 parts on top of his PCID work which didn't make the 4.13
      merge window as expected.  His ack is dropped from this version and
      there will be a follow-on patch on top of PCID that will include his
      ack.
      
      [akpm@linux-foundation.org: tweak comments]
      [akpm@linux-foundation.org: fix spello]
      Link: http://lkml.kernel.org/r/20170717155523.emckq2esjro6hf3z@suse.de
      
      
      Reported-by: default avatarNadav Amit <nadav.amit@gmail.com>
      Signed-off-by: default avatarMel Gorman <mgorman@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: <stable@vger.kernel.org>	[v4.4+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3ea27719
  19. Jul 06, 2017
  20. May 24, 2017
    • Andy Lutomirski's avatar
      mm, x86/mm: Make the batched unmap TLB flush API more generic · e73ad5ff
      Andy Lutomirski authored
      
      try_to_unmap_flush() used to open-code a rather x86-centric flush
      sequence: local_flush_tlb() + flush_tlb_others().  Rearrange the
      code so that the arch (only x86 for now) provides
      arch_tlbbatch_add_mm() and arch_tlbbatch_flush() and the core code
      calls those functions instead.
      
      I'll want this for x86 because, to enable address space ids, I can't
      support the flush_tlb_others() mode used by exising
      try_to_unmap_flush() implementation with good performance.  I can
      support the new API fairly easily, though.
      
      I imagine that other architectures may be in a similar position.
      Architectures with strong remote flush primitives (arm64?) may have
      even worse performance problems with flush_tlb_others() the way that
      try_to_unmap_flush() uses it.
      
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/19f25a8581f9fb77876b7ff3b001f89835e34ea3.1495492063.git.luto@kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      e73ad5ff
  21. May 03, 2017
Loading