  1. May 25, 2023
      selftests: mm: add pagemap ioctl tests · f45c57ca
      Muhammad Usama Anjum authored
      
      Add pagemap ioctl tests. Add several different types of tests to verify
      the correctness of the interface.
      
      Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
      ---
      Changes in v16:
      - Added more tests, including a randomized test case to catch corner
        cases
      - Add reset by exclusive PM_SCAN_OP_WP as well
      
      Changes in v13:
      - Update tests and rebase Makefile
      
      Changes in v12:
      - Updates and add more memory type tests
      
      Changes in v11:
      - Rebase on top of next-20230216 and update tests
      
      Changes in v7:
      - Add and update all test cases
      
      Changes in v6:
      - Rename variables
      
      Changes in v4:
      - Updated all the tests to conform to new IOCTL
      
      Changes in v3:
      - Add another test to do sanity of flags
      
      Changes in v2:
      - Update the tests to use the ioctl interface instead of syscall
      ---
      TAP version 13
      1..92
      ok 1 sanity_tests_sd memory size must be valid
      ok 2 sanity_tests_sd output buffer must be specified
      ok 3 sanity_tests_sd output buffer size must be valid
      ok 4 sanity_tests_sd wrong flag specified
      ok 5 sanity_tests_sd flag has extra bits specified
      ok 6 sanity_tests_sd no selection mask is specified
      ok 7 sanity_tests_sd no return mask is specified
      ok 8 sanity_tests_sd wrong return mask specified
      ok 9 sanity_tests_sd mixture of correct and wrong flag
      ok 10 sanity_tests_sd PAGEMAP_BITS_ALL cannot be specified with PM_SCAN_OP_WP
      ok 11 sanity_tests_sd Clear area with larger vec size
      ok 12 sanity_tests_sd Repeated pattern of written and non-written pages
      ok 13 sanity_tests_sd Repeated pattern of written and non-written pages in parts
      ok 14 sanity_tests_sd Repeated pattern of written and non-written pages max_pages
      ok 15 sanity_tests_sd only get 2 written pages and clear them as well
      ok 16 sanity_tests_sd Two regions
      ok 17 sanity_tests_sd Smaller max_pages
      ok 18 Smaller vec 46 50
      ok 19 Page testing: all new pages must not be written (dirty)
      ok 20 Page testing: all pages must be written (dirty)
      ok 21 Page testing: all pages dirty other than first and the last one
      ok 22 Page testing: PM_SCAN_OP_WP
      ok 23 Page testing: only middle page dirty
      ok 24 Page testing: only two middle pages dirty
      ok 25 Large Page testing: all new pages must not be written (dirty)
      ok 26 Large Page testing: all pages must be written (dirty)
      ok 27 Large Page testing: all pages dirty other than first and the last one
      ok 28 Large Page testing: PM_SCAN_OP_WP
      ok 29 Large Page testing: only middle page dirty
      ok 30 Large Page testing: only two middle pages dirty
      ok 31 Huge page testing: all new pages must not be written (dirty)
      ok 32 Huge page testing: all pages must be written (dirty)
      ok 33 Huge page testing: all pages dirty other than first and the last one
      ok 34 Huge page testing: PM_SCAN_OP_WP
      ok 35 Huge page testing: only middle page dirty
      ok 36 Huge page testing: only two middle pages dirty
      ok 37 Hugetlb shmem testing: all new pages must not be written (dirty)
      ok 38 Hugetlb shmem testing: all pages must be written (dirty)
      ok 39 Hugetlb shmem testing: all pages dirty other than first and the last one
      ok 40 Hugetlb shmem testing: PM_SCAN_OP_WP
      ok 41 Hugetlb shmem testing: only middle page dirty
      ok 42 Hugetlb shmem testing: only two middle pages dirty
      ok 43 Hugetlb mem testing: all new pages must not be written (dirty)
      ok 44 Hugetlb mem testing: all pages must be written (dirty)
      ok 45 Hugetlb mem testing: all pages dirty other than first and the last one
      ok 46 Hugetlb mem testing: PM_SCAN_OP_WP
      ok 47 Hugetlb mem testing: only middle page dirty
      ok 48 Hugetlb mem testing: only two middle pages dirty
      ok 49 File memory testing: all new pages must not be written (dirty)
      ok 50 File memory testing: all pages must be written (dirty)
      ok 51 File memory testing: all pages dirty other than first and the last one
      ok 52 File memory testing: PM_SCAN_OP_WP
      ok 53 File memory testing: only middle page dirty
      ok 54 File memory testing: only two middle pages dirty
      ok 55 File anonymous memory testing: all new pages must not be written (dirty)
      ok 56 File anonymous memory testing: all pages must be written (dirty)
      ok 57 File anonymous memory testing: all pages dirty other than first and the last one
      ok 58 File anonymous memory testing: PM_SCAN_OP_WP
      ok 59 File anonymous memory testing: only middle page dirty
      ok 60 File anonymous memory testing: only two middle pages dirty
      ok 61 hpage_unit_tests all new huge page must not be written (dirty)
      ok 62 hpage_unit_tests all the huge page must not be written
      ok 63 hpage_unit_tests all the huge page must be written and clear
      ok 64 hpage_unit_tests only middle page written
      ok 65 hpage_unit_tests clear first half of huge page
      ok 66 hpage_unit_tests clear first half of huge page with limited buffer
      ok 67 hpage_unit_tests clear second half huge page
      ok 68 hpage_unit_tests get half huge page
      ok 69 hpage_unit_tests get half huge page
      ok 70 Test test_simple
      ok 71 mprotect_tests Both pages written
      ok 72 mprotect_tests Both pages are not written (dirty)
      ok 73 mprotect_tests Both pages written after remap and mprotect
      ok 74 mprotect_tests Clear and make the pages written
      ok 75 transact_test count 192
      ok 76 transact_test count 0
      ok 77 transact_test Extra pages 0 (0.0%), extra thread faults 0.
      ok 78 sanity_tests clear op can only be specified with PAGE_IS_WRITTEN
      ok 79 sanity_tests required_mask specified
      ok 80 sanity_tests anyof_mask specified
      ok 81 sanity_tests excluded_mask specified
      ok 82 sanity_tests required_mask and anyof_mask specified
      ok 83 sanity_tests Get sd and present pages with anyof_mask
      ok 84 sanity_tests Get all the pages with required_mask
      ok 85 sanity_tests Get sd and present pages with required_mask and anyof_mask
      ok 86 sanity_tests Don't get sd pages
      ok 87 sanity_tests Don't get present pages
      ok 88 sanity_tests Find written present pages with return mask
      ok 89 sanity_tests Memory mapped file
      ok 90 sanity_tests Read/write to private memory mapped file
      ok 91 unmapped_region_tests Get status of pages
      ok 92 userfaultfd_tests all new pages must not be written (dirty)
       # Totals: pass:92 fail:0 xfail:0 xpass:0 skip:0 error:0
      mm/pagemap: add documentation of PAGEMAP_SCAN IOCTL · c09e3d61
      Muhammad Usama Anjum authored
      
      Add an explanation of how to write-protect a memory range and how to
      find the pages in it that have been written to.
      
      Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
      ---
      Changes in v16:
      - Update the documentation
      
      Changes in v11:
      - Add more documentation
      tools headers UAPI: Update linux/fs.h with the kernel sources · 71817f1f
      Muhammad Usama Anjum authored
      
      A new IOCTL and macros have been added in the kernel sources. Update the
      tools header file as well.
      
      Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
      fs/proc/task_mmu: Implement IOCTL to get and optionally clear info about PTEs · 58f5a7e8
      Muhammad Usama Anjum authored
      
      The PAGEMAP_SCAN IOCTL on the pagemap file can be used to get and/or clear
      the info about page table entries. The following operations are supported
      in this ioctl:
      - Get the information if the pages have been written-to (PAGE_IS_WRITTEN),
        file mapped (PAGE_IS_FILE), present (PAGE_IS_PRESENT) or swapped
        (PAGE_IS_SWAPPED).
      - Find pages which have been written-to and/or write protect the pages
        (atomic PM_SCAN_OP_GET + PM_SCAN_OP_WP)
      
      This IOCTL can be extended to get information about more PTE bits.
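
The selection masks named in the test output (required_mask, anyof_mask, excluded_mask, return_mask) suggest a per-page filter. Below is a minimal userspace sketch of how such a filter could behave. It is not the kernel implementation: the `DEMO_` bit values, `struct demo_masks`, and both helper functions are hypothetical names invented for illustration, and the matching rules are an assumption inferred from the mask names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. The real values
 * live in the kernel's UAPI header and may differ. */
#define DEMO_PAGE_IS_WRITTEN (1ULL << 0)
#define DEMO_PAGE_IS_FILE    (1ULL << 1)
#define DEMO_PAGE_IS_PRESENT (1ULL << 2)
#define DEMO_PAGE_IS_SWAPPED (1ULL << 3)

/* Hypothetical container for the four masks (assumed semantics). */
struct demo_masks {
    uint64_t required_mask; /* every bit here must be set on the page */
    uint64_t anyof_mask;    /* if non-zero, at least one bit must be set */
    uint64_t excluded_mask; /* no bit here may be set on the page */
    uint64_t return_mask;   /* bits reported back for a matching page */
};

/* Return true if a page with the given category bits passes the filter. */
static bool demo_page_matches(uint64_t page_bits, const struct demo_masks *m)
{
    if ((page_bits & m->required_mask) != m->required_mask)
        return false;
    if (m->anyof_mask && !(page_bits & m->anyof_mask))
        return false;
    if (page_bits & m->excluded_mask)
        return false;
    return true;
}

/* Return the bits that would be reported, or 0 if the page is filtered out. */
static uint64_t demo_report(uint64_t page_bits, const struct demo_masks *m)
{
    return demo_page_matches(page_bits, m) ? (page_bits & m->return_mask) : 0;
}
```

For example, with required_mask set to the written bit and excluded_mask set to the file bit, a written anonymous page matches while a written file-backed page does not.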
      
      Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
      ---
      Changes in v16:
      - Fixed a corner case where kernel writes beyond user buffer by one
        element
      - Bring back exclusive PM_SCAN_OP_WP
      - Cosmetic changes
      
      Changes in v15:
      - Build fix:
        - Use generic tlb flush function in pagemap_scan_pmd_entry() instead of
          using x86 specific flush function in do_pagemap_scan()
        - Remove #ifdef from pagemap_scan_hugetlb_entry()
        - Use mm instead of undefined vma->vm_mm
      
      Changes in v14:
      - Fix build error caused by #ifdef added at last minute in some configs
      
      Changes in v13:
      - Review updates
      - mmap_read_lock_killable() instead of mmap_read_lock()
      - Replace uffd_wp_range() with helpers which increases performance
        drastically for OP_WP operations by reducing the number of tlb
        flushing etc
      - Add MMU_NOTIFY_PROTECTION_VMA notification for the memory range
      
      Changes in v12:
      - Add hugetlb support to cover all memory types
      - Merge "userfaultfd: Define dummy uffd_wp_range()" with this patch
      - Review updates to the code
      
      Changes in v11:
      - Find written pages in a better way
      - Fix a corner case (thanks Paul)
      - Improve the code/comments
      - remove ENGAGE_WP + ! GET operation
      - shorten the commit message in favour of moving documentation to
        pagemap.rst
      
      Changes in v10:
      - move changes in tools/include/uapi/linux/fs.h to separate patch
      - update commit message
      
      Change in v8:
      - Correct is_pte_uffd_wp()
      - Improve readability and error checks
      - Remove some un-needed code
      
      Changes in v7:
      - Rebase on top of latest next
      - Fix some corner cases
      - Base soft-dirty on the uffd wp async
      - Update the terminologies
      - Optimize the memory usage inside the ioctl
      
      Changes in v6:
      - Rename variables and update comments
      - Make IOCTL independent of soft_dirty config
      - Change masks and bitmap type to __u64
      - Improve code quality
      
      Changes in v5:
      - Remove tlb flushing even for clear operation
      
      Changes in v4:
      - Update the interface and implementation
      
      Changes in v3:
      - Tighten the user-kernel interface by using explicit types and add more
        error checking
      
      Changes in v2:
      - Convert the interface from syscall to ioctl
      - Remove pidfd support as it doesn't make sense in ioctl
  2. Apr 17, 2023
      userfaultfd: UFFD_FEATURE_WP_ASYNC · 7b77ae93
      Peter Xu authored and Muhammad Usama Anjum committed
      
      This patch adds a new userfaultfd-wp feature UFFD_FEATURE_WP_ASYNC, that
      allows userfaultfd wr-protect faults to be resolved by the kernel directly.
      
      It can be used like a high-accuracy version of soft-dirty, without vma
      modifications during tracking. It also supports ranges by default,
      rather than resetting the protections for a whole mm, thanks to the existing
      ioctl(UFFDIO_WRITEPROTECT).
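
The tracking idea can be pictured with a toy userspace model. This is a sketch under stated assumptions, not kernel code and not the userfaultfd API: `struct demo_range` and the `demo_*` helpers are hypothetical, and a per-page boolean stands in for the uffd-wp bit. Write-protecting a range starts or resets tracking; a write is resolved by dropping the protection, so "written since the last reset" is simply "no longer write-protected".

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NPAGES 8 /* hypothetical size of the tracked range, in pages */

/* Toy stand-in for the per-page uffd-wp bit. */
struct demo_range {
    bool wp[DEMO_NPAGES]; /* true: still write-protected, i.e. not written */
};

/* Models write-protecting [first, first + count): start or reset tracking. */
static void demo_wp_range(struct demo_range *r, size_t first, size_t count)
{
    for (size_t i = first; i < DEMO_NPAGES && count > 0; i++, count--)
        r->wp[i] = true;
}

/* Models a write: in async mode the fault is resolved without involving
 * userspace, which here just means clearing the protection bit. */
static void demo_write(struct demo_range *r, size_t page)
{
    if (page < DEMO_NPAGES)
        r->wp[page] = false;
}

/* A page counts as written if its protection has been dropped. */
static bool demo_is_written(const struct demo_range *r, size_t page)
{
    return page < DEMO_NPAGES && !r->wp[page];
}
```

Calling `demo_wp_range()` on a single page after a write resets only that page, which mirrors the ranged reset described above.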
      
      Several goals of such a dirty tracking interface:
      
      1. All types of memory should be supported and traceable. This is natural
         for soft-dirty but should be mentioned in the context of userfaultfd,
         because it used to support only anon/shmem/hugetlb. For dirty
         tracking these three types may not be enough, and it's legal to
         track anything, e.g. any page cache writes from mmap.
      
      2. Protections can be applied to part of a memory range, without vma
         split/merge fuss.  The hope is that the tracking itself should not
         cause any vma layout change.  It also helps when a reset happens because
         the reset will not need the mmap write lock, which can block the tracee.
      
      3. Accuracy needs to be maintained.  This means we need pte markers to work
         on any type of VMA.
      
      One could object that the whole concept of async dirty tracking is not
      really close to what userfaultfd fundamentally used to be: it's not "a
      fault to be serviced by userspace" anymore. However, using userfaultfd-wp
      here as a framework is convenient for us in at least these ways:
      
      1. VM_UFFD_WP vma flag, which has a very good name to suit something like
         this, so we don't need VM_YET_ANOTHER_SOFT_DIRTY. Just use a new
         feature bit to distinguish it from a sync version of uffd-wp registration.
      
      2. PTE markers logic can be leveraged across the whole kernel to maintain
         the uffd-wp bit as long as an arch supports it. This also applies to this
         case, where the uffd-wp bit will be a hint to dirty information and it will
         not get lost easily (e.g. when some page cache ptes got zapped).
      
      3. Reuse the ioctl(UFFDIO_WRITEPROTECT) interface for either starting or
         resetting tracking on a range of memory. There's no counterpart in the
         old soft-dirty world, so a new design would otherwise need a new
         interface.
      
      We can somehow understand that commonality because uffd-wp was
      fundamentally a similar idea of write-protecting pages just like
      soft-dirty.
      
      This implementation allows WP_ASYNC to imply WP_UNPOPULATED, because so far
      WP_ASYNC seems not to be usable without WP_UNPOPULATED.  This also gives us
      a chance to modify the implementation of WP_ASYNC in case it no longer
      depends on WP_UNPOPULATED in future kernels. It's also fine to imply that
      because both features will rely on the PTE_MARKER_UFFD_WP config option, so
      they'll show up together (or both be missing) in a UFFDIO_API probe.
      
      vma_can_userfault() now allows any VMA if the userfaultfd registration is
      only about async uffd-wp. So we can track dirty for all kinds of memory
      including generic file systems (like XFS, EXT4 or BTRFS).
      
      One trick worth mentioning in do_wp_page() is that we need to manually update
      vmf->orig_pte here because it can be used later with a pte_same() check -
      this path always has FAULT_FLAG_ORIG_PTE_VALID set in the flags.
      
      The major defect of this approach to dirty tracking is that we need to
      populate the pgtables when tracking starts. Soft-dirty doesn't do it like that.
      It's unwanted in the case where the range of memory to track is huge and
      unpopulated (e.g., tracking updates on a 10G file with mmap() on top,
      without having any page cache installed yet). One way to improve this is
      to allow pte markers to exist above the PTE level, i.e. at PMD and higher.
      That will not change the interface if implemented, so we can leave that
      for later.
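
To get a rough feel for that cost, the arithmetic can be sketched as follows. The helper names are hypothetical, and the figures rest on assumptions not stated in the commit message: "10G" is read as 10 GiB, pages are 4 KiB, each PTE takes 8 bytes (as on x86-64), and only last-level page tables are counted.

```c
#include <stdint.h>

/* Number of PTEs needed to map `bytes` using pages of `page_size` bytes,
 * rounding up to whole pages. */
static uint64_t demo_pte_count(uint64_t bytes, uint64_t page_size)
{
    return (bytes + page_size - 1) / page_size;
}

/* Memory taken by those PTEs alone, ignoring upper page-table levels. */
static uint64_t demo_pte_table_bytes(uint64_t bytes, uint64_t page_size,
                                     uint64_t pte_size)
{
    return demo_pte_count(bytes, page_size) * pte_size;
}
```

Under these assumptions, `demo_pte_count(10ULL << 30, 4096)` gives 2,621,440 entries and `demo_pte_table_bytes(10ULL << 30, 4096, 8)` gives 20,971,520 bytes, i.e. about 20 MiB of last-level page tables populated before a single page has been written.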
      
      Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
      Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      ---
      Changes in v12:
      - Peter added the hugetlb support and revamped some other implementation
      - Transferred the authorship to Peter
      - Merge documentation to this patch
      
      Changes in v11:
      - Fix return code in userfaultfd_register() and minor changes here and
        there
      - Rebase on top of next-20230307
      - Base patches on UFFD_FEATURE_WP_UNPOPULATED https://lore.kernel.org/all/20230306213925.617814-1-peterx@redhat.com
      - UFFD_FEATURE_WP_ASYNC depends on UFFD_FEATURE_WP_UNPOPULATED to work
        (correctly)
      
      Changes in v10:
      - Build fix
      - Update comments and add error condition to return error from uffd
        register if hugetlb pages are present when wp async flag is set
      
      Changes in v9:
      - Correct the fault resolution with code contributed by Peter
      
      Changes in v7:
      - Remove UFFDIO_WRITEPROTECT_MODE_ASYNC_WP and add UFFD_FEATURE_WP_ASYNC
      - Handle automatic page fault resolution in better way (thanks to Peter)
  3. Apr 14, 2023