Commit 553ec5a8 authored by Gabriel Krisman Bertazi, committed by Muhammad Usama Anjum

mm: Implement process_memwatch syscall


This syscall can be used to watch a process's memory and perform
atomic operations that aren't possible through procfs. Two operations
have been implemented: MEMWATCH_SD_GET_RANGE finds the soft-dirty
pages, and MEMWATCH_SD_GET_AND_CLEAR_RANGE finds them and clears
their soft-dirty bit as well.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changelog:
Changes in v6:
- Bring back hugepage split
- Add performance flag to ignore VM_SOFTDIRTY entirely

Changes in v5:
- Take correct locks for CLEAR operations
- Fix not flushing tlb in case of MEMWATCH_SD_CLEAR op
- Fix VMA splitting

Changes in v4:
- Add MEMWATCH_SD_CLEAR_RANGE op
- Remove VMA_SOFTDIRTY related bug
- Update interface
- WIP: Break huge pages to smaller pages when they are found

Changes in v3:
- Change the interface to return the number of dirty pages

Changes in v2:
- Do wiring of syscall for 32-bit userspace application
- Update data type of vec buffer to u64 explicitly

NAME
       process_memwatch - get process's memory information

SYNOPSIS
       #include <linux/memwatch.h>   /* Definition of MEMWATCH_* constants */

       long process_memwatch(int pidfd, unsigned long start, int len,
                             int op, void *vec, int vec_len);

       Note:  Glibc  does  not provide a wrapper for this system call;
       call it using syscall(2).
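
       Since no wrapper exists, a caller has to write one. The sketch
       below is one way to do that, under stated assumptions: the
       syscall number does not appear in this text, so the macro
       __NR_process_memwatch is expected to come from the patched
       kernel's headers or from the build command line. When it is
       missing, the wrapper fails with ENOSYS instead of guessing a
       number. The assert is a caller-side sanity check added here and
       is not part of the interface.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Thin wrapper around the raw system call.
 *
 * ASSUMPTION: __NR_process_memwatch is provided by the patched kernel's
 * headers or via -D__NR_process_memwatch=<number>. No number is given in
 * this document, so none is hard-coded here.
 */
static long process_memwatch(int pidfd, unsigned long start, int len,
                             int op, void *vec, int vec_len)
{
    /* Caller-side sanity check: a non-empty vec must point somewhere. */
    assert(vec != NULL || vec_len == 0);

#ifdef __NR_process_memwatch
    return syscall(__NR_process_memwatch, pidfd, start, len, op, vec, vec_len);
#else
    /* Syscall number unknown at build time: report "not implemented". */
    (void)pidfd; (void)start; (void)len; (void)op; (void)vec; (void)vec_len;
    errno = ENOSYS;
    return -1;
#endif
}
```

       On a kernel or build without the syscall number, every call
       returns -1 with errno set to ENOSYS, which lets callers fall
       back to another mechanism.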

DESCRIPTION
       The process_memwatch() system call is used to get information
       about the memory of a process.

   Arguments
       pidfd specifies the pid of the process whose memory is to be
       watched. The calling process must have ptrace capabilities over
       that process. pidfd can be zero, which means that the caller
       wants to watch its own memory. The operation is determined by
       op. Memory is watched starting at start and extending for len.

       vec is the output array in which the offsets of the pages are
       returned. Each offset is calculated from the start address. The
       caller tells the kernel the size of vec by passing it in
       vec_len. The system call returns when the whole range of len
       starting at start has been searched or when vec is completely
       filled.
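
       Because the call stops early once vec is full, a caller who
       wants a single pass over the range can size vec for the worst
       case, where every page is dirty. The helper below is a sketch
       of that arithmetic, not code from the patch. It assumes len is
       in bytes and that each entry is a 64-bit value (the v2
       changelog mentions a u64 vec buffer). Whether vec_len counts
       entries or bytes is not stated here, so both figures are
       computed. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of pages overlapped by the byte range [start, start + len). */
static size_t pages_in_range(uintptr_t start, size_t len, size_t page_size)
{
    assert(page_size > 0);
    if (len == 0)
        return 0;

    uintptr_t first = start / page_size;             /* index of first page */
    uintptr_t last = (start + len - 1) / page_size;  /* index of last page  */
    return (size_t)(last - first + 1);
}

/*
 * Bytes needed for a vec that can hold one entry per page in the range.
 * ASSUMPTION: one uint64_t entry per reported page.
 */
static size_t worst_case_vec_bytes(uintptr_t start, size_t len,
                                   size_t page_size)
{
    return pages_in_range(start, len, page_size) * sizeof(uint64_t);
}
```

       For example, with 4096-byte pages a 0x3000-byte range starting
       at 0x1000 covers 3 pages, so 3 entries or 24 bytes are enough.
       An unaligned range of 0x1000 bytes starting at 0x1800 straddles
       2 pages.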

   Operations
       The op argument specifies the operation to be performed. Only
       one operation should be specified at a time, apart from the
       optional MEMWATCH_SD_PERFORMANCE, which is combined with
       another op.

       MEMWATCH_SD_GET_RANGE
              Get the offsets of the pages that are soft dirty.

       MEMWATCH_SD_GET_AND_CLEAR_RANGE
              Get the offsets of the soft-dirty pages and clear their
              soft-dirty bit.

       MEMWATCH_SD_CLEAR_RANGE
              Clear the soft-dirty bit of the pages that are soft
              dirty.

       MEMWATCH_SD_PERFORMANCE
              This optional op can be specified in combination with
              the other ops. VM_SOFTDIRTY is ignored for the VMAs for
              performance reasons. With this flag, only pages that
              have been written to by the user are reported as dirty;
              new allocations are not returned as dirty.
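
       The combination rule above (one main op, optionally joined by
       MEMWATCH_SD_PERFORMANCE) can be checked before making the call.
       The sketch below assumes the ops are distinct bit flags; the
       EX_* names and values are hypothetical placeholders chosen for
       illustration only. The real definitions live in
       <linux/memwatch.h> and may differ.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * HYPOTHETICAL placeholder values, for illustration only. Real code
 * should use the MEMWATCH_* constants from <linux/memwatch.h>.
 */
enum {
    EX_SD_GET_RANGE           = 1 << 0,
    EX_SD_GET_AND_CLEAR_RANGE = 1 << 1,
    EX_SD_CLEAR_RANGE         = 1 << 2,
    EX_SD_PERFORMANCE         = 1 << 3,
};

/* True if op holds exactly one main op, with or without the optional flag. */
static bool ex_op_is_valid(int op)
{
    const int known = EX_SD_GET_RANGE | EX_SD_GET_AND_CLEAR_RANGE |
                      EX_SD_CLEAR_RANGE;
    int base = op & ~EX_SD_PERFORMANCE;   /* strip the optional flag */

    if (base == 0 || (base & ~known) != 0)
        return false;                     /* no main op, or unknown bits */
    return (base & (base - 1)) == 0;      /* exactly one bit set */
}

/* Build an op value from one main op and the optional performance flag. */
static int ex_make_op(int base, bool performance)
{
    assert(ex_op_is_valid(base));
    return performance ? (base | EX_SD_PERFORMANCE) : base;
}
```

       Under these placeholder values, the performance flag on its own
       is rejected, as is any mix of two main ops.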

RETURN VALUE
       On success, 0 is returned. A positive return value is the
       number of dirty pages filled in vec. On error (and assuming
       that process_memwatch() was invoked via syscall(2)), all
       operations return -1 and set errno to indicate the error.
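
       A caller therefore has three cases to tell apart. The helper
       below is a small sketch of that dispatch. It reads a return
       value of 0 as success with no entries written to vec, which is
       an interpretation of the text above rather than something it
       states outright. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum ex_outcome {
    EX_FAILED,    /* negative return: errno holds the reason        */
    EX_OK_EMPTY,  /* 0: success, no dirty-page entries in vec       */
    EX_OK_DIRTY   /* positive: that many entries were filled in vec */
};

/* Classify a process_memwatch() return value; *ndirty gets the count. */
static enum ex_outcome ex_classify(long ret, size_t *ndirty)
{
    assert(ndirty != NULL);
    *ndirty = 0;

    if (ret < 0)
        return EX_FAILED;
    if (ret == 0)
        return EX_OK_EMPTY;

    *ndirty = (size_t)ret;
    return EX_OK_DIRTY;
}
```

       Only the first *ndirty entries of vec are meaningful after an
       EX_OK_DIRTY result.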

ERRORS
       EINVAL Invalid arguments.

       ERANGE start or vec points to an invalid or inaccessible memory
              location.

       ESRCH  Cannot access the process.
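
       When reporting a failure, the generic strerror(3) text for
       these codes can mislead (ERANGE, for instance, normally reads
       as a numeric range error). The sketch below maps the three
       documented codes to the meanings listed above and defers to
       strerror(3) for anything else. The function name is
       hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Describe an errno value in terms of the ERRORS list for this call. */
static const char *ex_memwatch_strerror(int err)
{
    assert(err > 0);   /* errno values are positive */

    switch (err) {
    case EINVAL:
        return "invalid arguments";
    case ERANGE:
        return "start or vec points to an invalid or inaccessible "
               "memory location";
    case ESRCH:
        return "cannot access the process";
    default:
        return strerror(err);   /* not documented for this call */
    }
}
```
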
parent efe70864