Skip to content
  • Jann Horn's avatar
    sched/fair: Don't free p->numa_faults with concurrent readers · 16d51a59
    Jann Horn authored
    
    
    When going through execve(), zero out the NUMA fault statistics instead of
    freeing them.
    
    During execve, the task is reachable through procfs and the scheduler. A
    concurrent /proc/*/sched reader can read data from a freed ->numa_faults
    allocation (confirmed by KASAN) and write it back to userspace.
    I believe that it would also be possible for a use-after-free read to occur
    through a race between a NUMA fault and execve(): task_numa_fault() can
    lead to task_numa_compare(), which invokes task_weight() on the currently
    running task of a different CPU.
    
    Another way to fix this would be to make ->numa_faults RCU-managed or add
    extra locking, but it seems easier to wipe the NUMA fault statistics on
    execve.
    
    Signed-off-by: default avatarJann Horn <jannh@google.com>
    Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Petr Mladek <pmladek@suse.com>
    Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Will Deacon <will@kernel.org>
    Fixes: 82727018 ("sched/numa: Call task_numa_free() from do_execve()")
    Link: https://lkml.kernel.org/r/20190716152047.14424-1-jannh@google.com
    
    
    Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
    16d51a59