  1. Mar 12, 2019
  2. Mar 09, 2019
    • workqueue, lockdep: Fix a memory leak in wq->lock_name · 69a106c0
      Qian Cai authored
      
      The following commit:
      
        669de8bd ("kernel/workqueue: Use dynamic lockdep keys for workqueues")
      
      introduced a memory leak as wq_free_lockdep() calls kfree(wq->lock_name),
      but wq_init_lockdep() does not point wq->lock_name to the newly allocated
      slab object.
      
      This can be reproduced by running LTP fallocate04 followed by oom01 tests:
      
       unreferenced object 0xc0000005876384d8 (size 64):
        comm "fallocate04", pid 26972, jiffies 4297139141 (age 40370.480s)
        hex dump (first 32 bytes):
          28 77 71 5f 63 6f 6d 70 6c 65 74 69 6f 6e 29 65  (wq_completion)e
          78 74 34 2d 72 73 76 2d 63 6f 6e 76 65 72 73 69  xt4-rsv-conversi
        backtrace:
          [<00000000cb452883>] kvasprintf+0x6c/0xe0
          [<000000004654ddac>] kasprintf+0x34/0x60
          [<000000001c68f311>] alloc_workqueue+0x1f8/0x6ac
          [<0000000003c2ad83>] ext4_fill_super+0x23d4/0x3c80 [ext4]
          [<0000000006610538>] mount_bdev+0x25c/0x290
          [<00000000bcf955ec>] ext4_mount+0x28/0x50 [ext4]
          [<0000000016e08fd3>] legacy_get_tree+0x4c/0xb0
          [<0000000042b6a5fc>] vfs_get_tree+0x6c/0x190
          [<00000000268ab022>] do_mount+0xb9c/0x1100
          [<00000000698e6898>] ksys_mount+0x158/0x180
          [<0000000064e391fd>] sys_mount+0x20/0x30
          [<00000000ba378f12>] system_call+0x5c/0x70
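
      The leak pattern described above can be sketched in user-space C. This
      is only an illustration under stated assumptions, not the kernel code:
      struct demo_wq and the demo_* functions are hypothetical stand-ins, and
      malloc()/free() replace kasprintf()/kfree().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the workqueue; not the kernel structure. */
struct demo_wq {
    char name[24];
    char *lock_name;   /* the pointer that the free path releases */
};

/* Format "(wq_completion)<name>" into a fresh heap buffer. */
static char *demo_make_lock_name(const char *name)
{
    size_t len = strlen("(wq_completion)") + strlen(name) + 1;
    char *buf = malloc(len);

    if (buf)
        snprintf(buf, len, "(wq_completion)%s", name);
    return buf;
}

/*
 * Buggy init: the buffer is allocated but never stored in the struct,
 * so demo_wq_free() has nothing to release. The pointer is returned
 * only so that this demo can free it and stay leak-free itself.
 */
static char *demo_wq_init_buggy(struct demo_wq *wq)
{
    return demo_make_lock_name(wq->name);
}

/* Fixed init: the struct owns the allocation. */
static void demo_wq_init_fixed(struct demo_wq *wq)
{
    wq->lock_name = demo_make_lock_name(wq->name);
}

static void demo_wq_free(struct demo_wq *wq)
{
    free(wq->lock_name);   /* free(NULL) is a no-op */
    wq->lock_name = NULL;
}
```

      With the buggy init, the free path sees a NULL pointer and the buffer
      stays allocated; with the fixed init, the same free path releases it.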
      
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: catalin.marinas@arm.com
      Cc: jiangshanlai@gmail.com
      Cc: tj@kernel.org
      Fixes: 669de8bd ("kernel/workqueue: Use dynamic lockdep keys for workqueues")
      Link: https://lkml.kernel.org/r/20190307002731.47371-1-cai@lca.pw
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • workqueue, lockdep: Fix an alloc_workqueue() error path · 009bb421
      Bart Van Assche authored
      
      This patch fixes a use-after-free and a memory leak in an alloc_workqueue()
      error path.
      
      Reported by syzkaller and KASAN:
      
        BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:197 [inline]
        BUG: KASAN: use-after-free in lockdep_register_key+0x3b9/0x490 kernel/locking/lockdep.c:1023
        Read of size 8 at addr ffff888090fc2698 by task syz-executor134/7858
      
        CPU: 1 PID: 7858 Comm: syz-executor134 Not tainted 5.0.0-rc8-next-20190301 #1
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
        Call Trace:
         __dump_stack lib/dump_stack.c:77 [inline]
         dump_stack+0x172/0x1f0 lib/dump_stack.c:113
         print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
         kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
         __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132
         __read_once_size include/linux/compiler.h:197 [inline]
         lockdep_register_key+0x3b9/0x490 kernel/locking/lockdep.c:1023
         wq_init_lockdep kernel/workqueue.c:3444 [inline]
         alloc_workqueue+0x427/0xe70 kernel/workqueue.c:4263
         ucma_open+0x76/0x290 drivers/infiniband/core/ucma.c:1732
         misc_open+0x398/0x4c0 drivers/char/misc.c:141
         chrdev_open+0x247/0x6b0 fs/char_dev.c:417
         do_dentry_open+0x488/0x1160 fs/open.c:771
         vfs_open+0xa0/0xd0 fs/open.c:880
         do_last fs/namei.c:3416 [inline]
         path_openat+0x10e9/0x46e0 fs/namei.c:3533
         do_filp_open+0x1a1/0x280 fs/namei.c:3563
         do_sys_open+0x3fe/0x5d0 fs/open.c:1063
         __do_sys_openat fs/open.c:1090 [inline]
         __se_sys_openat fs/open.c:1084 [inline]
         __x64_sys_openat+0x9d/0x100 fs/open.c:1084
         do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
        Allocated by task 7789:
         save_stack+0x45/0xd0 mm/kasan/common.c:75
         set_track mm/kasan/common.c:87 [inline]
         __kasan_kmalloc mm/kasan/common.c:497 [inline]
         __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:470
         kasan_kmalloc+0x9/0x10 mm/kasan/common.c:511
         __do_kmalloc mm/slab.c:3726 [inline]
         __kmalloc+0x15c/0x740 mm/slab.c:3735
         kmalloc include/linux/slab.h:553 [inline]
         kzalloc include/linux/slab.h:743 [inline]
         alloc_workqueue+0x13c/0xe70 kernel/workqueue.c:4236
         ucma_open+0x76/0x290 drivers/infiniband/core/ucma.c:1732
         misc_open+0x398/0x4c0 drivers/char/misc.c:141
         chrdev_open+0x247/0x6b0 fs/char_dev.c:417
         do_dentry_open+0x488/0x1160 fs/open.c:771
         vfs_open+0xa0/0xd0 fs/open.c:880
         do_last fs/namei.c:3416 [inline]
         path_openat+0x10e9/0x46e0 fs/namei.c:3533
         do_filp_open+0x1a1/0x280 fs/namei.c:3563
         do_sys_open+0x3fe/0x5d0 fs/open.c:1063
         __do_sys_openat fs/open.c:1090 [inline]
         __se_sys_openat fs/open.c:1084 [inline]
         __x64_sys_openat+0x9d/0x100 fs/open.c:1084
         do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
        Freed by task 7789:
         save_stack+0x45/0xd0 mm/kasan/common.c:75
         set_track mm/kasan/common.c:87 [inline]
         __kasan_slab_free+0x102/0x150 mm/kasan/common.c:459
         kasan_slab_free+0xe/0x10 mm/kasan/common.c:467
         __cache_free mm/slab.c:3498 [inline]
         kfree+0xcf/0x230 mm/slab.c:3821
         alloc_workqueue+0xc3e/0xe70 kernel/workqueue.c:4295
         ucma_open+0x76/0x290 drivers/infiniband/core/ucma.c:1732
         misc_open+0x398/0x4c0 drivers/char/misc.c:141
         chrdev_open+0x247/0x6b0 fs/char_dev.c:417
         do_dentry_open+0x488/0x1160 fs/open.c:771
         vfs_open+0xa0/0xd0 fs/open.c:880
         do_last fs/namei.c:3416 [inline]
         path_openat+0x10e9/0x46e0 fs/namei.c:3533
         do_filp_open+0x1a1/0x280 fs/namei.c:3563
         do_sys_open+0x3fe/0x5d0 fs/open.c:1063
         __do_sys_openat fs/open.c:1090 [inline]
         __se_sys_openat fs/open.c:1084 [inline]
         __x64_sys_openat+0x9d/0x100 fs/open.c:1084
         do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
        The buggy address belongs to the object at ffff888090fc2580
         which belongs to the cache kmalloc-512 of size 512
        The buggy address is located 280 bytes inside of
         512-byte region [ffff888090fc2580, ffff888090fc2780)
      
      Reported-by: <syzbot+17335689e239ce135d8b@syzkaller.appspotmail.com>
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Fixes: 669de8bd ("kernel/workqueue: Use dynamic lockdep keys for workqueues")
      Link: https://lkml.kernel.org/r/20190303220046.29448-1-bvanassche@acm.org
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • locking/lockdep: Only call init_rcu_head() after RCU has been initialized · 0126574f
      Bart Van Assche authored
      
      init_data_structures_once() is called for the first time before RCU has
      been initialized. Make sure that init_rcu_head() is called before the
      RCU head is used and after RCU has been initialized.
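
      One way to picture the ordering constraint is a run-once initializer
      with two independent "done" flags. The sketch below is an assumption
      about the general pattern, not the lockdep code: demo_rcu_ready and
      the counters are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

static bool demo_rcu_ready;      /* hypothetical "RCU is initialized" flag */
static int demo_ds_inits;        /* how often the early init ran */
static int demo_rcu_head_inits;  /* how often the deferred init ran */

static void demo_init_once(void)
{
    static bool ds_done, rcu_head_done;

    if (rcu_head_done)
        return;                  /* both parts are already initialized */

    if (demo_rcu_ready) {
        demo_rcu_head_inits++;   /* deferred part: requires RCU */
        rcu_head_done = true;
    }

    if (ds_done)
        return;
    ds_done = true;
    demo_ds_inits++;             /* early part: safe before RCU is up */
}
```

      Calls made before demo_rcu_ready is set run only the early part; the
      first call afterwards completes the deferred part exactly once.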
      
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: longman@redhat.com
      Link: https://lkml.kernel.org/r/c20aa0f0-42ab-a884-d931-7d4ec2bf0cdc@acm.org
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • locking/lockdep: Avoid a Clang warning · 3fe7522f
      Arnd Bergmann authored
      
      Clang warns about a tentative array definition without a length:
      
        kernel/locking/lockdep.c:845:12: error: tentative array definition assumed to have one element [-Werror]
      
      There is no real reason to do this here, so just set the same length as
      in the real definition later in the same file.  It has to be hidden in
      an #ifdef or annotated __maybe_unused though, to avoid the unused-variable
      warning if CONFIG_PROVE_LOCKING is disabled.
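
      A small stand-alone sketch of the idea, assuming a hypothetical array
      demo_table and size DEMO_MAX_ENTRIES (neither is taken from lockdep.c):
      the forward declaration carries the same explicit length as the later
      definition, so the array type is complete at its first use.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_ENTRIES 8   /* hypothetical length, for illustration only */

struct demo_entry {
    int id;
};

/*
 * Forward declaration with an explicit length. Writing
 * "static struct demo_entry demo_table[];" here instead is the form that
 * draws the "tentative array definition assumed to have one element"
 * diagnostic quoted above.
 */
static struct demo_entry demo_table[DEMO_MAX_ENTRIES];

static size_t demo_table_len(void)
{
    /* sizeof works because the array type is already complete here. */
    return sizeof(demo_table) / sizeof(demo_table[0]);
}

/* The real definition later in the file uses the same length. */
static struct demo_entry demo_table[DEMO_MAX_ENTRIES] = { { 0 } };
```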
      
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Frederic Weisbecker <frederic@kernel.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Waiman Long <longman@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: https://lkml.kernel.org/r/20190307075222.3424524-1-arnd@arndb.de
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/core: Mark expected switch fall-through · 43aa378b
      Gustavo A. R. Silva authored
      
      In preparation for enabling -Wimplicit-fallthrough, mark switch cases
      where we are expecting to fall through.
      
      This patch fixes the following warning:
      
        kernel/events/core.c: In function ‘perf_event_parse_addr_filter’:
        kernel/events/core.c:9154:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
            kernel = 1;
            ~~~~~~~^~~
        kernel/events/core.c:9156:3: note: here
           case IF_SRC_FILEADDR:
           ^~~~
      
      Warning level 3 was used: -Wimplicit-fallthrough=3
      
      This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough.
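
      As a minimal sketch of what such an annotation looks like, the example
      below marks an intended fall-through with a comment, which GCC's
      documentation lists among the forms accepted at warning level 3. The
      enum values and struct are hypothetical and only loosely mirror the
      warning quoted above.

```c
#include <assert.h>

/* Hypothetical token kinds; not the identifiers used in kernel/events/core.c. */
enum demo_token { DEMO_KERNEL_ADDR, DEMO_FILE_ADDR, DEMO_OTHER };

struct demo_parsed {
    int kernel;
    int has_addr;
};

static struct demo_parsed demo_parse(enum demo_token tok)
{
    struct demo_parsed p = { 0, 0 };

    switch (tok) {
    case DEMO_KERNEL_ADDR:
        p.kernel = 1;
        /* fall through */
    case DEMO_FILE_ADDR:
        p.has_addr = 1;
        break;
    default:
        break;
    }
    return p;
}
```

      The kernel-address case deliberately shares the address handling of the
      file-address case; the comment documents that the missing break is
      intentional.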
      
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: https://lkml.kernel.org/r/20190212205430.GA8446@embeddedor
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/ring_buffer: Use high order allocations for AUX buffers optimistically · 5768402f
      Alexander Shishkin authored
      
      Currently, the AUX buffer allocator will use high-order allocations
      for PMUs that don't support hardware scatter-gather chaining to ensure
      large contiguous blocks of pages, and always use an array of single
      pages otherwise.
      
      There is, however, a tangible performance benefit in using larger chunks
      of contiguous memory even in the latter case, that comes from not having
      to fetch the next page's address at every page boundary. In particular,
      a task running under Intel PT on an Atom CPU shows 1.5%-2% less runtime
      penalty with a single multi-page output region in snapshot mode (no PMI)
      than with multiple single-page output regions, from ~6% down to ~4%. For
      the snapshot mode it does make a difference as it is intended to run over
      long periods of time.
      
      For this reason, change the allocation policy to always optimistically
      start with the highest possible order when allocating pages for the AUX
      buffer, descending until the allocation succeeds or order zero allocation
      fails.
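
      The descending-order policy can be sketched in user space as follows.
      This is an illustration under assumptions, not the ring buffer code:
      demo_try_alloc_order() is a hypothetical allocator that simply refuses
      any order above max_ok, and the 4096-byte page size is assumed.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_PAGE_SIZE 4096u   /* assumed page size for the sketch */

/* Hypothetical allocator: pretends that orders above max_ok always fail. */
static void *demo_try_alloc_order(int order, int max_ok)
{
    if (order > max_ok)
        return NULL;
    return malloc((size_t)DEMO_PAGE_SIZE << order);
}

/*
 * Start at the highest order and descend until an allocation succeeds
 * or the order-0 attempt fails. On success, *got_order holds the order
 * that was actually used; on failure it is left unchanged.
 */
static void *demo_alloc_descending(int start_order, int max_ok, int *got_order)
{
    for (int order = start_order; order >= 0; order--) {
        void *p = demo_try_alloc_order(order, max_ok);

        if (p) {
            *got_order = order;
            return p;
        }
    }
    return NULL;
}
```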
      
      Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: https://lkml.kernel.org/r/20190215114727.62648-2-alexander.shishkin@linux.intel.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. Mar 08, 2019
  4. Mar 07, 2019
  5. Mar 06, 2019
    • dma: Introduce dma_max_mapping_size() · 133d624b
      Joerg Roedel authored
      
      The function returns the maximum size that can be mapped
      using DMA-API functions. The patch also adds the
      implementation for direct DMA and a new dma_map_ops pointer
      so that other implementations can expose their limit.
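
      The dispatch described here, a direct-mapping limit plus an optional
      per-implementation callback, might look roughly like the model below.
      All demo_* types and functions are hypothetical; this is not the
      kernel's struct dma_map_ops or its actual function body, and treating a
      NULL ops pointer as "direct" is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_device;

/* Hypothetical ops table with an optional limit callback. */
struct demo_dma_ops {
    size_t (*max_mapping_size)(struct demo_device *dev);
};

struct demo_device {
    const struct demo_dma_ops *ops;  /* NULL means "direct" in this sketch */
    size_t direct_limit;             /* limit used on the direct path */
};

static size_t demo_max_mapping_size(struct demo_device *dev)
{
    if (!dev->ops)
        return dev->direct_limit;
    if (dev->ops->max_mapping_size)
        return dev->ops->max_mapping_size(dev);
    return SIZE_MAX;                 /* implementation reports no limit */
}

/* Example callback for an implementation that exposes a 64 KiB limit. */
static size_t demo_limit_64k(struct demo_device *dev)
{
    (void)dev;
    return 64 * 1024;
}
```

      A caller can then clamp its request size to the returned value without
      knowing which implementation backs the device.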
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • swiotlb: Add is_swiotlb_active() function · 492366f7
      Joerg Roedel authored
      
      This function will be used from dma_direct code to determine
      the maximum segment size of a dma mapping.
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • swiotlb: Introduce swiotlb_max_mapping_size() · abe420bf
      Joerg Roedel authored
      
      The function returns the maximum size that can be remapped
      by the SWIOTLB implementation. This function will be later
      exposed to users through the DMA-API.
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • ipc: Fix building compat mode without sysvipc · 7e89a37c
      Arnd Bergmann authored
      
      As John Stultz noticed, my y2038 syscall series caused a link
      failure when CONFIG_SYSVIPC is disabled but CONFIG_COMPAT is
      enabled:
      
      arch/arm64/kernel/sys32.o:(.rodata+0x960): undefined reference to `__arm64_compat_sys_old_semctl'
      arch/arm64/kernel/sys32.o:(.rodata+0x980): undefined reference to `__arm64_compat_sys_old_msgctl'
      arch/arm64/kernel/sys32.o:(.rodata+0x9a0): undefined reference to `__arm64_compat_sys_old_shmctl'
      
      Add the missing entries in kernel/sys_ni.c for the new system
      calls.
      
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • kernel: cgroup: add poll file operation · dc50537b
      Johannes Weiner authored
      Cgroup has a standardized poll/notification mechanism for waking all
      pollers on all fds when a filesystem node changes.  To allow polling for
      custom events, add a .poll callback that can override the default.
      
      This is in preparation for pollable cgroup pressure files which have
      per-fd trigger configurations.
      
      Link: http://lkml.kernel.org/r/20190124211518.244221-3-surenb@google.com
      
      
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, compaction: capture a page under direct compaction · 5e1f0f09
      Mel Gorman authored
      Compaction is inherently race-prone as a suitable page freed during
      compaction can be allocated by any parallel task.  This patch uses a
      capture_control structure to isolate a page immediately when it is freed
      by a direct compactor in the slow path of the page allocator.  The
      intent is to avoid redundant scanning.
      
                                           5.0.0-rc1              5.0.0-rc1
                                     selective-v3r17          capture-v3r19
      Amean     fault-both-1         0.00 (   0.00%)        0.00 *   0.00%*
      Amean     fault-both-3      2582.11 (   0.00%)     2563.68 (   0.71%)
      Amean     fault-both-5      4500.26 (   0.00%)     4233.52 (   5.93%)
      Amean     fault-both-7      5819.53 (   0.00%)     6333.65 (  -8.83%)
      Amean     fault-both-12     9321.18 (   0.00%)     9759.38 (  -4.70%)
      Amean     fault-both-18     9782.76 (   0.00%)    10338.76 (  -5.68%)
      Amean     fault-both-24    15272.81 (   0.00%)    13379.55 *  12.40%*
      Amean     fault-both-30    15121.34 (   0.00%)    16158.25 (  -6.86%)
      Amean     fault-both-32    18466.67 (   0.00%)    18971.21 (  -2.73%)
      
      Latency is only moderately affected but the devil is in the details.  A
      closer examination indicates that base page fault latency is reduced but
      latency of huge pages is increased as it takes greater care to succeed.
      Part of the "problem" is that allocation success rates are close to 100%
      even when under pressure and compaction gets harder.
      
                                      5.0.0-rc1              5.0.0-rc1
                                selective-v3r17          capture-v3r19
      Percentage huge-3        96.70 (   0.00%)       98.23 (   1.58%)
      Percentage huge-5        96.99 (   0.00%)       95.30 (  -1.75%)
      Percentage huge-7        94.19 (   0.00%)       97.24 (   3.24%)
      Percentage huge-12       94.95 (   0.00%)       97.35 (   2.53%)
      Percentage huge-18       96.74 (   0.00%)       97.30 (   0.58%)
      Percentage huge-24       97.07 (   0.00%)       97.55 (   0.50%)
      Percentage huge-30       95.69 (   0.00%)       98.50 (   2.95%)
      Percentage huge-32       96.70 (   0.00%)       99.27 (   2.65%)
      
      And scan rates are reduced as expected by 6% for the migration scanner
      and 29% for the free scanner indicating that there is less redundant
      work.
      
      Compaction migrate scanned    20815362    19573286
      Compaction free scanned       16352612    11510663
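
      The quoted reductions follow from the two counter rows above. A small
      helper (the function name is hypothetical) makes the arithmetic
      explicit: the migration scanner drops by roughly 6.0% and the free
      scanner by roughly 29.6%.

```c
#include <assert.h>

/* Percentage reduction from "before" to "after"; assumes before > 0. */
static double demo_reduction_pct(unsigned long before, unsigned long after)
{
    return 100.0 * ((double)before - (double)after) / (double)before;
}
```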
      
      [mgorman@techsingularity.net: remove redundant check]
        Link: http://lkml.kernel.org/r/20190201143853.GH9565@techsingularity.net
      Link: http://lkml.kernel.org/r/20190118175136.31341-23-mgorman@techsingularity.net
      
      
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: remove sysctl_extfrag_handler() · 6b7e5cad
      Matthew Wilcox authored
      sysctl_extfrag_handler() neglects to propagate the return value from
      proc_dointvec_minmax() to its caller.  It's a wrapper that doesn't need
      to exist, so just use proc_dointvec_minmax() directly.
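
      The class of bug is easy to reproduce outside the kernel. In the sketch
      below, demo_set_minmax() is a hypothetical range-checking helper that
      stands in for proc_dointvec_minmax(); its signature and the 0..1000
      range are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: stores val only if it lies within [min, max]. */
static int demo_set_minmax(int *dst, int val, int min, int max)
{
    if (val < min || val > max)
        return -EINVAL;
    *dst = val;
    return 0;
}

/* Buggy wrapper: calls the helper but always reports success. */
static int demo_handler_buggy(int *dst, int val)
{
    demo_set_minmax(dst, val, 0, 1000);
    return 0;
}

/* Without the extra layer, the caller sees the helper's result directly. */
static int demo_handler_direct(int *dst, int val)
{
    return demo_set_minmax(dst, val, 0, 1000);
}
```

      With the buggy wrapper an out-of-range write is silently ignored while
      the caller is told it succeeded; the direct call reports -EINVAL.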
      
      Link: http://lkml.kernel.org/r/20190104032557.3056-1-willy@infradead.org
      
      
      Signed-off-by: Matthew Wilcox <willy@infradead.org>
      Reported-by: Aditya Pakki <pakki001@umn.edu>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: replace all open encodings for NUMA_NO_NODE · 98fa15f3
      Anshuman Khandual authored
      Patch series "Replace all open encodings for NUMA_NO_NODE", v3.
      
      All these places for replacement were found by running the following
      grep patterns on the entire kernel code.  Please let me know if this
      might have missed some instances.  This might also have replaced some
      false positives.  I will appreciate suggestions, inputs and review.
      
      1. git grep "nid == -1"
      2. git grep "node == -1"
      3. git grep "nid = -1"
      4. git grep "node = -1"
      
      This patch (of 2):
      
      At present there are multiple places where an invalid node number is
      encoded as -1.  Even though this is implicitly understood, it is always
      better to use a macro.  Replace these open encodings for an invalid node
      number with the global macro NUMA_NO_NODE.  This helps remove NUMA
      related assumptions like 'invalid node' from various places redirecting
      them to a common definition.
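
      A before/after illustration of the replacement, using a local macro
      DEMO_NO_NODE as a stand-in for NUMA_NO_NODE (which the kernel defines
      as -1); demo_pick_node() is a hypothetical caller, not one of the sites
      touched by this patch.

```c
#include <assert.h>

#define DEMO_NO_NODE (-1)   /* stand-in for the kernel's NUMA_NO_NODE */

/* Fall back to the local node when the caller did not request one. */
static int demo_pick_node(int requested, int local)
{
    /* Open-coded form being replaced: if (requested == -1) */
    if (requested == DEMO_NO_NODE)
        return local;
    return requested;
}
```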
      
      Link: http://lkml.kernel.org/r/1545127933-10711-2-git-send-email-anshuman.khandual@arm.com
      
      
      Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	[ixgbe]
      Acked-by: Jens Axboe <axboe@kernel.dk>			[mtip32xx]
      Acked-by: Vinod Koul <vkoul@kernel.org>			[dmaengine.c]
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>		[powerpc]
      Acked-by: Doug Ledford <dledford@redhat.com>		[drivers/infiniband]
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Cc: Hans Verkuil <hverkuil@xs4all.nl>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • PM/Hibernate: exclude all PageOffline() pages · abd02ac6
      David Hildenbrand authored
      The content of pages that are marked PG_offline (e.g. inflated by a
      balloon driver) is not of interest, so let's skip these pages.
      
      In saveable_highmem_page(), move the PageReserved() check to a new check
      along with the PageOffline() check to separate it from the swsusp
      checks.
      
      [david@redhat.com: v2]
        Link: http://lkml.kernel.org/r/20181122100627.5189-9-david@redhat.com
      Link: http://lkml.kernel.org/r/20181119101616.8901-9-david@redhat.com
      
      
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Christian Hansen <chansen3@cisco.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Julien Freche <jfreche@vmware.com>
      Cc: Kairui Song <kasong@redhat.com>
      Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Lianbo Jiang <lijiang@redhat.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Miles Chen <miles.chen@mediatek.com>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Omar Sandoval <osandov@fb.com>
      Cc: Pankaj gupta <pagupta@redhat.com>
      Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Stefano Stabellini <sstabellini@kernel.org>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Xavier Deguillard <xdeguillard@vmware.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • PM/Hibernate: use pfn_to_online_page() · 5b56db37
      David Hildenbrand authored
      Let's use pfn_to_online_page() instead of pfn_to_page() when checking
      for saveable pages to not save/restore offline memory sections.
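
      The effect of an "online only" lookup can be modelled in a few lines.
      Everything below is a hypothetical user-space model (section size,
      online map, and demo_* names are assumptions), not the kernel's
      pfn_to_online_page() implementation: the lookup returns NULL for any
      pfn whose section is offline, so the caller skips it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_PFNS_PER_SECTION 4
#define DEMO_NR_SECTIONS 3
#define DEMO_NR_PFNS (DEMO_NR_SECTIONS * DEMO_PFNS_PER_SECTION)

struct demo_page {
    int unused;
};

static struct demo_page demo_pages[DEMO_NR_PFNS];

/* Section 1 is offline in this model. */
static const bool demo_section_online[DEMO_NR_SECTIONS] = { true, false, true };

/* Return the page for pfn, or NULL if its section is offline or out of range. */
static struct demo_page *demo_pfn_to_online_page(unsigned long pfn)
{
    if (pfn >= DEMO_NR_PFNS)
        return NULL;
    if (!demo_section_online[pfn / DEMO_PFNS_PER_SECTION])
        return NULL;
    return &demo_pages[pfn];
}

/* Count the pages a save pass would consider, skipping offline sections. */
static int demo_count_saveable(void)
{
    int n = 0;

    for (unsigned long pfn = 0; pfn < DEMO_NR_PFNS; pfn++)
        if (demo_pfn_to_online_page(pfn))
            n++;
    return n;
}
```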
      
      Link: http://lkml.kernel.org/r/20181119101616.8901-8-david@redhat.com
      
      
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Suggested-by: Michal Hocko <mhocko@kernel.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Christian Hansen <chansen3@cisco.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Julien Freche <jfreche@vmware.com>
      Cc: Kairui Song <kasong@redhat.com>
      Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Lianbo Jiang <lijiang@redhat.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Miles Chen <miles.chen@mediatek.com>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Omar Sandoval <osandov@fb.com>
      Cc: Pankaj gupta <pagupta@redhat.com>
      Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Stefano Stabellini <sstabellini@kernel.org>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Xavier Deguillard <xdeguillard@vmware.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kexec: export PG_offline to VMCOREINFO · e04b742f
      David Hildenbrand authored
      Right now, pages inflated as part of a balloon driver will be dumped by
      dump tools like makedumpfile.  While XEN is able to check in the crash
      kernel whether a certain pfn is actually backed by memory in the
      hypervisor (see xen_oldmem_pfn_is_ram) and optimize this case, dumps of
      other balloon inflated memory will essentially result in zero pages
      getting allocated by the hypervisor and the dump getting filled with
      this data.
      
      The allocation and reading of zero pages can directly be avoided if a
      dumping tool could know which pages only contain stale information not
      to be dumped.
      
      We now have PG_offline which can be (and already is by virtio-balloon)
      used for marking pages as logically offline.  Follow up patches will
      make use of this flag also in other balloon implementations.
      
      Let's export PG_offline via PAGE_OFFLINE_MAPCOUNT_VALUE, so makedumpfile
      can directly skip pages that are logically offline and the content
      therefore stale.
      
      Please note that this is also helpful for a problem we were seeing under
      Hyper-V: Dumping logically offline memory (pages kept fake offline while
      onlining a section via online_page_callback) would under some conditions
      result in a kernel panic when dumping them.
      
      Link: http://lkml.kernel.org/r/20181119101616.8901-4-david@redhat.com
      
      
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Dave Young <dyoung@redhat.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Omar Sandoval <osandov@fb.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Lianbo Jiang <lijiang@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
      Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Christian Hansen <chansen3@cisco.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Julien Freche <jfreche@vmware.com>
      Cc: Kairui Song <kasong@redhat.com>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Miles Chen <miles.chen@mediatek.com>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Pankaj gupta <pagupta@redhat.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Stefano Stabellini <sstabellini@kernel.org>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Xavier Deguillard <xdeguillard@vmware.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. Mar 05, 2019