  1. Sep 09, 2017
    • init/main.c: extract early boot entropy from the passed cmdline · 33d72f38
      Daniel Micay authored
      Feed the boot command line to the /dev/random entropy pool
      
      Existing Android bootloaders usually pass data on the kernel command
      line which may not be known to an external attacker. The same may be
      true on other embedded systems. Here is a sample command line from a
      Google Pixel running CopperheadOS:
      
          console=ttyHSL0,115200,n8 androidboot.console=ttyHSL0
          androidboot.hardware=sailfish user_debug=31 ehci-hcd.park=3
          lpm_levels.sleep_disabled=1 cma=32M@0-0xffffffff buildvariant=user
          veritykeyid=id:dfcb9db0089e5b3b4090a592415c28e1cb4545ab
          androidboot.bootdevice=624000.ufshc androidboot.verifiedbootstate=yellow
          androidboot.veritymode=enforcing androidboot.keymaster=1
          androidboot.serialno=FA6CE0305299 androidboot.baseband=msm
          mdss_mdp.panel=1:dsi:0:qcom,mdss_dsi_samsung_ea8064tg_1080p_cmd:1:none:cfg:single_dsi
          androidboot.slot_suffix=_b fpsimd.fpsimd_settings=0
          app_setting.use_app_setting=0 kernelflag=0x00000000 debugflag=0x00000000
          androidboot.hardware.revision=PVT radioflag=0x00000000
          radioflagex1=0x00000000 radioflagex2=0x00000000 cpumask=0x00000000
          androidboot.hardware.ddr=4096MB,Hynix,LPDDR4 androidboot.ddrinfo=00000006
          androidboot.ddrsize=4GB androidboot.hardware.color=GRA00
          androidboot.hardware.ufs=32GB,Samsung androidboot.msm.hw_ver_id=268824801
          androidboot.qf.st=2 androidboot.cid=11111111 androidboot.mid=G-2PW4100
          androidboot.bootloader=8996-012001-1704121145
          androidboot.oem_unlock_support=1 androidboot.fp_src=1
          androidboot.htc.hrdump=detected androidboot.ramdump.opt=mem@2g:2g,mem@4g:2g
          androidboot.bootreason=reboot androidboot.ramdump_enable=0 ro
          root=/dev/dm-0 dm="system none ro,0 1 android-verity /dev/sda34"
          rootwait skip_initramfs init=/init androidboot.wificountrycode=US
          androidboot.boottime=1BLL:85,1BLE:669,2BLL:0,2BLE:1777,SW:6,KL:8136
      
      Among other things, it contains a value unique to the device
      (androidboot.serialno=FA6CE0305299), unique to the OS builds for the
      device variant (veritykeyid=id:dfcb9db0089e5b3b4090a592415c28e1cb4545ab)
      and timings from the bootloader stages in milliseconds
      (androidboot.boottime=1BLL:85,1BLE:669,2BLL:0,2BLE:1777,SW:6,KL:8136).
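
      The mechanism itself is a single call early in boot. A minimal sketch
      of the idea, assuming the add_device_randomness() interface from
      drivers/char/random.c (the placement inside start_kernel() is
      illustrative):

        /* init/main.c (sketch), after setup_arch() has parsed the cmdline */
        setup_arch(&command_line);
        /*
         * Mix the boot command line into the entropy pool. The bytes are
         * not credited as entropy, but an attacker who does not know
         * values such as androidboot.serialno cannot predict the pool
         * state either.
         */
        add_device_randomness(command_line, strlen(command_line));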
      
      [tytso@mit.edu: changelog tweak]
      [labbott@redhat.com: line-wrapped command line]
      Link: http://lkml.kernel.org/r/20170816231458.2299-3-labbott@redhat.com
      
      
      Signed-off-by: Daniel Micay <danielmicay@gmail.com>
      Signed-off-by: Laura Abbott <labbott@redhat.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Laura Abbott <lauraa@codeaurora.org>
      Cc: Nick Kralevich <nnk@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • init: move stack canary initialization after setup_arch · 121388a3
      Laura Abbott authored
      Patch series "Command line randomness", v3.
      
      A series to add the kernel command line as a source of randomness.
      
      This patch (of 2):
      
      Stack canary initialization involves getting a random number. Getting
      this random number may involve accessing caches or other
      architecture-specific features which are not available until after
      the architecture is set up. Move the stack canary initialization
      later to accommodate this.
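
      A sketch of the reordering in start_kernel(), assuming the usual
      boot_init_stack_canary() entry point (surrounding calls elided):

        /* init/main.c (sketch) */
        setup_arch(&command_line);
        /*
         * Moved here from earlier in start_kernel(): picking the canary
         * needs a random number, which may use caches or arch-specific
         * RNG state that only works once setup_arch() has run.
         */
        boot_init_stack_canary();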
      
      Link: http://lkml.kernel.org/r/20170816231458.2299-2-labbott@redhat.com
      
      
      Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
      Signed-off-by: Laura Abbott <labbott@redhat.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Daniel Micay <danielmicay@gmail.com>
      Cc: Nick Kralevich <nnk@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. Sep 07, 2017
    • mm, memory_hotplug: drop zone from build_all_zonelists · 72675e13
      Michal Hocko authored
      build_all_zonelists gets a zone parameter to initialize the zone's
      pagesets. There is only a single caller that passes a non-NULL zone,
      and that one doesn't really need the rest of build_all_zonelists (see
      commit 6dcd73d7 ("memory-hotplug: allocate zone's pcp before
      onlining pages")).
      
      Therefore remove setup_zone_pageset from build_all_zonelists and call
      it from its only user directly. This also removes a pointless
      zonelist rebuild, which is always good.
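
      A sketch of the resulting shape in the one caller that passed a
      non-NULL zone (mm/memory_hotplug.c; the surrounding logic is elided
      and hedged):

        /* mm/memory_hotplug.c (sketch) */
        if (!populated_zone(zone)) {
                need_zonelists_rebuild = 1;
                /* Previously buried inside build_all_zonelists(zone);
                 * only the pageset setup is actually needed here. */
                setup_zone_pageset(zone);
        }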
      
      Link: http://lkml.kernel.org/r/20170721143915.14161-5-mhocko@kernel.org
      
      
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Shaohua Li <shaohua.li@intel.com>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: add SLUB free list pointer obfuscation · 2482ddec
      Kees Cook authored
      This SLUB free list pointer obfuscation code is modified from Brad
      Spengler/PaX Team's code in the last public patch of grsecurity/PaX
      based on my understanding of the code.  Changes or omissions from the
      original code are mine and don't reflect the original grsecurity/PaX
      code.
      
      This adds a per-cache random value to SLUB caches that is XORed with
      their freelist pointer address and value.  This adds nearly zero
      overhead and frustrates the very common heap overflow exploitation
      method of overwriting freelist pointers.
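
      The core of the hardening is a small helper, modeled here on the one
      this patch adds to mm/slub.c, where s->random is the per-cache secret
      picked at cache creation:

        /* mm/slub.c (sketch): XOR is its own inverse, so one helper both
         * obfuscates a freelist pointer on store and recovers it on load.
         */
        static inline void *freelist_ptr(const struct kmem_cache *s, void *ptr,
                                         unsigned long ptr_addr)
        {
        #ifdef CONFIG_SLAB_FREELIST_HARDENED
                /* Mix in the per-cache random value and the address the
                 * pointer is stored at, so a value forged or leaked for
                 * one slot is useless for any other slot or cache. */
                return (void *)((unsigned long)ptr ^ s->random ^ ptr_addr);
        #else
                return ptr;
        #endif
        }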
      
      A recent example of the attack is written up here:
      
        http://cyseclabs.com/blog/cve-2016-6187-heap-off-by-one-exploit
      
      and there is a section dedicated to the technique in the book "A
      Guide to Kernel Exploitation: Attacking the Core".
      
      This is based on patches by Daniel Micay, and refactored to minimize the
      use of #ifdef.
      
      With 200-count cycles of "hackbench -g 20 -l 1000" I saw the following
      run times:
      
        before:
          mean     10.11882499999999999995
          variance  0.03320378329145728642
          stdev     0.18221905304181911048
      
        after:
          mean     10.12654000000000000014
          variance  0.04700556623115577889
          stdev     0.21680767106160192064
      
      The difference gets lost in the noise, but if the above is to be
      taken literally, using CONFIG_SLAB_FREELIST_HARDENED is 0.07% slower.
      
      Link: http://lkml.kernel.org/r/20170802180609.GA66807@beast
      
      
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Suggested-by: Daniel Micay <danielmicay@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Tycho Andersen <tycho@docker.com>
      Cc: Alexander Popov <alex.popov@linux.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. Aug 14, 2017
    • debugobjects: Make kmemleak ignore debug objects · caba4cbb
      Waiman Long authored
      
      The allocated debug objects are either on the free list or in the
      hashed bucket lists, so they won't get lost. However, if both debug
      objects and kmemleak are enabled and a kmemleak scan runs while some
      of the debug objects are transitioning from one list to the other,
      false positive reporting of memory leaks may happen for those
      objects. For example,
      
      [38687.275678] kmemleak: 12 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
      unreferenced object 0xffff92e98aabeb68 (size 40):
        comm "ksmtuned", pid 4344, jiffies 4298403600 (age 906.430s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 d0 bc db 92 e9 92 ff ff  ................
          01 00 00 00 00 00 00 00 38 36 8a 61 e9 92 ff ff  ........86.a....
        backtrace:
          [<ffffffff8fa5378a>] kmemleak_alloc+0x4a/0xa0
          [<ffffffff8f47c019>] kmem_cache_alloc+0xe9/0x320
          [<ffffffff8f62ed96>] __debug_object_init+0x3e6/0x400
          [<ffffffff8f62ef01>] debug_object_activate+0x131/0x210
          [<ffffffff8f330d9f>] __call_rcu+0x3f/0x400
          [<ffffffff8f33117d>] call_rcu_sched+0x1d/0x20
          [<ffffffff8f4a183c>] put_object+0x2c/0x40
          [<ffffffff8f4a188c>] __delete_object+0x3c/0x50
          [<ffffffff8f4a18bd>] delete_object_full+0x1d/0x20
          [<ffffffff8fa535c2>] kmemleak_free+0x32/0x80
          [<ffffffff8f47af07>] kmem_cache_free+0x77/0x350
          [<ffffffff8f453912>] unlink_anon_vmas+0x82/0x1e0
          [<ffffffff8f440341>] free_pgtables+0xa1/0x110
          [<ffffffff8f44af91>] exit_mmap+0xc1/0x170
          [<ffffffff8f29db60>] mmput+0x80/0x150
          [<ffffffff8f2a7609>] do_exit+0x2a9/0xd20
      
      The references held in the debug objects may also hide a real memory
      leak.
      
      As there is no point in having kmemleak track debug object
      allocations, kmemleak checking is now disabled for debug objects.
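
      The fix amounts to one call at allocation time; a sketch of the
      pattern in lib/debugobjects.c, using kmemleak_ignore() right after
      the slab allocation:

        /* lib/debugobjects.c (sketch) */
        new = kmem_cache_zalloc(obj_cache, gfp);
        if (!new)
                return;
        /* Debug objects always sit on the free list or in the hash
         * buckets, so kmemleak gains nothing by tracking them, and
         * scanning them mid-transition produces the bogus reports
         * shown above. */
        kmemleak_ignore(new);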
      
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1502718733-8527-1-git-send-email-longman@redhat.com
  4. Jul 26, 2017
    • percpu: replace area map allocator with bitmap · 40064aec
      Dennis Zhou (Facebook) authored
      
      The percpu memory allocator is experiencing scalability issues when
      allocating and freeing large numbers of counters as in BPF.
      Additionally, there is a corner case where iteration is triggered over
      all chunks if the contig_hint is the right size, but wrong alignment.
      
      This patch replaces the area map allocator with a basic bitmap allocator
      implementation. Each subsequent patch will introduce new features and
      replace full scanning functions with faster non-scanning options when
      possible.
      
      Implementation:
      This patchset removes the area map allocator in favor of a bitmap
      allocator backed by metadata blocks. The primary goal is to provide
      consistency in performance and memory footprint with a focus on small
      allocations (< 64 bytes). The bitmap removes the heavy memmove from the
      freeing critical path and provides a consistent memory footprint. The
      metadata blocks provide a bound on the amount of scanning required by
      maintaining a set of hints.
      
      In an effort to make freeing fast, the metadata is updated on the free
      path if the new free area makes a page free, a block free, or spans
      across blocks. This causes the chunk's contig hint to potentially be
      smaller than what it could allocate by up to the smaller of a page or a
      block. If the chunk's contig hint is contained within a block, a check
      occurs and the hint is kept accurate. Metadata is always kept accurate
      on allocation, so there will not be a situation where a chunk has a
      larger contig hint than is actually available.
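
      Stripped of the block-level hints, the allocation fast path reduces
      to a bitmap scan plus a set. A simplified sketch of that core (not
      the actual pcpu_alloc_area() code):

        /* Sketch: claim `bits` contiguous free allocation units in a
         * chunk's occupancy bitmap. The real allocator consults per-block
         * contig hints first, so it rarely scans a whole chunk. */
        static long pcpu_bitmap_alloc_sketch(unsigned long *alloc_map,
                                             unsigned long chunk_bits,
                                             unsigned int bits,
                                             unsigned long align_mask)
        {
                unsigned long off;

                off = bitmap_find_next_zero_area(alloc_map, chunk_bits, 0,
                                                 bits, align_mask);
                if (off >= chunk_bits)
                        return -ENOSPC;    /* no fit in this chunk */

                bitmap_set(alloc_map, off, bits);  /* claim the area */
                return off;                /* offset in allocation units */
        }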
      
      Evaluation:
      I have primarily done testing against a simple workload of allocating
      1 million objects (2^20) of varying size. Deallocation was done in
      order, alternating, and in reverse. These numbers were collected
      after rebasing on top of a80099a1. I present the worst-case numbers
      here:
      
        Area Map Allocator:
      
              Object Size | Alloc Time (ms) | Free Time (ms)
              ----------------------------------------------
                    4B    |        310      |     4770
                   16B    |        557      |     1325
                   64B    |        436      |      273
                  256B    |        776      |      131
                 1024B    |       3280      |      122
      
        Bitmap Allocator:
      
              Object Size | Alloc Time (ms) | Free Time (ms)
              ----------------------------------------------
                    4B    |        490      |       70
                   16B    |        515      |       75
                   64B    |        610      |       80
                  256B    |        950      |      100
                 1024B    |       3520      |      200
      
      This data demonstrates the inability of the area map allocator to
      handle less-than-ideal situations. In the best case of reverse
      deallocation, the area map allocator was able to perform within range
      of the bitmap allocator. In the worst case, freeing took nearly
      5 seconds for 1 million 4-byte objects. The bitmap allocator
      dramatically improves the consistency of the free path. The small
      allocations performed nearly identically regardless of the freeing
      pattern.
      
      While the bitmap allocator does add to the allocation latency, the
      allocation scenario here is optimal for the area map allocator. The
      area map allocator runs
      into trouble when it is allocating in chunks where the latter half is
      full. It is difficult to replicate this, so I present a variant where
      the pages are second half filled. Freeing was done sequentially. Below
      are the numbers for this scenario:
      
        Area Map Allocator:
      
              Object Size | Alloc Time (ms) | Free Time (ms)
              ----------------------------------------------
                    4B    |       4118      |     4892
                   16B    |       1651      |     1163
                   64B    |        598      |      285
                  256B    |        771      |      158
                 1024B    |       3034      |      160
      
        Bitmap Allocator:
      
              Object Size | Alloc Time (ms) | Free Time (ms)
              ----------------------------------------------
                    4B    |        481      |       67
                   16B    |        506      |       69
                   64B    |        636      |       75
                  256B    |        892      |       90
                 1024B    |       3262      |      147
      
      The data shows a parabolic performance curve for the area map
      allocator. This is because the memmove operation dominates at smaller
      object sizes, where more objects are packed into a chunk, while at
      larger object sizes the traversal of the chunk slots becomes the
      dominant cost. The bitmap allocator suffers the traversal problem as
      well. The data above shows that the allocation path of the area map
      allocator does not scale, while the bitmap allocator delivers
      consistent performance in general.
      
      The second problem, the additional scanning, can result in the area
      map allocator taking 52 minutes to allocate 1 million 4-byte objects
      with 8-byte alignment. The same workload takes approximately
      16 seconds to complete with the bitmap allocator.
      
      V2:
      Fixed a bug in pcpu_alloc_first_chunk: end_offset was setting the
      bitmap using bytes instead of bits.
      
      Added a comment to pcpu_cnt_pop_pages to explain bitmap_weight.
      
      Signed-off-by: Dennis Zhou <dennisszhou@gmail.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
  5. Jul 18, 2017
    • x86, swiotlb: Add memory encryption support · c7753208
      Tom Lendacky authored
      
      Since DMA addresses will effectively look like 48-bit addresses when
      the memory encryption mask is set, SWIOTLB is needed if the DMA mask
      of the device performing the DMA does not support 48 bits. SWIOTLB
      will be initialized to create decrypted bounce buffers for use by
      these devices.
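
      A sketch of the hook, following the x86 SME naming (treat the exact
      call sites as illustrative):

        /* arch/x86/mm/mem_encrypt.c (sketch) */
        void __init mem_encrypt_init(void)
        {
                if (!sme_me_mask)
                        return;         /* memory encryption not active */

                /* Mark the SWIOTLB buffer decrypted: devices whose DMA
                 * mask cannot express the encryption bit bounce their
                 * transfers through this shared, unencrypted region. */
                swiotlb_update_mem_attributes();
        }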
      
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Larry Woodman <lwoodman@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Toshimitsu Kani <toshi.kani@hpe.com>
      Cc: kasan-dev@googlegroups.com
      Cc: kvm@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-efi@vger.kernel.org
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/aa2d29b78ae7d508db8881e46a3215231b9327a7.1500319216.git.thomas.lendacky@amd.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. Jul 17, 2017
    • VFS: Differentiate mount flags (MS_*) from internal superblock flags · e462ec50
      David Howells authored
      
      Differentiate the MS_* flags passed to mount(2) from the internal flags set
      in the super_block's s_flags.  s_flags are now called SB_*, with the names
      and the values for the moment mirroring the MS_* flags that they're
      equivalent to.
      
      In this patch, just the headers are altered, along with some kernel
      code where blind automated conversion isn't necessarily correct.
      
      Note that this shows up some interesting issues:
      
       (1) Some MS_* flags get translated to MNT_* flags (such as MS_NODEV ->
           MNT_NODEV) without passing this on to the filesystem, but some
           filesystems set such flags anyway.
      
       (2) The ->remount_fs() methods of some filesystems adjust the *flags
           argument by setting MS_* flags in it, such as MS_NOATIME - but these
           flags are then scrubbed by do_remount_sb() (only the occupants of
           MS_RMT_MASK are permitted: MS_RDONLY, MS_SYNCHRONOUS, MS_MANDLOCK,
           MS_I_VERSION and MS_LAZYTIME)
      
      I'm not sure of the best way to solve all these cases.
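
      A sketch of what the split looks like in include/linux/fs.h: the new
      SB_* constants deliberately mirror the MS_* values for now, so the
      two sets can coexist during the conversion:

        /* include/linux/fs.h (sketch): superblock-internal flags, kept
         * numerically identical to the MS_* mount(2) flags for the moment.
         */
        #define SB_RDONLY        1      /* Mount read-only */
        #define SB_NOSUID        2      /* Ignore suid and sgid bits */
        #define SB_NODEV         4      /* Disallow access to device special files */
        #define SB_NOEXEC        8      /* Disallow program execution */
        #define SB_SYNCHRONOUS  16      /* Writes are synced at once */
        #define SB_MANDLOCK     64      /* Allow mandatory locks on an FS */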
      
      Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb) · bc98a42c
      David Howells authored
      
      Firstly by applying the following with coccinelle's spatch:
      
      	@@ expression SB; @@
      	-SB->s_flags & MS_RDONLY
      	+sb_rdonly(SB)
      
      to effect the conversion to sb_rdonly(sb), then by applying:
      
      	@@ expression A, SB; @@
      	(
      	-(!sb_rdonly(SB)) && A
      	+!sb_rdonly(SB) && A
      	|
      	-A != (sb_rdonly(SB))
      	+A != sb_rdonly(SB)
      	|
      	-A == (sb_rdonly(SB))
      	+A == sb_rdonly(SB)
      	|
      	-!(sb_rdonly(SB))
      	+!sb_rdonly(SB)
      	|
      	-A && (sb_rdonly(SB))
      	+A && sb_rdonly(SB)
      	|
      	-A || (sb_rdonly(SB))
      	+A || sb_rdonly(SB)
      	|
      	-(sb_rdonly(SB)) != A
      	+sb_rdonly(SB) != A
      	|
      	-(sb_rdonly(SB)) == A
      	+sb_rdonly(SB) == A
      	|
      	-(sb_rdonly(SB)) && A
      	+sb_rdonly(SB) && A
      	|
      	-(sb_rdonly(SB)) || A
      	+sb_rdonly(SB) || A
      	)
      
      	@@ expression A, B, SB; @@
      	(
      	-(sb_rdonly(SB)) ? 1 : 0
      	+sb_rdonly(SB)
      	|
      	-(sb_rdonly(SB)) ? A : B
      	+sb_rdonly(SB) ? A : B
      	)
      
      to remove leftover excess bracketing and finally by applying:
      
      	@@ expression A, SB; @@
      	(
      	-(A & MS_RDONLY) != sb_rdonly(SB)
      	+(bool)(A & MS_RDONLY) != sb_rdonly(SB)
      	|
      	-(A & MS_RDONLY) == sb_rdonly(SB)
      	+(bool)(A & MS_RDONLY) == sb_rdonly(SB)
      	)
      
      to make comparisons against the result of sb_rdonly() (which is a bool)
      work correctly.
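
      For reference, the helper the conversion targets is trivial; as
      defined in include/linux/fs.h it returns bool, which is exactly why
      the last rule above has to cast (A & MS_RDONLY) before comparing:

        static inline bool sb_rdonly(const struct super_block *sb)
        {
                return sb->s_flags & SB_RDONLY;
        }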
      
      Signed-off-by: David Howells <dhowells@redhat.com>
  7. Jul 12, 2017
    • random: do not ignore early device randomness · ee7998c5
      Kees Cook authored
      The add_device_randomness() function would ignore incoming bytes if
      the crng wasn't ready. This patch additionally makes an early enough
      call to add_latent_entropy() to influence the initial stack canary,
      which is especially important on non-x86 systems, where it stays the
      same through the life of the boot.
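
      A sketch of the behavioral change in drivers/char/random.c (a hedged
      reconstruction; the essential part is dropping the early return so
      the bytes always reach the input pool):

        /* drivers/char/random.c (sketch) */
        void add_device_randomness(const void *buf, unsigned int size)
        {
                unsigned long time = random_get_entropy() ^ jiffies;
                unsigned long flags;

                /* Previously this returned after feeding the fast-init
                 * crng, so the input pool never saw early device bytes. */
                if (!crng_ready())
                        crng_fast_load(buf, size);

                spin_lock_irqsave(&input_pool.lock, flags);
                _mix_pool_bytes(&input_pool, buf, size);
                _mix_pool_bytes(&input_pool, &time, sizeof(time));
                spin_unlock_irqrestore(&input_pool.lock, flags);
        }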
      
      Link: http://lkml.kernel.org/r/20170626233038.GA48751@beast
      
      
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jessica Yu <jeyu@redhat.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Lokesh Vutla <lokeshvutla@ti.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  8. Jul 06, 2017
    • mm: allow slab_nomerge to be set at build time · 7660a6fd
      Kees Cook authored
      Some hardened environments want to build kernels with slab_nomerge
      already set (so that they do not depend on remembering to set the
      kernel command-line option). This reduces the risk of kernel heap
      overflows overwriting objects in merged caches and changes the
      requirements for cache layout control, increasing the difficulty of
      these attacks. With caches unmerged, these kinds of exploits can
      usually only damage objects in the same cache (though the risk of
      metadata exploitation is unchanged).
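
      The mechanism is just a Kconfig-driven default for the existing boot
      parameter; a sketch of the shape in mm/slab_common.c, assuming a
      CONFIG_SLAB_MERGE_DEFAULT option:

        /* mm/slab_common.c (sketch): hardened configs build with
         * SLAB_MERGE_DEFAULT=n, which flips the default here. */
        static bool slab_nomerge = !IS_ENABLED(CONFIG_SLAB_MERGE_DEFAULT);

        static int __init setup_slab_nomerge(char *str)
        {
                slab_nomerge = true;    /* the cmdline can still force it */
                return 1;
        }
        __setup("slab_nomerge", setup_slab_nomerge);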
      
      Link: http://lkml.kernel.org/r/20170620230911.GA25238@beast
      
      
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: Daniel Micay <danielmicay@gmail.com>
      Cc: David Windsor <dave@nullcore.net>
      Cc: Eric Biggers <ebiggers3@gmail.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Daniel Mack <daniel@zonque.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. May 06, 2017
    • initramfs: avoid "label at end of compound statement" error · 394e4f5d
      Linus Torvalds authored
      
      Commit 17a9be31 ("initramfs: Always do fput() and load modules after
      rootfs populate") introduced an error for the
      
          CONFIG_BLK_DEV_RAM=y
      
      case, because even though the code looks fine, the compiler really wants
      a statement after a label, or you'll get complaints:
      
        init/initramfs.c: In function 'populate_rootfs':
        init/initramfs.c:644:2: error: label at end of compound statement
      
      That commit moved the subsequent statements to outside the compound
      statement, leaving the label without any associated statements.
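
      The underlying C rule is easy to reproduce outside the kernel:
      before C23, a label must be attached to a statement, so a label
      immediately followed by a closing brace is an error. A minimal
      standalone illustration (not the kernel code itself):

        void f(int err)
        {
                if (err)
                        goto done;
                /* ... work ... */
        done:
                ;       /* without this empty statement, gcc reports
                           "label at end of compound statement" */
        }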
      
      Reported-by: Jörg Otte <jrg.otte@gmail.com>
      Fixes: 17a9be31 ("initramfs: Always do fput() and load modules after rootfs populate")
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: stable@vger.kernel.org  # if 17a9be31 gets backported
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. May 05, 2017
    • initramfs: Always do fput() and load modules after rootfs populate · 17a9be31
      Stafford Horne authored
      
      In OpenRISC we do not have a bootloader-passed initrd, but the
      built-in initramfs does contain /init and other binaries, including
      modules. The previous commit 08865514 ("initramfs: finish fput()
      before accessing any binary from initramfs") made a change to only
      call fput() if the bootloader initrd was available; this caused
      intermittent crashes for OpenRISC.
      
      This patch changes the fput() to happen unconditionally if any rootfs
      is loaded. Also, I added some comments to make it a bit clearer why
      we call unpack_to_rootfs() multiple times.
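
      A sketch of the resulting control flow in init/initramfs.c (error
      handling and the initrd details elided):

        /* init/initramfs.c (sketch) */
        static int __init populate_rootfs(void)
        {
                /* Always unpack the built-in initramfs first; on OpenRISC
                 * this is where /init and the modules come from. */
                char *err = unpack_to_rootfs(__initramfs_start,
                                             __initramfs_size);
                if (err)
                        panic("%s", err);

                if (initrd_start) {
                        /* ... unpack or copy the bootloader initrd ... */
                }

                /* Unconditional now: finish all delayed fput() work before
                 * anything execs a binary from whichever rootfs we built. */
                flush_delayed_fput();
                load_default_modules();
                return 0;
        }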
      
      Fixes: 08865514 ("initramfs: finish fput() before accessing any binary from initramfs")
      Cc: stable@vger.kernel.org
      Cc: Lokesh Vutla <lokeshvutla@ti.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Stafford Horne <shorne@gmail.com>
  11. Apr 03, 2017
    • ftrace: Have init/main.c call ftrace directly to free init memory · b80f0f6c
      Steven Rostedt (VMware) authored
      
      Relying on free_reserved_area() to call ftrace to free init memory
      proved not to be sufficient. The issue is that on x86, when
      debug_pagealloc is enabled, the init memory is not freed but simply
      marked as not present. Since ftrace was uninformed of this, starting
      function tracing still tries to update pages that are not present
      according to the page tables, causing ftrace to bug, as well as
      killing the kernel itself.
      
      Instead of relying on free_reserved_area(), have init/main.c call
      ftrace directly just before it frees the init memory. ftrace then
      uses __init_begin and __init_end to know where the init memory is
      located. Looking at all archs (and testing what I can), it appears
      that this should work for each of them.
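
      A sketch of the call order in init/main.c, with the explicit
      notification placed just before the init sections are released:

        /* init/main.c (sketch) */
        static int __ref kernel_init(void *unused)
        {
                /* ... */
                /* Tell ftrace to drop its records for everything between
                 * __init_begin and __init_end, which is about to be freed
                 * (or, with debug_pagealloc, marked not-present). */
                ftrace_free_init_mem();
                free_initmem();
                /* ... */
        }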
      
      Reported-by: kernel test robot <xiaolong.ye@intel.com>
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
  12. Apr 01, 2017
    • mm: move mm_percpu_wq initialization earlier · 597b7305
      Michal Hocko authored
      Yang Li has reported that drain_all_pages triggers a WARN_ON, which
      means that the function is called before mm_percpu_wq is initialized
      on arm64 with CMA configured:
      
        WARNING: CPU: 2 PID: 1 at mm/page_alloc.c:2423 drain_all_pages+0x244/0x25c
        Modules linked in:
        CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.11.0-rc1-next-20170310-00027-g64dfbc5 #127
        Hardware name: Freescale Layerscape 2088A RDB Board (DT)
        task: ffffffc07c4a6d00 task.stack: ffffffc07c4a8000
        PC is at drain_all_pages+0x244/0x25c
        LR is at start_isolate_page_range+0x14c/0x1f0
        [...]
         drain_all_pages+0x244/0x25c
         start_isolate_page_range+0x14c/0x1f0
         alloc_contig_range+0xec/0x354
         cma_alloc+0x100/0x1fc
         dma_alloc_from_contiguous+0x3c/0x44
         atomic_pool_init+0x7c/0x208
         arm64_dma_init+0x44/0x4c
         do_one_initcall+0x38/0x128
         kernel_init_freeable+0x1a0/0x240
         kernel_init+0x10/0xfc
         ret_from_fork+0x10/0x20
      
      Fix this by moving the whole of setup_vmstat, which is currently an
      initcall, into init_mm_internals, which is called right after the WQ
      subsystem is initialized.
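
      A sketch of the two sides of the move, assuming the
      init_mm_internals() naming from the patch:

        /* mm/vmstat.c (sketch): no longer registered as an initcall */
        void __init init_mm_internals(void)
        {
                mm_percpu_wq = alloc_workqueue("mm_percpu_wq",
                                               WQ_MEM_RECLAIM, 0);
                /* ... shepherd timer, vmstat hotplug setup, proc files ... */
        }

        /* init/main.c (sketch): called as soon as workqueues exist, long
         * before the initcalls that may end up in drain_all_pages(). */
        workqueue_init();
        init_mm_internals();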
      
      Link: http://lkml.kernel.org/r/20170315164021.28532-1-mhocko@kernel.org
      
      
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Reported-by: Yang Li <pku.leo@gmail.com>
      Tested-by: Yang Li <pku.leo@gmail.com>
      Tested-by: Xiaolong Ye <xiaolong.ye@intel.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>