- May 05, 2017
-
-
Stafford Horne authored
In OpenRISC we do not have a bootloader-passed initrd, but the built-in initramfs does contain /init and other binaries, including modules. The previous commit 08865514 ("initramfs: finish fput() before accessing any binary from initramfs") made a change to only call fput() if the bootloader initrd was available; this caused intermittent crashes for OpenRISC. This patch changes the fput() to happen unconditionally if any rootfs is loaded. Also, I added some comments to make it a bit more clear why we call unpack_to_rootfs() multiple times.

Fixes: 08865514 ("initramfs: finish fput() before accessing any binary from initramfs")
Cc: stable@vger.kernel.org
Cc: Lokesh Vutla <lokeshvutla@ti.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Stafford Horne <shorne@gmail.com>
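A minimal sketch of the resulting populate_rootfs() flow described above, with flush_delayed_fput() moved outside the bootloader-initrd branch (layout simplified and assumption-based; this is not the exact patch):

```c
/* Sketch: flush pending fput() work whenever a rootfs was unpacked,
 * not only when a bootloader-provided initrd exists. */
#include <linux/init.h>
#include <linux/initrd.h>
#include <linux/file.h>
#include <linux/kernel.h>

static int __init populate_rootfs(void)
{
	/* Always load the built-in initramfs first. */
	char *err = unpack_to_rootfs(__initramfs_start, __initramfs_size);

	if (err)
		panic("%s", err);

	if (initrd_start) {
		/* ... unpack or copy the bootloader-provided initrd ... */
	}

	/*
	 * Even without a bootloader initrd, /init and modules may come
	 * from the built-in archive (as on OpenRISC), so finish any
	 * deferred fput() before anything tries to execute them.
	 */
	flush_delayed_fput();
	return 0;
}
rootfs_initcall(populate_rootfs);
```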
-
- Apr 01, 2017
-
-
Michal Hocko authored
Yang Li has reported that drain_all_pages triggers a WARN_ON, which means that this function is called earlier than the mm_percpu_wq is initialized on arm64 with CMA configured:

  WARNING: CPU: 2 PID: 1 at mm/page_alloc.c:2423 drain_all_pages+0x244/0x25c
  Modules linked in:
  CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.11.0-rc1-next-20170310-00027-g64dfbc5 #127
  Hardware name: Freescale Layerscape 2088A RDB Board (DT)
  task: ffffffc07c4a6d00 task.stack: ffffffc07c4a8000
  PC is at drain_all_pages+0x244/0x25c
  LR is at start_isolate_page_range+0x14c/0x1f0
  [...]
  drain_all_pages+0x244/0x25c
  start_isolate_page_range+0x14c/0x1f0
  alloc_contig_range+0xec/0x354
  cma_alloc+0x100/0x1fc
  dma_alloc_from_contiguous+0x3c/0x44
  atomic_pool_init+0x7c/0x208
  arm64_dma_init+0x44/0x4c
  do_one_initcall+0x38/0x128
  kernel_init_freeable+0x1a0/0x240
  kernel_init+0x10/0xfc
  ret_from_fork+0x10/0x20

Fix this by moving the whole setup_vmstat, which is an initcall right now, to init_mm_internals, which will be called right after the WQ subsystem is initialized.

Link: http://lkml.kernel.org/r/20170315164021.28532-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Yang Li <pku.leo@gmail.com>
Tested-by: Yang Li <pku.leo@gmail.com>
Tested-by: Xiaolong Ye <xiaolong.ye@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
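For orientation, a simplified sketch of the boot ordering this relies on (the exact call sites live in init/main.c; placement shown here follows the description above):

```c
/* Sketch of the relevant ordering in kernel_init_freeable() (simplified). */
static noinline void __init kernel_init_freeable(void)
{
	/* ... */
	workqueue_init();	/* workqueues usable: mm_percpu_wq can now be created */
	init_mm_internals();	/* now hosts what the setup_vmstat() initcall used to do */
	/* ... */
	do_basic_setup();	/* initcalls run here, e.g. CMA's atomic_pool_init() */
	/* ... */
}
```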
-
- Mar 02, 2017
-
-
Ingo Molnar authored
But first introduce a trivial header and update usage sites.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Ingo Molnar authored
sched/headers, vfs/execve: Prepare to move the do_execve*() prototypes from <linux/sched.h> to <linux/binfmts.h>

But first update the usage sites with the new header dependency.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Ingo Molnar authored
sched/headers: Prepare to move 'init_task' and 'init_thread_union' from <linux/sched.h> to <linux/sched/task.h>

Update all usage sites first.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Ingo Molnar authored
We are going to split <linux/sched/task_stack.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/task_stack.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
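The "trivial placeholder" that these preparatory patches describe is just a header that redirects to <linux/sched.h> for now. A sketch of what that stage could look like (the include-guard name is an assumption):

```c
/* include/linux/sched/task_stack.h -- placeholder stage of the split
 * (guard name illustrative; the real content arrives in later patches). */
#ifndef _LINUX_SCHED_TASK_STACK_H
#define _LINUX_SCHED_TASK_STACK_H

/* For now everything still lives in the monolithic header. */
#include <linux/sched.h>

#endif /* _LINUX_SCHED_TASK_STACK_H */
```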
-
Ingo Molnar authored
We are going to split <linux/sched/task.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/task.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Ingo Molnar authored
We are going to move softlockup APIs out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. <linux/nmi.h> already includes <linux/sched.h>. Include the <linux/nmi.h> header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- Feb 28, 2017
-
-
Jinbum Park authored
This patch makes arch-independent testcases for RODATA. Both x86 and x86_64 already have testcases for RODATA, but they are arch-specific because they use inline assembly directly. cacheflush.h is also not a suitable location for rodata-test-related things: because the testcases lived in cacheflush.h, changing the state of CONFIG_DEBUG_RODATA_TEST caused needless kernel-build overhead. To solve the above issues, write arch-independent testcases and move them to a shared location.

[jinb.park7@gmail.com: fix config dependency]
Link: http://lkml.kernel.org/r/20170209131625.GA16954@pjb1027-Latitude-E5410
Link: http://lkml.kernel.org/r/20170129105436.GA9303@pjb1027-Latitude-E5410
Signed-off-by: Jinbum Park <jinb.park7@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Valentin Rothberg <valentinrothberg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
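The core of an arch-independent RODATA check can be as small as writing to a const object through a kernel pointer and expecting the write to be rejected. A hedged sketch, assuming probe_kernel_write() for the fault-tolerant write (the real test file also checks section alignment):

```c
/* Minimal arch-independent RODATA write test (sketch, not the full file). */
#include <linux/uaccess.h>
#include <linux/printk.h>

static const int rodata_test_data = 0xC3;	/* lives in .rodata */

void rodata_test_sketch(void)
{
	int zero = 0;

	/* A kernel-side write into .rodata must fault and be rejected. */
	if (!probe_kernel_write((void *)&rodata_test_data, &zero, sizeof(zero))) {
		pr_err("rodata_test: .rodata was writable\n");
		return;
	}
	pr_info("rodata_test: write to .rodata correctly rejected\n");
}
```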
-
Lokesh Vutla authored
Commit 4a9d4b02 ("switch fput to task_work_add") implements a schedule_work() for completing fput(), but did not guarantee calling __fput() after unpacking initramfs. Because of this, there is a possibility that during boot a driver can see ETXTBSY when it tries to load a binary from initramfs as fput() is still pending on that binary. This patch makes sure that fput() is completed after unpacking initramfs and removes the call to flush_delayed_fput() in kernel_init() which happens very late after unpacking initramfs.

Link: http://lkml.kernel.org/r/20170201140540.22051-1-lokeshvutla@ti.com
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Reported-by: Murali Karicheri <m-karicheri2@ti.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tero Kristo <t-kristo@ti.com>
Cc: Sekhar Nori <nsekhar@ti.com>
Cc: Nishanth Menon <nm@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- Feb 23, 2017
-
-
Tejun Heo authored
SLUB creates a per-cache directory under /sys/kernel/slab which hosts a bunch of debug files. Usually, there aren't that many caches on a system and this doesn't really matter; however, if memcg is in use, each cache can have per-cgroup sub-caches. SLUB creates the same directories for these sub-caches under /sys/kernel/slab/$CACHE/cgroup. Unfortunately, because there can be a lot of cgroups, active or draining, the product of the numbers of caches, cgroups and files in each directory can reach a very high number - hundreds of thousands is commonplace. Millions and beyond aren't difficult to reach either. What's under /sys/kernel/slab is primarily for debugging, and the information and control on a root cache already cover its sub-caches. While having a separate directory for each sub-cache can be helpful for development, it doesn't make much sense to pay this amount of overhead by default. This patch introduces a boot parameter slub_memcg_sysfs which determines whether to create sysfs directories for per-memcg sub-caches. It also adds CONFIG_SLUB_MEMCG_SYSFS_ON which determines the boot parameter's default value and defaults to 0.

[akpm@linux-foundation.org: kset_unregister(NULL) is legal]
Link: http://lkml.kernel.org/r/20170204145203.GB26958@mtj.duckdns.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
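How such a Kconfig-defaulted boot switch is typically wired up, as a hedged sketch (variable and helper names are assumptions; the actual parsing sits in mm/slub.c):

```c
/* Sketch: boolean boot parameter with its default taken from Kconfig. */
#include <linux/init.h>
#include <linux/kernel.h>

static bool memcg_sysfs_enabled = IS_ENABLED(CONFIG_SLUB_MEMCG_SYSFS_ON);

static int __init setup_slub_memcg_sysfs(char *str)
{
	int v;

	if (get_option(&str, &v) > 0)
		memcg_sysfs_enabled = v;	/* slub_memcg_sysfs=0/1 on the command line */
	return 1;
}
__setup("slub_memcg_sysfs=", setup_slub_memcg_sysfs);
```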
-
- Feb 14, 2017
-
-
Matthew Wilcox authored
The IDR is very similar to the radix tree. It has some functionality that the radix tree did not have (alloc next free, cyclic allocation, a callback-based for_each, destroy tree), which is readily implementable on top of the radix tree. A few small changes were needed in order to use a tag to represent nodes with free space below them. More extensive changes were needed to support storing NULL as a valid entry in an IDR. Plain radix trees still interpret NULL as a not-present entry. The IDA is reimplemented as a client of the newly enhanced radix tree. As in the current implementation, it uses a bitmap at the last level of the tree.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
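The point of the rewrite is that existing IDR users keep working unchanged. A small illustrative consumer (the structure and helpers are hypothetical; only the idr_* calls are the established API):

```c
/* Illustrative IDR consumer; object type and helper names are made up. */
#include <linux/idr.h>
#include <linux/gfp.h>

struct my_obj {			/* hypothetical example object */
	int id;
};

static DEFINE_IDR(object_idr);

static int register_object(struct my_obj *obj)
{
	/* Allocate the lowest free ID >= 1; the radix-tree "free space" tag
	 * is what keeps this lookup cheap in the new implementation. */
	int id = idr_alloc(&object_idr, obj, 1, 0, GFP_KERNEL);

	if (id < 0)
		return id;
	obj->id = id;
	return 0;
}

static void unregister_object(struct my_obj *obj)
{
	idr_remove(&object_idr, obj->id);
}
```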
-
- Feb 09, 2017
-
-
Paul Gortmaker authored
These files were including module.h for exception table related functions. We've now separated that content out into its own file "extable.h", so now move over to that and, where possible, avoid all the extra header content in module.h that we don't really need to compile these non-modular files.

Note:
  init/main.c still needs module.h for __init_or_module
  kernel/extable.c still needs module.h for is_module_text_address

...and so we don't get the benefit of removing module.h from the cpp feed for these two files, unlike the almost universal 1:1 exchange of module.h for extable.h we were able to do in the arch dirs.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Jessica Yu <jeyu@redhat.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
-
- Feb 08, 2017
-
-
Sergey Senozhatsky authored
A preparation patch for the printk_safe work. No functional change:
 - rename nmi.c to printk_safe.c
 - add a `printk_safe' prefix to some of the exported functions (those used by both printk-safe and printk-nmi).

Link: http://lkml.kernel.org/r/20161227141611.940-3-sergey.senozhatsky@gmail.com
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Calvin Owens <calvinowens@fb.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
-
- Feb 07, 2017
-
-
Laura Abbott authored
Both of these options are poorly named. The features they provide are necessary for system security and should not be considered debug only. Change the names to CONFIG_STRICT_KERNEL_RWX and CONFIG_STRICT_MODULE_RWX to better describe what these options do.

Signed-off-by: Laura Abbott <labbott@redhat.com>
Acked-by: Jessica Yu <jeyu@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
-
- Feb 03, 2017
-
-
Ard Biesheuvel authored
This adds the kbuild infrastructure that will allow architectures to emit vmlinux symbol CRCs as 32-bit offsets to another location in the kernel where the actual value is stored. This works around problems with CRCs being mistaken for relocatable symbols on kernels that self-relocate at runtime (i.e., powerpc with CONFIG_RELOCATABLE=y).

For the kbuild side of things, this comes down to the following:
 - introducing a Kconfig symbol MODULE_REL_CRCS
 - adding a -R switch to genksyms to instruct it to emit the CRC symbols as references into the .rodata section
 - making modpost distinguish such references from absolute CRC symbols by the section index (SHN_ABS)
 - making kallsyms disregard non-absolute symbols with a __crc_ prefix

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
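With relative CRCs, the module loader has to turn the stored offset back into the actual CRC before comparing it. A hedged sketch of that conversion (the helper name is an assumption; the real logic lives in the module loader):

```c
/* Sketch: a __crc_<sym> slot holds a 32-bit offset to the real CRC in
 * .rodata, so dereference relative to the slot's own address. */
#include <linux/types.h>

static u32 resolve_rel_crc_sketch(const s32 *crc)
{
	return *(const u32 *)((const void *)crc + *crc);
}
```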
-
- Feb 01, 2017
-
-
Dave Young authored
Before invoking the arch-specific handler, efi_mem_reserve() reserves the given memory region through memblock. efi_bgrt_init() will call efi_mem_reserve() after mm_init(), at which time memblock is dead and should not be used anymore. The EFI BGRT code depends on ACPI initialization to get the BGRT ACPI table, so move parsing of the BGRT table to the ACPI early boot code to ensure that efi_mem_reserve() in the EFI BGRT code still uses memblock safely.

Tested-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-acpi@vger.kernel.org
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1485868902-20401-9-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- Jan 27, 2017
-
-
Jason A. Donenfeld authored
Now that our crng uses chacha20, we can rely on its speedy characteristics for replacing MD5, while simultaneously achieving a higher security guarantee. Before, the idea was to use these functions if you wanted random integers that aren't stupidly insecure but aren't necessarily secure either, a vague gray zone, that hopefully was "good enough" for its users. With chacha20, we can strengthen this claim, since either we're using an rdrand-like instruction, or we're using the same crng as /dev/urandom. And it's faster than what was before. We could have chosen to replace this with a SipHash-derived function, which might be slightly faster, but at the cost of having yet another RNG construction in the kernel. By moving to chacha20, we have a single RNG to analyze and verify, and we also already get good performance improvements on all platforms. Implementation-wise, rather than use a generic buffer for both get_random_int/long and memcpy based on the size needs, we use a specific buffer for 32-bit reads and for 64-bit reads. This way, we're guaranteed to always have aligned accesses on all platforms. While slightly more verbose in C, the assembly this generates is a lot simpler than otherwise. Finally, on 32-bit platforms where longs and ints are the same size, we simply alias get_random_int to get_random_long.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Suggested-by: Theodore Ts'o <tytso@mit.edu>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
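The per-size batching described above amounts to handing out aligned slices of a ChaCha20 output block and refilling when it runs out. A hedged sketch of the 32-bit path, without locking or reseed handling (the refill helper is an assumption standing in for the CRNG internals of that era):

```c
/* Sketch of a batched 32-bit random-integer path backed by the ChaCha20 CRNG. */
#include <linux/kernel.h>
#include <linux/types.h>

#define CHACHA20_WORDS_SKETCH 16		/* one 64-byte output block */

void crng_fill_block(u8 *out);			/* assumed CRNG refill helper (declaration only) */

struct batched_entropy_u32 {
	u32 entropy[CHACHA20_WORDS_SKETCH];
	unsigned int position;			/* next unused word; 0 means "empty" */
};

static struct batched_entropy_u32 batch_u32;

u32 get_random_u32_sketch(void)
{
	if (batch_u32.position == 0 ||
	    batch_u32.position >= ARRAY_SIZE(batch_u32.entropy)) {
		crng_fill_block((u8 *)batch_u32.entropy);
		batch_u32.position = 0;
	}
	return batch_u32.entropy[batch_u32.position++];	/* always a 4-byte aligned read */
}
```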
-
- Jan 23, 2017
-
-
Paul E. McKenney authored
Now that User Mode Linux supports arch_irqs_disabled_flags(), this commit re-enables TASKS_RCU for User Mode Linux.

Reported-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
-
- Jan 19, 2017
-
-
William Breathitt Gray authored
PC/104 form factor devices serve a specific niche of embedded system users; most Linux users will not have PC/104 form factor devices. This patch introduces the PC104 Kconfig option, which should be used to filter PC/104 specific device drivers and options, so that only those users interested in PC/104 related options are exposed to them.

Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- Jan 17, 2017
-
-
Sebastian Andrzej Siewior authored
RCU_EXPEDITE_BOOT should speed up the boot process by enforcing synchronize_rcu_expedited() instead of synchronize_rcu() during the boot process. There should be no reason why one would not want this, and there is no need to worry about real-time latency at this point. Therefore make it the default. Note that users wishing to avoid expediting entirely, for example when bringing up new hardware possibly having flaky IPIs, can use the rcu_normal boot parameter to override boot-time expediting.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
[ paulmck: Reworded commit log. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
-
- Jan 14, 2017
-
-
Peter Zijlstra authored
Since we need to change the implementation, stop exposing internals. Provide KREF_INIT() to allow static initialization of struct kref.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
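What static initialization with KREF_INIT() looks like in practice, as an illustrative consumer (the structure and release function here are hypothetical):

```c
/* Illustrative use of KREF_INIT() for a statically defined object. */
#include <linux/kref.h>

struct my_device {			/* hypothetical example structure */
	struct kref ref;
	const char *name;
};

static void my_device_release(struct kref *kref)
{
	/* container_of(kref, struct my_device, ref) would be torn down here. */
}

static struct my_device dev0 = {
	.ref  = KREF_INIT(1),		/* one initial reference, no runtime kref_init() needed */
	.name = "dev0",
};

static inline void my_device_put(struct my_device *d)
{
	kref_put(&d->ref, my_device_release);
}
```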
-
Peter Zijlstra authored
Currently we switch to the stable sched_clock if we guess the TSC is usable, and then switch back to the unstable path if it turns out TSC isn't stable during SMP bringup after all. Delay switching to the stable path until after SMP bringup is complete. This way we'll avoid switching during the time we detect the worst of the TSC offences.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- Jan 11, 2017
-
-
Arnd Bergmann authored
We now 'select SOCK_CGROUP_DATA', but Kconfig complains that this is not right when CONFIG_NET is disabled and there is no socket interface:

  warning: (CGROUP_BPF) selects SOCK_CGROUP_DATA which has unmet direct dependencies (NET)

I don't know what the correct solution for this is, but simply removing the dependency on NET from SOCK_CGROUP_DATA by moving it out of the 'if NET' section avoids the warning and does not produce other build errors.

Fixes: 483c4933 ("cgroup: Fix CGROUP_BPF config")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- Jan 10, 2017
-
-
Parav Pandit authored
Added an rdma cgroup controller that does accounting and limit enforcement on rdma/IB resources. Added an rdma cgroup header file which defines its APIs to perform the charging/uncharging functionality. It also defines APIs for the RDMA/IB stack for device registration. Devices which are registered will participate in the controller's functions of accounting and limit enforcement. It defines an rdmacg_device structure to bind the IB stack and the RDMA cgroup controller. RDMA resources are tracked using a resource pool. A resource pool is a per-device, per-cgroup entity which allows setting up accounting limits on a per-device basis. Currently resources are defined by the RDMA cgroup. A resource pool is created/destroyed dynamically whenever charging/uncharging occurs and whenever user configuration is done. It's a trade-off of memory versus a little more code space: resource pool objects are created whenever necessary, instead of creating them during cgroup creation and device registration time.

Signed-off-by: Parav Pandit <pandit.parav@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
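From the IB-stack side, the registration API described above would be used roughly as follows, as a hedged sketch (the device instance, name and probe/remove hooks are hypothetical; only the rdmacg_* registration calls follow the description):

```c
/* Sketch: an RDMA driver registering with the rdma cgroup controller. */
#include <linux/cgroup_rdma.h>

static struct rdmacg_device my_rdmacg_dev;	/* hypothetical device instance */

static int my_rdma_driver_probe(void)
{
	my_rdmacg_dev.name = "example_hca0";	/* hypothetical device name */
	rdmacg_register_device(&my_rdmacg_dev);	/* participate in accounting/limits */
	return 0;
}

static void my_rdma_driver_remove(void)
{
	rdmacg_unregister_device(&my_rdmacg_dev);
}
```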
-
- Dec 25, 2016
-
-
Nicholas Piggin authored
Add a new page flag, PageWaiters, to indicate the page waitqueue has tasks waiting. This can be tested rather than testing waitqueue_active, which requires another cacheline load. This bit is always set when the page has tasks on page_waitqueue(page), and is set and cleared under the waitqueue lock. It may be set when there are no tasks on the waitqueue, which will cause a harmless extra wakeup check that will clear the bit. The generic bit-waitqueue infrastructure is no longer used for pages. Instead, waitqueues are used directly with a custom key type. The generic code was not flexible enough to have PageWaiters manipulation under the waitqueue lock (which simplifies concurrency). This improves the performance of page-lock-intensive microbenchmarks by 2-3%. Putting two bits in the same word opens the opportunity to remove the memory barrier between clearing the lock bit and testing the waiters bit, after some work on the arch primitives (e.g., ensuring memory operand widths match and cover both bits).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Andrew Lutomirski <luto@kernel.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
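The fast path this enables can be pictured as follows, a hedged sketch of an unlock that only walks the waitqueue when PG_waiters says someone is waiting (the wakeup helper name is illustrative, not the exact mm/filemap.c one):

```c
/* Sketch: skip the waitqueue entirely unless the new waiters bit is set. */
#include <linux/page-flags.h>
#include <linux/mm_types.h>
#include <linux/bitops.h>

void wake_up_page_sketch(struct page *page, int bit);	/* illustrative wakeup helper */

static inline void unlock_page_sketch(struct page *page)
{
	clear_bit_unlock(PG_locked, &page->flags);
	smp_mb__after_atomic();			/* order the clear against the waiters test */
	if (PageWaiters(page))			/* avoids touching the waitqueue cacheline */
		wake_up_page_sketch(page, PG_locked);
}
```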
-
- Dec 24, 2016
-
-
Linus Torvalds authored
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
      $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- Dec 18, 2016
-
-
Andy Lutomirski authored
CGROUP_BPF depended on SOCK_CGROUP_DATA, which can't be manually enabled, making it rather challenging to turn CGROUP_BPF on.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller <davem@davemloft.net>
-
- Dec 13, 2016
-
-
Jungseung Lee authored
For several devices, the rootwait time is sensitive because it directly affects booting time. The polling interval of rootwait is currently 100 ms. To save unnecessary waiting time, reduce the polling interval to 5 ms.

[akpm@linux-foundation.org: remove used-once #define]
Link: http://lkml.kernel.org/r/20161207060743.1728-1-js07.lee@samsung.com
Signed-off-by: Jungseung Lee <js07.lee@samsung.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- Dec 09, 2016
-
-
Thomas Gleixner authored
AMD CPUs affected by the E400 erratum suffer from the issue that the local APIC timer stops when the CPU goes into C1E. Unfortunately there is no way to detect the affected CPUs on early boot. It's only possible to determine the range of possibly affected CPUs from the family/model range. The actual decision whether to enter C1E and thus cause the bug is done by the firmware, and we need to detect that case late, after ACPI has been initialized. The current solution is to check in the idle routine whether the CPU is affected by reading the MSR_K8_INT_PENDING_MSG MSR and checking for the K8_INTP_C1E_ACTIVE_MASK bits. If one of the bits is set then the CPU is affected and the system is switched into forced broadcast mode. This is ineffective, and on non-affected CPUs every entry to idle does the extra RDMSR. After doing some research it turns out that the bits are visible on the boot CPU right after the ACPI subsystem is initialized in the early boot process. So instead of polling for the bits in the idle loop, add a detection function after acpi_subsystem_init() and check for the MSR bits. If set, then X86_BUG_AMD_APIC_C1E is set on the boot CPU and the TSC is marked unstable when X86_FEATURE_NONSTOP_TSC is not set, as it will stop in C1E state as well. The switch to broadcast mode cannot be done at this point because the boot CPU still uses HPET as a clockevent device and the local APIC timer is not yet calibrated and installed. The switch to broadcast mode on the affected CPUs needs to be done when the local APIC timer is actually set up. This allows cleaning up the amd_e400_idle() function in the next step.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20161209182912.2726-4-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
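The late detection step described above boils down to one MSR read after ACPI init. A hedged sketch (the function name and placement are assumptions; MSR, mask and flag names are taken from the message):

```c
/* Sketch: detect E400/C1E on the boot CPU once ACPI has initialized. */
#include <asm/msr.h>
#include <asm/processor.h>
#include <asm/tsc.h>

static void __init amd_e400_c1e_detect_sketch(void)
{
	u32 lo, hi;

	rdmsr(MSR_K8_INT_PENDING_MSG, lo, hi);
	if (lo & K8_INTP_C1E_ACTIVE_MASK) {
		/* Firmware will use C1E: flag the bug on the boot CPU. */
		set_cpu_bug(&boot_cpu_data, X86_BUG_AMD_APIC_C1E);
		if (!boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
			mark_tsc_unstable("TSC halts in AMD C1E");
	}
}
```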
-
- Nov 30, 2016
-
-
Linus Torvalds authored
This enables CONFIG_MODVERSIONS again, but allows for missing symbol CRC information in order to work around the issue that newer binutils versions seem to occasionally drop the CRC on the floor. binutils 2.26 seems to work fine, while binutils 2.27 seems to break MODVERSIONS of symbols that have been defined in assembler files.

[ We've had random missing CRCs before - it may be an old problem that just is now reliably triggered with the weak asm symbols and a new version of binutils ]

Some day I really do want to remove MODVERSIONS entirely. Sadly, today does not appear to be that day: Debian people apparently do want the option to enable MODVERSIONS to make it easier to have external modules across kernel versions, and this seems to be a fairly minimal fix for the annoying problem.

Cc: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- Nov 28, 2016
-
-
Arnd Bergmann authored
The newly added 'rodata_enabled' global variable is protected by the wrong #ifdef, leading to a link error when CONFIG_DEBUG_SET_MODULE_RONX is turned on:

  kernel/module.o: In function `disable_ro_nx':
  module.c:(.text.unlikely.disable_ro_nx+0x88): undefined reference to `rodata_enabled'
  kernel/module.o: In function `module_disable_ro':
  module.c:(.text.module_disable_ro+0x8c): undefined reference to `rodata_enabled'
  kernel/module.o: In function `module_enable_ro':
  module.c:(.text.module_enable_ro+0xb0): undefined reference to `rodata_enabled'

CONFIG_SET_MODULE_RONX does not exist, so use the correct one instead.

Fixes: 39290b38 ("module: extend 'rodata=off' boot cmdline parameter to module mappings")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jessica Yu <jeyu@redhat.com>
-
AKASHI Takahiro authored
The current "rodata=off" parameter disables read-only kernel mappings under CONFIG_DEBUG_RODATA: commit d2aa1aca ("mm/init: Add 'rodata=off' boot cmdline parameter to disable read-only kernel mappings") This patch is a logical extension to module mappings ie. read-only mappings at module loading can be disabled even if CONFIG_DEBUG_SET_MODULE_RONX (mainly for debug use). Please note, however, that it only affects RO/RW permissions, keeping NX set. This is the first step to make CONFIG_DEBUG_SET_MODULE_RONX mandatory (always-on) in the future as CONFIG_DEBUG_RODATA on x86 and arm64. Suggested-by:
and Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by:
AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by:
Kees Cook <keescook@chromium.org> Acked-by:
Rusty Russell <rusty@rustcorp.com.au> Link: http://lkml.kernel.org/r/20161114061505.15238-1-takahiro.akashi@linaro.org Signed-off-by:
Jessica Yu <jeyu@redhat.com>
-
- Nov 25, 2016
-
-
Linus Torvalds authored
CONFIG_MODVERSIONS has been broken for pretty much the whole 4.9 series, and quite frankly, nobody has cared very deeply. We absolutely know how to fix it, and it's not _complicated_, but it's not exactly pretty either. This oneliner fixes it without the ugliness, and allows for further future cleanups. "We've secretly replaced their regular MODVERSIONS with nothing at all, let's see if they notice"

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Daniel Mack authored
This patch adds two sets of eBPF program pointers to struct cgroup. One for such that are directly pinned to a cgroup, and one for such that are effective for it. To illustrate the logic behind that, assume the following example cgroup hierarchy:

  A - B - C
        \ D - E

If only B has a program attached, it will be effective for B, C, D and E. If D then attaches a program itself, that will be effective for both D and E, and the program in B will only affect B and C. Only one program of a given type is effective for a cgroup. Attaching and detaching programs will be done through the bpf(2) syscall. For now, ingress and egress inet socket filtering are the only supported use-cases.

Signed-off-by: Daniel Mack <daniel@zonque.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- Nov 24, 2016
-
-
Nicolas Schichan authored
Otherwise each individual rotator char would be printed in a new line:

  (...)
  [ 0.642350] -
  [ 0.644374] |
  [ 0.646367] -
  (...)

Signed-off-by: Nicolas Schichan <nicolas.schichan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- Nov 16, 2016
-
-
Nicolas Pitre authored
Some embedded systems have no use for them. This removes about 25KB from the kernel binary size when configured out. Corresponding syscalls are routed to a stub logging the attempt to use those syscalls, which should be enough of a clue if they were disabled without proper consideration. They are: timer_create, timer_gettime, timer_getoverrun, timer_settime, timer_delete, clock_adjtime, setitimer, getitimer, alarm. The clock_settime, clock_gettime, clock_getres and clock_nanosleep syscalls are replaced by simple wrappers compatible with CLOCK_REALTIME, CLOCK_MONOTONIC and CLOCK_BOOTTIME only, which should cover the vast majority of use cases with very little code.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <john.stultz@linaro.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Cc: Paul Bolle <pebolle@tiscali.nl>
Cc: linux-kbuild@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: Michal Marek <mmarek@suse.com>
Cc: Edward Cree <ecree@solarflare.com>
Link: http://lkml.kernel.org/r/1478841010-28605-7-git-send-email-nicolas.pitre@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
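The "stub logging the attempt" shape, sketched for one of the listed syscalls (illustrative only; the real stubs share a common helper):

```c
/* Sketch: a disabled POSIX-timer syscall that logs and fails cleanly. */
#include <linux/syscalls.h>
#include <linux/sched.h>
#include <linux/printk.h>
#include <linux/errno.h>

SYSCALL_DEFINE1(timer_getoverrun, timer_t, timer_id)
{
	pr_err_once("process %d (%s) attempted a POSIX timer syscall "
		    "while CONFIG_POSIX_TIMERS is disabled\n",
		    task_pid_nr(current), current->comm);
	return -ENOSYS;
}
```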
-
- Oct 24, 2016
-
-
Mauro Carvalho Chehab authored
The previous patch renamed several files that are cross-referenced along the Kernel documentation. Adjust the links to point to the right places.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
-
- Oct 11, 2016
-
-
Peter Zijlstra authored
Relay avoids calling wake_up_interruptible() for doing the wakeup of readers/consumers, waiting for the generation of new data, from the context of a process which produced the data. This is apparently done to prevent the possibility of a deadlock in case the scheduler itself is generating data for the relay, after acquiring rq->lock. The following patch used a timer (to be scheduled at the next jiffy) for delegating the wakeup to another context:

  commit 7c9cb383
  Author: Tom Zanussi <zanussi@comcast.net>
  Date:   Wed May 9 02:34:01 2007 -0700

      relay: use plain timer instead of delayed work

      relay doesn't need to use schedule_delayed_work() for waking readers
      when a simple timer will do.

Scheduling a plain timer, at the next jiffies boundary, to do the wakeup causes a significant wakeup latency for the userspace client, which makes relay less suitable for the high-frequency low-payload use cases where the data gets generated at a very high rate, like multiple sub-buffers getting filled within a millisecond. Moreover the timer is re-scheduled on every newly produced sub-buffer, so the timer keeps getting pushed out if sub-buffers are filled in very quick succession (less than a jiffy gap between the filling of two sub-buffers). As a result relay runs out of sub-buffers to store the new data. By using irq_work it is ensured that the wakeup of the userspace client, blocked in the poll call, is done at the earliest (through a self IPI or the next timer tick), enabling it to always consume the data in time. Also this makes relay consistent with printk and the ring buffers (trace), as they too use irq_work for deferred wakeup of readers.

[arnd@arndb.de: select CONFIG_IRQ_WORK]
Link: http://lkml.kernel.org/r/20160912154035.3222156-1-arnd@arndb.de
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/1472906487-1559-1-git-send-email-akash.goel@intel.com
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
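The deferred wakeup pattern described above (queue an irq_work from the producer path, wake the reader from the irq_work callback) can be sketched like this; the irq_work field placement in struct rchan_buf follows the description and should be treated as an assumption:

```c
/* Sketch: waking relay readers from irq_work instead of a plain timer. */
#include <linux/irq_work.h>
#include <linux/relay.h>
#include <linux/wait.h>
#include <linux/kernel.h>

/* Assumed: struct rchan_buf gains an irq_work member named 'wakeup_work'. */
static void relay_wakeup_readers(struct irq_work *work)
{
	struct rchan_buf *buf = container_of(work, struct rchan_buf, wakeup_work);

	wake_up_interruptible(&buf->read_wait);	/* runs outside the producer's context */
}

static void relay_on_subbuf_full(struct rchan_buf *buf)
{
	/* Producer side (possibly under rq->lock): just queue the work. */
	irq_work_queue(&buf->wakeup_work);
}
```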
-
- Oct 10, 2016
-
-
Emese Revfy authored
This adds a new gcc plugin named "latent_entropy". It is designed to extract as much uncertainty from a running system at boot time as possible, hoping to capitalize on any possible variation in CPU operation (due to runtime data differences, hardware differences, SMP ordering, thermal timing variation, cache behavior, etc). At the very least, this plugin is a much more comprehensive example of how to manipulate kernel code using the gcc plugin internals. The need for very-early boot entropy tends to be very architecture or system-design specific, so this plugin is more suited for those sorts of special cases. The existing kernel RNG already attempts to extract entropy from reliable runtime variation, but this plugin takes the idea to a logical extreme by permuting a global variable based on any variation in code execution (e.g. a different value (and permutation function) is used to permute the global based on loop count, case statement, if/then/else branching, etc). To do this, the plugin starts by inserting a local variable in every marked function. The plugin then adds logic so that the value of this variable is modified by randomly chosen operations (add, xor and rol) and random values (gcc generates separate static values for each location at compile time and also injects the stack pointer at runtime). The resulting value depends on the control flow path (e.g., loops and branches taken). Before the function returns, the plugin mixes this local variable into the latent_entropy global variable. The value of this global variable is added to the kernel entropy pool in do_one_initcall() and _do_fork(), though it does not credit any bytes of entropy to the pool; the contents of the global are just used to mix the pool. Additionally, the plugin can pre-initialize arrays with build-time random contents, so that two different kernel builds running on identical hardware will not have the same starting values.

Signed-off-by: Emese Revfy <re.emese@gmail.com>
[kees: expanded commit message and code comments]
Signed-off-by: Kees Cook <keescook@chromium.org>
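To make the transformation concrete, here is a hand-written illustration of roughly what the plugin injects into a marked function; the constants, operations and the declaration site of the latent_entropy global are all assumptions, not actual plugin output:

```c
/* Hand-written illustration of the latent_entropy instrumentation. */
#include <linux/bitops.h>
#include <linux/types.h>

extern u64 latent_entropy;	/* assumption: global mixed by instrumented functions */

void __latent_entropy example_marked_function(int n)
{
	/* Injected local, seeded from the global plus a build-time constant. */
	u64 local_entropy = latent_entropy ^ 0x9e3779b97f4a7c15ULL;
	int i;

	for (i = 0; i < n; i++)			/* every branch/loop perturbs the value */
		local_entropy = rol64(local_entropy, 7) + 0x0123456789abcdefULL;

	latent_entropy ^= local_entropy;	/* mixed back into the global before return */
}
```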
-