Skip to content
Snippets Groups Projects
  1. Mar 03, 2020
    • Quentin Perret's avatar
      kbuild: allow symbol whitelisting with TRIM_UNUSED_KSYMS · 1518c633
      Quentin Perret authored
      
      CONFIG_TRIM_UNUSED_KSYMS currently removes all unused exported symbols
      from ksymtab. This works really well when using in-tree drivers, but
      cannot be used in its current form if some of them are out-of-tree.
      
      Indeed, even if the list of symbols required by out-of-tree drivers is
      known at compile time, the only solution today to guarantee these don't
      get trimmed is to set CONFIG_TRIM_UNUSED_KSYMS=n. This not only wastes
      space, but also makes it difficult to control the ABI usable by vendor
      modules in distribution kernels such as Android. Being able to control
      the kernel ABI surface is particularly useful to ship a unique Generic
      Kernel Image (GKI) for all vendors, which is a first step in the
      direction of getting all vendors to contribute their code upstream.
      
      As such, attempt to improve the situation by enabling users to specify a
      symbol 'whitelist' at compile time. Any symbol specified in this
      whitelist will be kept exported when CONFIG_TRIM_UNUSED_KSYMS is set,
      even if it has no in-tree user. The whitelist is defined as a simple
      text file, listing symbols, one per line.
      
      Acked-by: default avatarJessica Yu <jeyu@kernel.org>
      Acked-by: default avatarNicolas Pitre <nico@fluxnic.net>
      Tested-by: default avatarMatthias Maennich <maennich@google.com>
      Reviewed-by: default avatarMatthias Maennich <maennich@google.com>
      Signed-off-by: default avatarQuentin Perret <qperret@google.com>
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      1518c633
    • Masahiro Yamada's avatar
      kbuild: use KBUILD_DEFCONFIG as the fallback for DEFCONFIG_LIST · 2a86f661
      Masahiro Yamada authored
      
      Most of the Kconfig commands (except defconfig and all*config) read
      the .config file as a base set of CONFIG options.
      
      When it does not exist, the files in DEFCONFIG_LIST are searched in
      this order and loaded if found.
      
      I do not see much sense in the last two lines in DEFCONFIG_LIST.
      
      [1] ARCH_DEFCONFIG
      
      The entry for DEFCONFIG_LIST is guarded by 'depends on !UML'. So, the
      ARCH_DEFCONFIG definition in arch/x86/um/Kconfig is meaningless.
      
      arch/{sh,sparc,x86}/Kconfig define ARCH_DEFCONFIG depending on 32 or
      64 bit variant symbols. This is a little bit strange; ARCH_DEFCONFIG
      should be a fixed string because the base config file is loaded before
      the symbol evaluation stage.
      
      Using KBUILD_DEFCONFIG makes more sense because it is fixed before
      Kconfig is invoked. Fortunately, arch/{sh,sparc,x86}/Makefile define it
      in the same way, and it works as expected. Hence, replace ARCH_DEFCONFIG
      with "arch/$(SRCARCH)/configs/$(KBUILD_DEFCONFIG)".
      
      [2] arch/$(ARCH)/defconfig
      
      This file path is no longer valid. The defconfig files are always located
      in the arch configs/ directories.
      
        $ find arch -name defconfig | sort
        arch/alpha/configs/defconfig
        arch/arm64/configs/defconfig
        arch/csky/configs/defconfig
        arch/nds32/configs/defconfig
        arch/riscv/configs/defconfig
        arch/s390/configs/defconfig
        arch/unicore32/configs/defconfig
      
      The path arch/*/configs/defconfig is already covered by
      "arch/$(SRCARCH)/configs/$(KBUILD_DEFCONFIG)". So, this file path is
      not necessary.
      
      I moved the default KBUILD_DEFCONFIG to the top Makefile. Otherwise,
      the 7 architectures listed above would end up with endless loop of
      syncconfig.
      
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      2a86f661
  2. Feb 26, 2020
  3. Feb 20, 2020
  4. Feb 10, 2020
  5. Feb 05, 2020
  6. Jan 31, 2020
  7. Jan 21, 2020
  8. Jan 19, 2020
    • Johannes Berg's avatar
      Revert "um: Enable CONFIG_CONSTRUCTORS" · 87c9366e
      Johannes Berg authored
      
      This reverts commit 786b2384 ("um: Enable CONFIG_CONSTRUCTORS").
      
      There are two issues with this commit, uncovered by Anton in tests
      on some (Debian) systems:
      
      1) I completely forgot to call any constructors if CONFIG_CONSTRUCTORS
         isn't set. Don't recall now if it just wasn't needed on my system, or
         if I never tested this case.
      
      2) With that fixed, it works - with CONFIG_CONSTRUCTORS *unset*. If I
         set CONFIG_CONSTRUCTORS, it fails again, which isn't totally
         unexpected since whatever wanted to run is likely to have to run
         before the kernel init etc. that calls the constructors in this case.
      
      Basically, some constructors that gcc emits (libc has?) need to run
      very early during init; the failure mode otherwise was that the ptrace
      fork test already failed:
      
      ----------------------
      $ ./linux mem=512M
      Core dump limits :
      	soft - 0
      	hard - NONE
      Checking that ptrace can change system call numbers...check_ptrace : child exited with exitcode 6, while expecting 0; status 0x67f
      Aborted
      ----------------------
      
      Thinking more about this, it's clear that we simply cannot support
      CONFIG_CONSTRUCTORS in UML. All the cases we need now (gcov, kasan)
      involve not use of the __attribute__((constructor)), but instead
      some constructor code/entry generated by gcc. Therefore, we cannot
      distinguish between kernel constructors and system constructors.
      
      Thus, revert this commit.
      
      Cc: stable@vger.kernel.org [5.4+]
      Fixes: 786b2384 ("um: Enable CONFIG_CONSTRUCTORS")
      Reported-by: default avatarAnton Ivanov <anton.ivanov@cambridgegreys.com>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Acked-by: default avatarAnton Ivanov <anton.ivanov@cambridgegreys.co.uk>
      
      Signed-off-by: default avatarRichard Weinberger <richard@nod.at>
      87c9366e
  9. Jan 14, 2020
    • Thomas Gleixner's avatar
      lib/vdso: Prepare for time namespace support · 660fd04f
      Thomas Gleixner authored
      
      To support time namespaces in the vdso with a minimal impact on regular non
      time namespace affected tasks, the namespace handling needs to be hidden in
      a slow path.
      
      The most obvious place is vdso_seq_begin(). If a task belongs to a time
      namespace then the VVAR page which contains the system wide vdso data is
      replaced with a namespace specific page which has the same layout as the
      VVAR page. That page has vdso_data->seq set to 1 to enforce the slow path
      and vdso_data->clock_mode set to VCLOCK_TIMENS to enforce the time
      namespace handling path.
      
      The extra check in the case that vdso_data->seq is odd, e.g. a concurrent
      update of the vdso data is in progress, is not really affecting regular
      tasks which are not part of a time namespace as the task is spin waiting
      for the update to finish and vdso_data->seq to become even again.
      
      If a time namespace task hits that code path, it invokes the corresponding
      time getter function which retrieves the real VVAR page, reads host time
      and then adds the offset for the requested clock which is stored in the
      special VVAR page.
      
      If VDSO time namespace support is disabled the whole magic is compiled out.
      
      Initial testing shows that the disabled case is almost identical to the
      host case which does not take the slow timens path. With the special timens
      page installed the performance hit is constant time and in the range of
      5-7%.
      
      For the vdso functions which are not using the sequence count an
      unconditional check for vdso_data->clock_mode is added which switches to
      the real vdso when the clock_mode is VCLOCK_TIMENS.
      
      [avagin: Make do_hres_timens() work with raw clocks too: choose vdso_data
       pointer by CS_RAW offset.]
      
      Suggested-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarAndrei Vagin <avagin@gmail.com>
      Signed-off-by: default avatarDmitry Safonov <dima@arista.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Link: https://lore.kernel.org/r/20191112012724.250792-21-dima@arista.com
      
      660fd04f
    • Andrei Vagin's avatar
      ns: Introduce Time Namespace · 769071ac
      Andrei Vagin authored
      
      Time Namespace isolates clock values.
      
      The kernel provides access to several clocks CLOCK_REALTIME,
      CLOCK_MONOTONIC, CLOCK_BOOTTIME, etc.
      
      CLOCK_REALTIME
            System-wide clock that measures real (i.e., wall-clock) time.
      
      CLOCK_MONOTONIC
            Clock that cannot be set and represents monotonic time since
            some unspecified starting point.
      
      CLOCK_BOOTTIME
            Identical to CLOCK_MONOTONIC, except it also includes any time
            that the system is suspended.
      
      For many users, the time namespace means the ability to changes date and
      time in a container (CLOCK_REALTIME). Providing per namespace notions of
      CLOCK_REALTIME would be complex with a massive overhead, but has a dubious
      value.
      
      But in the context of checkpoint/restore functionality, monotonic and
      boottime clocks become interesting. Both clocks are monotonic with
      unspecified starting points. These clocks are widely used to measure time
      slices and set timers. After restoring or migrating processes, it has to be
      guaranteed that they never go backward. In an ideal case, the behavior of
      these clocks should be the same as for a case when a whole system is
      suspended. All this means that it is required to set CLOCK_MONOTONIC and
      CLOCK_BOOTTIME clocks, which can be achieved by adding per-namespace
      offsets for clocks.
      
      A time namespace is similar to a pid namespace in the way how it is
      created: unshare(CLONE_NEWTIME) system call creates a new time namespace,
      but doesn't set it to the current process. Then all children of the process
      will be born in the new time namespace, or a process can use the setns()
      system call to join a namespace.
      
      This scheme allows setting clock offsets for a namespace, before any
      processes appear in it.
      
      All available clone flags have been used, so CLONE_NEWTIME uses the highest
      bit of CSIGNAL. It means that it can be used only with the unshare() and
      the clone3() system calls.
      
      [ tglx: Adjusted paragraph about clone3() to reality and massaged the
        	changelog a bit. ]
      
      Co-developed-by: default avatarDmitry Safonov <dima@arista.com>
      Signed-off-by: default avatarAndrei Vagin <avagin@gmail.com>
      Signed-off-by: default avatarDmitry Safonov <dima@arista.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Link: https://criu.org/Time_namespace
      Link: https://lists.openvz.org/pipermail/criu/2018-June/041504.html
      Link: https://lore.kernel.org/r/20191112012724.250792-4-dima@arista.com
      
      769071ac
    • Vlastimil Babka's avatar
      mm, debug_pagealloc: don't rely on static keys too early · 8e57f8ac
      Vlastimil Babka authored
      Commit 96a2b03f ("mm, debug_pagelloc: use static keys to enable
      debugging") has introduced a static key to reduce overhead when
      debug_pagealloc is compiled in but not enabled.  It relied on the
      assumption that jump_label_init() is called before parse_early_param()
      as in start_kernel(), so when the "debug_pagealloc=on" option is parsed,
      it is safe to enable the static key.
      
      However, it turns out multiple architectures call parse_early_param()
      earlier from their setup_arch().  x86 also calls jump_label_init() even
      earlier, so no issue was found while testing the commit, but same is not
      true for e.g.  ppc64 and s390 where the kernel would not boot with
      debug_pagealloc=on as found by our QA.
      
      To fix this without tricky changes to init code of multiple
      architectures, this patch partially reverts the static key conversion
      from 96a2b03f.  Init-time and non-fastpath calls (such as in arch
      code) of debug_pagealloc_enabled() will again test a simple bool
      variable.  Fastpath mm code is converted to a new
      debug_pagealloc_enabled_static() variant that relies on the static key,
      which is enabled in a well-defined point in mm_init() where it's
      guaranteed that jump_label_init() has been called, regardless of
      architecture.
      
      [sfr@canb.auug.org.au: export _debug_pagealloc_enabled_early]
        Link: http://lkml.kernel.org/r/20200106164944.063ac07b@canb.auug.org.au
      Link: http://lkml.kernel.org/r/20191219130612.23171-1-vbabka@suse.cz
      
      
      Fixes: 96a2b03f ("mm, debug_pagelloc: use static keys to enable debugging")
      Signed-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarStephen Rothwell <sfr@canb.auug.org.au>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Qian Cai <cai@lca.pw>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8e57f8ac
  10. Jan 13, 2020
  11. Jan 03, 2020
    • Dominik Brodowski's avatar
      Revert "fs: remove ksys_dup()" · 74f1a299
      Dominik Brodowski authored
      
      This reverts commit 8243186f ("fs: remove ksys_dup()") and the
      subsequent fix for it in commit 2d3145f8 ("early init: fix error
      handling when opening /dev/console").
      
      Trying to use filp_open() and f_dupfd() instead of pseudo-syscalls
      caused more trouble than what is worth it: it requires accessing vfs
      internals and it turns out there were other bugs in it too.
      
      In particular, the file reference counting was wrong - because unlike
      the original "open+2*dup" sequence it used "filp_open+3*f_dupfd" and
      thus had an extra leaked file reference.
      
      That in turn then caused odd problems with Androidx86 long after boot
      becaue of how the extra reference to the console kept the session active
      even after all file descriptors had been closed.
      
      Reported-by: default avataryouling 257 <youling257@gmail.com>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarDominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      74f1a299
  12. Dec 17, 2019
  13. Dec 16, 2019
  14. Dec 13, 2019
  15. Dec 12, 2019
  16. Dec 05, 2019
  17. Nov 26, 2019
    • Eric W. Biederman's avatar
      sysctl: Remove the sysctl system call · 61a47c1a
      Eric W. Biederman authored
      
      This system call has been deprecated almost since it was introduced, and
      in a survey of the linux distributions I can no longer find any of them
      that enable CONFIG_SYSCTL_SYSCALL.  The only indication that I can find
      that anyone might care is that a few of the defconfigs in the kernel
      enable CONFIG_SYSCTL_SYSCALL.  However this appears in only 31 of 414
      defconfigs in the kernel, so I suspect this symbols presence is simply
      because it is harmless to include rather than because it is necessary.
      
      As there appear to be no users of the sysctl system call, remove the
      code.  As this removes one of the few uses of the internal kernel mount
      of proc I hope this allows for even more simplifications of the proc
      filesystem.
      
      Cc: Alex Smith <alex.smith@imgtec.com>
      Cc: Anders Berg <anders.berg@lsi.com>
      Cc: Apelete Seketeli <apelete@seketeli.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Chee Nouk Phoon <cnphoon@altera.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Christian Ruppert <christian.ruppert@abilis.com>
      Cc: Greg Ungerer <gerg@uclinux.org>
      Cc: Harvey Hunt <harvey.hunt@imgtec.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Hongliang Tao <taohl@lemote.com>
      Cc: Hua Yan <yanh@lemote.com>
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: John Crispin <blogic@openwrt.org>
      Cc: Jonas Jensen <jonas.jensen@gmail.com>
      Cc: Josh Boyer <jwboyer@gmail.com>
      Cc: Jun Nie <jun.nie@linaro.org>
      Cc: Kevin Hilman <khilman@linaro.org>
      Cc: Kevin Wells <kevin.wells@nxp.com>
      Cc: Kumar Gala <galak@codeaurora.org>
      Cc: Lars-Peter Clausen <lars@metafoo.de>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: Markos Chandras <markos.chandras@imgtec.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Noam Camus <noamc@ezchip.com>
      Cc: Olof Johansson <olof@lixom.net>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Phil Edworthy <phil.edworthy@renesas.com>
      Cc: Pierrick Hascoet <pierrick.hascoet@abilis.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Roland Stigge <stigge@antcom.de>
      Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Cc: Scott Telford <stelford@cadence.com>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: Steven J. Hill <Steven.Hill@imgtec.com>
      Cc: Tanmay Inamdar <tinamdar@apm.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Wolfram Sang <w.sang@pengutronix.de>
      Acked-by: default avatarAndi Kleen <ak@linux.intel.com>
      Reviewed-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      61a47c1a
  18. Nov 17, 2019
    • Ard Biesheuvel's avatar
      int128: move __uint128_t compiler test to Kconfig · c12d3362
      Ard Biesheuvel authored
      
      In order to use 128-bit integer arithmetic in C code, the architecture
      needs to have declared support for it by setting ARCH_SUPPORTS_INT128,
      and it requires a version of the toolchain that supports this at build
      time. This is why all existing tests for ARCH_SUPPORTS_INT128 also test
      whether __SIZEOF_INT128__ is defined, since this is only the case for
      compilers that can support 128-bit integers.
      
      Let's fold this additional test into the Kconfig declaration of
      ARCH_SUPPORTS_INT128 so that we can also use the symbol in Makefiles,
      e.g., to decide whether a certain object needs to be included in the
      first place.
      
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: default avatarArd Biesheuvel <ardb@kernel.org>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      c12d3362
Loading