  1. May 15, 2021
    • openrisc: Define memory barrier mb · 8b549c18
      Peter Zijlstra authored
      
      This came up in the discussion of the requirements qspinlock puts on
      an architecture.  OpenRISC uses qspinlock, but it was noticed that the
      memory barrier was not defined.
      
      Peter defined it in the mail thread, writing:
      
          As near as I can tell this should do. The arch spec only lists
          this one instruction and the text makes it sound like a completion
          barrier.
      
      This is correct, so apply this patch.
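
      Peter notes above that the arch spec lists only one instruction for
      this purpose. Assuming that instruction is l.msync, the resulting
      definition amounts to roughly this one-liner in the OpenRISC barrier
      header (a sketch, not a quote of the applied patch):

        #define mb() asm volatile ("l.msync" ::: "memory")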
      
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      [shorne@gmail.com: Turned the mail into a patch]
      Signed-off-by: Stafford Horne <shorne@gmail.com>
  2. May 09, 2021
  3. Apr 28, 2021
  4. Apr 23, 2021
  5. Apr 22, 2021
  6. Apr 21, 2021
    • perf/x86/intel/uncore: Remove uncore extra PCI dev HSWEP_PCI_PCU_3 · 9d480158
      Kan Liang authored
      
      There may be a kernel panic on Haswell and Broadwell servers if
      snbep_pci2phy_map_init() returns an error.
      
      uncore_extra_pci_dev[HSWEP_PCI_PCU_3] is used in cpu_init() to detect
      the existence of the SBOX, which is an MSR type of PMON unit.
      uncore_extra_pci_dev is allocated in uncore_pci_init(). If
      snbep_pci2phy_map_init() returns an error, perf doesn't initialize the
      PCI type of PMON units, so uncore_extra_pci_dev will not be allocated.
      But perf may continue initializing the MSR type of PMON units, and a
      NULL pointer dereference panic is then triggered.
      
      The sockets in a Haswell or Broadwell server are identical, so the
      existence of the SBOX only needs to be detected once.
      Currently perf probes all available PCU devices and stores them in
      uncore_extra_pci_dev, which is unnecessary.
      Use pci_get_device() instead of uncore_extra_pci_dev and detect the
      existence of the SBOX only once, on the first available PCU device.
      
      Factor out hswep_has_limit_sbox(), since the Haswell and Broadwell
      servers use the same way to detect the existence of the SBOX.
      
      Add some macros to replace the magic number.
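
      A sketch of what such a helper could look like, built around
      pci_get_device() as described above. The config-space offset
      HSWEP_PCU_CAPID4_OFFSET and the HSWEP_GET_CHOP() decode are placeholder
      names standing in for the real SBOX-detection details, and
      <linux/pci.h> is assumed:

        /* Sketch: check on the first matching PCU device whether this part
         * has the limited-SBOX configuration. Macro names are placeholders. */
        static bool hswep_has_limit_sbox(unsigned int device)
        {
                struct pci_dev *dev = pci_get_device(PCI_VENDOR_ID_INTEL,
                                                     device, NULL);
                u32 capid4;
                bool limited;

                if (!dev)
                        return false;

                pci_read_config_dword(dev, HSWEP_PCU_CAPID4_OFFSET, &capid4);
                limited = !HSWEP_GET_CHOP(capid4);  /* placeholder decode */
                pci_dev_put(dev);   /* drop the ref taken by pci_get_device() */

                return limited;
        }

      cpu_init() could then call hswep_has_limit_sbox() once with the PCU
      device ID instead of dereferencing uncore_extra_pci_dev[HSWEP_PCI_PCU_3].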
      
      Fixes: 5306c31c ("perf/x86/uncore/hsw-ep: Handle systems with only two SBOXes")
      Reported-by: Steve Wahl <steve.wahl@hpe.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: Steve Wahl <steve.wahl@hpe.com>
      Link: https://lkml.kernel.org/r/1618521764-100923-1-git-send-email-kan.liang@linux.intel.com
  7. Apr 20, 2021
    • x86/crash: Fix crash_setup_memmap_entries() out-of-bounds access · 5849cdf8
      Mike Galbraith authored
      
      The commit in Fixes: added support for kexec-ing a kernel on panic
      using a new system call. As part of that, it prepares a memory map for
      the new kernel.
      
      However, while doing so, it wrongly accesses memory it has not
      allocated: it accesses the first element of the cmem->ranges[] array in
      memmap_exclude_ranges() but it has not allocated the memory for it in
      crash_setup_memmap_entries(). As KASAN reports:
      
        BUG: KASAN: vmalloc-out-of-bounds in crash_setup_memmap_entries+0x17e/0x3a0
        Write of size 8 at addr ffffc90000426008 by task kexec/1187
      
        (gdb) list *crash_setup_memmap_entries+0x17e
        0xffffffff8107cafe is in crash_setup_memmap_entries (arch/x86/kernel/crash.c:322).
        317                                      unsigned long long mend)
        318     {
        319             unsigned long start, end;
        320
        321             cmem->ranges[0].start = mstart;
        322             cmem->ranges[0].end = mend;
        323             cmem->nr_ranges = 1;
        324
        325             /* Exclude elf header region */
        326             start = image->arch.elf_load_addr;
        (gdb)
      
      Make sure the ranges array is allocated with a single element.
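
      The principle behind the fix, shown as a small standalone C program
      rather than the kernel change itself: when a struct ends in a flexible
      array member, the allocation must include space for the array elements
      that get written (the kernel expresses this with struct_size()). The
      names below are stand-ins, not the real crash_mem layout:

        #include <stdlib.h>

        struct mem_range { unsigned long long start, end; };
        struct mem_map {
                unsigned int nr_ranges;
                struct mem_range ranges[];   /* flexible array member */
        };

        int main(void)
        {
                /* Bug shape: allocating only sizeof(struct mem_map) covers
                 * the header, so writing ranges[0] is out of bounds. Size
                 * the allocation for the header plus one element instead. */
                struct mem_map *cmem =
                        calloc(1, sizeof(*cmem) + sizeof(cmem->ranges[0]));
                if (!cmem)
                        return 1;

                cmem->ranges[0].start = 0x1000;  /* now within the allocation */
                cmem->ranges[0].end   = 0x2000;
                cmem->nr_ranges = 1;

                free(cmem);
                return 0;
        }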
      
       [ bp: Write a proper commit message. ]
      
      Fixes: dd5f7260 ("kexec: support for kexec on panic using new system call")
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Dave Young <dyoung@redhat.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/725fa3dc1da2737f0f6188a1a9701bead257ea9d.camel@gmx.de
  8. Apr 18, 2021
    • ARM: 9071/1: uprobes: Don't hook on thumb instructions · d2f7eca6
      Fredrik Strupe authored
      
      Since uprobes is not supported for thumb, check that the thumb bit is
      not set when matching the uprobes instruction hooks.
      
      The Arm UDF instructions used for uprobes triggering
      (UPROBE_SWBP_ARM_INSN and UPROBE_SS_ARM_INSN) coincidentally share the
      same encoding as a pair of unallocated 32-bit thumb instructions (not
      UDF) when the condition code is 0b1111 (0xf). This in effect makes it
      possible to trigger the uprobes functionality from thumb, and to do so
      using two unallocated instructions which are not permanently undefined.
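
      One way the check can be expressed is by including the Thumb bit in the
      CPSR match of the uprobes undef_hook, so a hit from Thumb state never
      matches. The sketch below follows my reading of arch/arm's struct
      undef_hook and its PSR/MODE macros; the handler name
      uprobe_trap_handler and the exact hook layout are illustrative:

        static struct undef_hook uprobes_arm_break_hook = {
                .instr_mask = 0x0fffffff,
                .instr_val  = (UPROBE_SWBP_ARM_INSN & 0x0fffffff),
                .cpsr_mask  = (PSR_T_BIT | MODE_MASK), /* also match the T bit */
                .cpsr_val   = USR_MODE,                /* user mode, T bit clear */
                .fn         = uprobe_trap_handler,
        };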
      
      Signed-off-by: Fredrik Strupe <fredrik@strupe.net>
      Cc: stable@vger.kernel.org
      Fixes: c7edc9e3 ("ARM: add uprobes support")
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
  9. Apr 16, 2021
  10. Apr 15, 2021
  11. Apr 13, 2021
  12. Apr 12, 2021
    • arm64: mte: Ensure TIF_MTE_ASYNC_FAULT is set atomically · 2decad92
      Catalin Marinas authored
      
      The entry-from-EL0 code checks the TFSRE0_EL1 register for any
      asynchronous tag check faults in user space and sets the
      TIF_MTE_ASYNC_FAULT flag. This is not done atomically, so it can race
      with another CPU calling set_tsk_thread_flag().
      
      Replace the non-atomic ORR+STR with an STSET instruction. While STSET
      requires ARMv8.1 and an assembler that understands LSE atomics, the MTE
      feature is part of ARMv8.5 and already requires an updated assembler.
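
      The race is the classic lost-update problem. A minimal userspace C
      sketch of it (conceptual only; the real code is arm64 assembly in the
      kernel entry path, and BIT_A/BIT_B merely stand in for thread flags):

        #include <pthread.h>
        #include <stdio.h>

        static unsigned long flags;

        #define BIT_A (1UL << 0)   /* stands in for TIF_MTE_ASYNC_FAULT */
        #define BIT_B (1UL << 1)   /* a flag set from another CPU */

        static void *set_a_nonatomic(void *arg)
        {
                unsigned long tmp = flags;  /* load                     */
                tmp |= BIT_A;               /* modify                   */
                flags = tmp;                /* store: may clobber BIT_B */
                return NULL;
        }

        static void *set_b_atomic(void *arg)
        {
                __atomic_fetch_or(&flags, BIT_B, __ATOMIC_RELAXED);
                return NULL;
        }

        int main(void)
        {
                pthread_t a, b;

                pthread_create(&a, NULL, set_a_nonatomic, NULL);
                pthread_create(&b, NULL, set_b_atomic, NULL);
                pthread_join(a, NULL);
                pthread_join(b, NULL);

                /* With both sides atomic, both bits are always set; with the
                 * non-atomic side, BIT_B can occasionally be lost. */
                printf("flags = %#lx\n", flags);
                return 0;
        }

      Replacing the non-atomic read-modify-write with a single atomic OR (the
      role STSET plays in the entry code) removes that window.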
      
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Fixes: 637ec831 ("arm64: mte: Handle synchronous and asynchronous tag check faults")
      Cc: <stable@vger.kernel.org> # 5.10.x
      Reported-by: Will Deacon <will@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Link: https://lore.kernel.org/r/20210409173710.18582-1-catalin.marinas@arm.com
      
      
      Signed-off-by: Will Deacon <will@kernel.org>
    • s390/entry: save the caller of psw_idle · a994eddb
      Vasily Gorbik authored
      
      Currently psw_idle does not allocate a stack frame and does not save
      its r14 and r15 into the save area. Even though this is valid from the
      call-ABI point of view, because psw_idle does not make any explicit
      calls, in reality psw_idle is an entry point for a controlled
      transition into serving interrupts. So, in practice, the psw_idle
      stack frame is analyzed during stack unwinding. Depending on build
      options, the r14 slot in the save area of psw_idle might contain
      either a value saved by a previous sibling call or complete garbage.
      
        [task    0000038000003c28] do_ext_irq+0xd6/0x160
        [task    0000038000003c78] ext_int_handler+0xba/0xe8
        [task   *0000038000003dd8] psw_idle_exit+0x0/0x8 <-- pt_regs
       ([task    0000038000003dd8] 0x0)
        [task    0000038000003e10] default_idle_call+0x42/0x148
        [task    0000038000003e30] do_idle+0xce/0x160
        [task    0000038000003e70] cpu_startup_entry+0x36/0x40
        [task    0000038000003ea0] arch_call_rest_init+0x76/0x80
      
      So, to make the stacktrace nicer and actually point to the real caller
      of psw_idle in this frequently occurring case, make psw_idle save its
      r14.
      
        [task    0000038000003c28] do_ext_irq+0xd6/0x160
        [task    0000038000003c78] ext_int_handler+0xba/0xe8
        [task   *0000038000003dd8] psw_idle_exit+0x0/0x6 <-- pt_regs
       ([task    0000038000003dd8] arch_cpu_idle+0x3c/0xd0)
        [task    0000038000003e10] default_idle_call+0x42/0x148
        [task    0000038000003e30] do_idle+0xce/0x160
        [task    0000038000003e70] cpu_startup_entry+0x36/0x40
        [task    0000038000003ea0] arch_call_rest_init+0x76/0x80
      
      Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/entry: avoid setting up backchain in ext|io handlers · b74e409e
      Vasily Gorbik authored
      
      Currently, when an interrupt arrives at a CPU while it is in kernel
      context, the INT_HANDLER macro (used for ext_int_handler and
      io_int_handler) allocates a new stack frame and pt_regs on the kernel
      stack and sets up the backchain to jump over the pt_regs to the frame
      which has been interrupted. This is not ideal for two reasons:
      
      1. This hides the fact that kernel stack contains interrupt frame in it
         and hence breaks arch_stack_walk_reliable(), which needs to know that to
         guarantee "reliability" and checks that there are no pt_regs on the way.
      
      2. It breaks the backchain unwinder logic, which assumes that the next
         stack frame after an interrupt frame is reliable, while it is not.
         In some cases (when r14 contains garbage) this leads to early unwinding
         termination with an error, instead of marking frame as unreliable
         and continuing.
      
      To address that, only set backchain to 0.
      
      Fixes: 56e62a73 ("s390: convert to generic entry")
      Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
  13. Apr 11, 2021
  14. Apr 09, 2021
  15. Apr 08, 2021
    • bpf, x86: Validate computation of branch displacements for x86-32 · 26f55a59
      Piotr Krysiuk authored
      
      The branch displacement logic in the BPF JIT compilers for x86 assumes
      that, for any generated branch instruction, the distance cannot
      increase between optimization passes.
      
      But this assumption can be violated due to how the distances are
      computed. Specifically, whenever a backward branch is processed in
      do_jit(), the distance is computed by subtracting the positions in the
      machine code from different optimization passes. This is because part
      of addrs[] is already updated for the current optimization pass, before
      the branch instruction is visited.
      
      And so the optimizer can expand blocks of machine code in some cases.
      
      This can confuse the optimizer logic, where it assumes that a fixed
      point has been reached for all machine code blocks once the total
      program size stops changing. And then the JIT compiler can output
      abnormal machine code containing incorrect branch displacements.
      
      To mitigate this issue, we assert that a fixed point is reached while
      populating the output image. This rejects any problematic programs.
      The issue affects both x86-32 and x86-64. We mitigate separately to
      ease backporting.
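
      A standalone C model of the assertion described above (the names mirror
      do_jit() in arch/x86/net/bpf_jit_comp.c, i.e. addrs[i], proglen, ilen
      and oldproglen, but the exact in-kernel check should be treated as
      illustrative):

        #include <stdio.h>

        /* Reject the program if instruction i would end at a different
         * offset than the previous pass recorded in addrs[i], or would
         * overflow the image sized from the previous pass. */
        static int check_fixed_point(unsigned int proglen, unsigned int ilen,
                                     unsigned int addr_i,
                                     unsigned int oldproglen)
        {
                if (proglen + ilen > oldproglen ||  /* would overrun the image */
                    proglen + ilen != addr_i)       /* offset moved since last pass */
                        return -1;                  /* no fixed point: reject */
                return 0;
        }

        int main(void)
        {
                /* Encoding stable at 5 bytes vs. shrunk to 2 bytes after
                 * addrs[] was recorded. */
                printf("stable:   %d\n", check_fixed_point(10, 5, 15, 64));
                printf("unstable: %d\n", check_fixed_point(10, 2, 15, 64));
                return 0;
        }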
      
      Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
      Reviewed-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf, x86: Validate computation of branch displacements for x86-64 · e4d4d456
      Piotr Krysiuk authored
      
      The branch displacement logic in the BPF JIT compilers for x86 assumes
      that, for any generated branch instruction, the distance cannot
      increase between optimization passes.
      
      But this assumption can be violated due to how the distances are
      computed. Specifically, whenever a backward branch is processed in
      do_jit(), the distance is computed by subtracting the positions in the
      machine code from different optimization passes. This is because part
      of addrs[] is already updated for the current optimization pass, before
      the branch instruction is visited.
      
      And so the optimizer can expand blocks of machine code in some cases.
      
      This can confuse the optimizer logic, where it assumes that a fixed
      point has been reached for all machine code blocks once the total
      program size stops changing. And then the JIT compiler can output
      abnormal machine code containing incorrect branch displacements.
      
      To mitigate this issue, we assert that a fixed point is reached while
      populating the output image. This rejects any problematic programs.
      The issue affects both x86-32 and x86-64. We mitigate separately to
      ease backporting.
      
      Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
      Reviewed-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • KVM: x86/mmu: preserve pending TLB flush across calls to kvm_tdp_mmu_zap_sp · 315f02c6
      Paolo Bonzini authored
      
      Right now, if a call to kvm_tdp_mmu_zap_sp returns false, the caller
      will skip the TLB flush, which is wrong.  There are two ways to fix
      it:
      
      - since kvm_tdp_mmu_zap_sp will not yield and therefore will not flush
        the TLB itself, we could change the call to kvm_tdp_mmu_zap_sp to
        use "flush |= ..."
      
      - or we can chain the flush argument through kvm_tdp_mmu_zap_sp down
        to __kvm_tdp_mmu_zap_gfn_range.  Note that kvm_tdp_mmu_zap_sp will
        neither yield nor flush, so flush would never go from true to
        false.
      
      This patch does the former to simplify application to stable kernels,
      and to make it even clearer that kvm_tdp_mmu_zap_sp will not flush.
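
      A tiny standalone C model of why the accumulation in the first option
      matters (zap_sp() here merely stands in for kvm_tdp_mmu_zap_sp(), which
      reports whether it zapped anything and never flushes by itself):

        #include <stdbool.h>
        #include <stdio.h>

        static bool zap_sp(int sp)
        {
                return sp % 2 == 0;  /* arbitrary: pretend even pages get zapped */
        }

        int main(void)
        {
                bool flush = true;   /* a flush is already pending */

                for (int sp = 0; sp < 4; sp++) {
                        /* "flush |= ..." keeps the pending flush; a plain
                         * "flush = zap_sp(sp)" could drop it whenever
                         * zap_sp() returns false. */
                        flush |= zap_sp(sp);
                }

                printf("flush = %d\n", flush);   /* stays true */
                return 0;
        }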
      
      Cc: seanjc@google.com
      Fixes: 048f4980 ("KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping")
      Cc: <stable@vger.kernel.org> # 5.10.x: 048f4980: KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping
      Cc: <stable@vger.kernel.org> # 5.10.x: 33a31641: KVM: x86/mmu: Don't allow TDP MMU to yield when recovering NX pages
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  16. Apr 07, 2021
  17. Apr 06, 2021