1. 06 Jan, 2018 2 commits
  2. 05 Jan, 2018 7 commits
  3. 04 Jan, 2018 6 commits
    • x86/tlb: Drop the _GPL from the cpu_tlbstate export · 1e547681
      Thomas Gleixner authored
      The recent changes for PTI touch cpu_tlbstate from various tlb_flush
      inlines. cpu_tlbstate is exported as GPL symbol, so this causes a
      regression when building out of tree drivers for certain graphics cards.
      
      Aside from that, the export has been wrong since it was introduced: it
      should have been EXPORT_PER_CPU_SYMBOL_GPL().

      Use the correct PER_CPU export and drop the _GPL to restore the previous
      state which allows users to utilize the cards they paid for.
      
      As always I'm really thrilled to make this kind of change to support the
      #friends (or however the hot hashtag of today is spelled) from that closet
      sauce graphics corp.
      
      Fixes: 1e02ce4c ("x86: Store a per-cpu shadow copy of CR4")
      Fixes: 6fd166aa ("x86/mm: Use/Fix PCID to optimize user/kernel switches")
      Reported-by: Kees Cook <keescook@google.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: stable@vger.kernel.org
    • x86/events/intel/ds: Use the proper cache flush method for mapping ds buffers · 42f3bdc5
      Peter Zijlstra authored
      Thomas reported the following warning:
      
       BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498
       caller is native_flush_tlb_single+0x57/0xc0
       native_flush_tlb_single+0x57/0xc0
       __set_pte_vaddr+0x2d/0x40
       set_pte_vaddr+0x2f/0x40
       cea_set_pte+0x30/0x40
       ds_update_cea.constprop.4+0x4d/0x70
       reserve_ds_buffers+0x159/0x410
       x86_reserve_hardware+0x150/0x160
       x86_pmu_event_init+0x3e/0x1f0
       perf_try_init_event+0x69/0x80
       perf_event_alloc+0x652/0x740
       SyS_perf_event_open+0x3f6/0xd60
       do_syscall_64+0x5c/0x190
      
      set_pte_vaddr is used to map the ds buffers into the cpu entry area, but
      there are two problems with that:
      
       1) The resulting flush is not supposed to be called in preemptible context
      
       2) The cpu entry area is supposed to be per CPU, but the debug store
          buffers are mapped for all CPUs so these mappings need to be flushed
          globally.
      
      Add the necessary preemption protection across the mapping code and flush
      TLBs globally.
      
      Fixes: c1961a46 ("x86/events/intel/ds: Map debug buffers in cpu_entry_area")
      Reported-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20180104170712.GB3040@hirez.programming.kicks-ass.net
    • x86/kaslr: Fix the vaddr_end mess · 1dddd251
      Thomas Gleixner authored
      vaddr_end for KASLR is only documented in the KASLR code itself and is
      adjusted depending on config options. So it's not surprising that a change
      of the memory layout causes KASLR to have the wrong vaddr_end. This can map
      arbitrary stuff into other areas causing hard to understand problems.
      
      Remove the whole ifdef magic and define the start of the cpu_entry_area to
      be the end of the KASLR vaddr range.
      
      Add documentation to that effect.
      
      Fixes: 92a0f81d ("x86/cpu_entry_area: Move it out of the fixmap")
      Reported-by: Benjamin Gilbert <benjamin.gilbert@coreos.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Benjamin Gilbert <benjamin.gilbert@coreos.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: stable <stable@vger.kernel.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Garnier <thgarnie@google.com>
      Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos
    • x86/mm: Map cpu_entry_area at the same place on 4/5 level · f2078904
      Thomas Gleixner authored
      There is no reason for 4 and 5 level pagetables to have a different
      layout. It just makes determining vaddr_end for KASLR harder than
      necessary.
      
      Fixes: 92a0f81d ("x86/cpu_entry_area: Move it out of the fixmap")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Gilbert <benjamin.gilbert@coreos.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: stable <stable@vger.kernel.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Garnier <thgarnie@google.com>
      Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos
    • x86/mm: Set MODULES_END to 0xffffffffff000000 · f5a40711
      Andrey Ryabinin authored
      Since f06bdd40 ("x86/mm: Adapt MODULES_END based on fixmap section size")
      kasan_mem_to_shadow(MODULES_END) might not be aligned to a page boundary.

      Passing a page-unaligned address to kasan_populate_zero_shadow() has two
      possible effects:

      1) It may leave a one-page hole in an area that is supposed to be populated.
        After commit 21506525 ("x86/kasan/64: Teach KASAN about the cpu_entry_area")
        that hole happens to be in the shadow covering the fixmap area and leads
        to a crash:
      
       BUG: unable to handle kernel paging request at fffffbffffe8ee04
       RIP: 0010:check_memory_region+0x5c/0x190
      
       Call Trace:
        <NMI>
        memcpy+0x1f/0x50
        ghes_copy_tofrom_phys+0xab/0x180
        ghes_read_estatus+0xfb/0x280
        ghes_notify_nmi+0x2b2/0x410
        nmi_handle+0x115/0x2c0
        default_do_nmi+0x57/0x110
        do_nmi+0xf8/0x150
        end_repeat_nmi+0x1a/0x1e
      
      Note, the crash likely disappeared after commit 92a0f81d, which
      changed the kasan_populate_zero_shadow() call back to the way it was
      before commit 21506525.

      2) An attempt to load a module near MODULES_END will fail, because
         __vmalloc_node_range() called from kasan_module_alloc() will hit the
         WARN_ON(!pte_none(*pte)) in vmap_pte_range() and bail out with an error.
      
      To fix this we need to make kasan_mem_to_shadow(MODULES_END) page aligned
      which means that MODULES_END should be 8*PAGE_SIZE aligned.
      
      The whole point of commit f06bdd40 was to move MODULES_END down if
      NR_CPUS is big, so the cpu_entry_area takes a lot of space.
      But since 92a0f81d ("x86/cpu_entry_area: Move it out of the fixmap")
      the cpu_entry_area is no longer in fixmap, so we could just set
      MODULES_END to a fixed 8*PAGE_SIZE aligned address.
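
      A minimal userspace sketch of the alignment argument above, not kernel
      code. It assumes 4 KiB pages, the generic KASAN mapping of one shadow byte
      per 8 bytes of memory, and an x86-64 4-level shadow offset of
      0xdffffc0000000000. The helper and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES      4096ULL               /* assumed page size */
#define SHADOW_SCALE_SHIFT   3                     /* 1 shadow byte per 8 bytes */
#define SHADOW_OFFSET        0xdffffc0000000000ULL /* assumed x86-64 offset */

static_assert((PAGE_SIZE_BYTES & (PAGE_SIZE_BYTES - 1)) == 0,
              "page size must be a power of two");

/* Map a memory address to the address of its shadow byte. */
uint64_t mem_to_shadow(uint64_t addr)
{
    return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

/* True if the shadow address for addr starts on a page boundary. */
int shadow_is_page_aligned(uint64_t addr)
{
    return (mem_to_shadow(addr) & (PAGE_SIZE_BYTES - 1)) == 0;
}

/* Because the offset is page aligned, this is equivalent to addr being
 * aligned to 8 * PAGE_SIZE_BYTES. */
int is_8_pages_aligned(uint64_t addr)
{
    return (addr & (8 * PAGE_SIZE_BYTES - 1)) == 0;
}
```

      Under these assumptions 0xffffffffff000000 maps to a page-aligned shadow
      address, while an address one page lower does not.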
      
      Fixes: f06bdd40 ("x86/mm: Adapt MODULES_END based on fixmap section size")
      Reported-by: Jakub Kicinski <kubakici@wp.pl>
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Thomas Garnier <thgarnie@google.com>
      Link: https://lkml.kernel.org/r/20171228160620.23818-1-aryabinin@virtuozzo.com
    • arm64: dts: uniphier: fix gpio-ranges property of PXs3 SoC · abb62c46
      Masahiro Yamada authored
      This is probably a copy-paste mistake.  The gpio-ranges of PXs3 is
      different from that of LD20.
      
      Fixes: 277b51e7 ("arm64: dts: uniphier: add GPIO controller nodes")
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
  4. 03 Jan, 2018 9 commits
  5. 02 Jan, 2018 4 commits
    • parisc: Fix alignment of pa_tlb_lock in assembly on 32-bit SMP kernel · 88776c0e
      Helge Deller authored
      
      
      Qemu for PARISC reported strange failures on a 32-bit SMP parisc kernel:
      "Not-handled unaligned insn 0x0e8011d6 and 0x0c2011c9."
      
      Those opcodes evaluate to the ldcw() assembly instruction which requires
      (on 32bit) an alignment of 16 bytes to ensure atomicity.
      
      As it turns out, qemu is correct and in our assembly code in entry.S and
      pacache.S we don't pay attention to the required alignment.
      
      This patch fixes the problem by aligning the lock offset in assembly
      code in the same manner as we do in our C-code.
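
      A rough userspace illustration of the kind of alignment the C side
      performs, not the kernel's actual code. It assumes the lock occupies four
      32-bit words so that one of them always falls on a 16-byte boundary. The
      type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LDCW_ALIGNMENT 16u  /* alignment ldcw needs on 32-bit, per the text */

/* Hypothetical lock: 4 words = 16 bytes, so one word is 16-byte aligned. */
struct sketch_spinlock {
    volatile unsigned int slock[4];
};

/* Round the address of the first word up to the next 16-byte boundary.
 * The struct is at least 4-byte aligned, so the result is one of
 * slock[0] .. slock[3]. */
volatile unsigned int *sketch_ldcw_align(struct sketch_spinlock *lock)
{
    assert(lock != NULL);
    uintptr_t addr = (uintptr_t)&lock->slock[0];
    addr = (addr + LDCW_ALIGNMENT - 1) & ~(uintptr_t)(LDCW_ALIGNMENT - 1);
    return (volatile unsigned int *)addr;
}
```

      The same round-up applied to the lock offset is what an assembly caller
      would need before issuing ldcw.
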
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: <stable@vger.kernel.org> # v4.0+
    • parisc: Show initial kernel memory layout unhashed · 63b2c373
      Helge Deller authored
      Fixes: ad67b74d ("printk: hash addresses printed with %p")
      Signed-off-by: Helge Deller <deller@gmx.de>
    • parisc: Show unhashed hardware inventory · 0ae60d0c
      Helge Deller authored
      Fixes: ad67b74d ("printk: hash addresses printed with %p")
      Signed-off-by: Helge Deller <deller@gmx.de>
    • powerpc/mm: Fix SEGV on mapped region to return SEGV_ACCERR · ecb101ae
      John Sperbeck authored
      The recent refactoring of the powerpc page fault handler in commit
      c3350602 ("powerpc/mm: Make bad_area* helper functions") caused
      access to protected memory regions to indicate SEGV_MAPERR instead of
      the traditional SEGV_ACCERR in the si_code field of a user-space
      signal handler. This can confuse debug libraries that temporarily
      change the protection of memory regions, and expect to use SEGV_ACCERR
      as an indication to restore access to a region.
      
      This commit restores the previous behavior. The following program
      exhibits the issue:
      
          $ ./repro read  || echo "FAILED"
          $ ./repro write || echo "FAILED"
          $ ./repro exec  || echo "FAILED"
      
          #include <stdio.h>
          #include <stdlib.h>
          #include <string.h>
          #include <unistd.h>
          #include <signal.h>
          #include <sys/mman.h>
          #include <assert.h>
      
          static void segv_handler(int n, siginfo_t *info, void *arg) {
                  _exit(info->si_code == SEGV_ACCERR ? 0 : 1);
          }
      
          int main(int argc, char **argv)
          {
                  void *p = NULL;
                  struct sigaction act = {
                          .sa_sigaction = segv_handler,
                          .sa_flags = SA_SIGINFO,
                  };
      
                  assert(argc == 2);
                  p = mmap(NULL, getpagesize(),
                          (strcmp(argv[1], "write") == 0) ? PROT_READ : 0,
                          MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
                  assert(p != MAP_FAILED);
      
                  assert(sigaction(SIGSEGV, &act, NULL) == 0);
                  if (strcmp(argv[1], "read") == 0)
                          printf("%c", *(unsigned char *)p);
                  else if (strcmp(argv[1], "write") == 0)
                          *(unsigned char *)p = 0;
                  else if (strcmp(argv[1], "exec") == 0)
                          ((void (*)(void))p)();
                  return 1;  /* failed to generate SEGV */
          }
      
      Fixes: c3350602 ("powerpc/mm: Make bad_area* helper functions")
      Cc: stable@vger.kernel.org # v4.14+
      Signed-off-by: John Sperbeck <jsperbeck@google.com>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      [mpe: Add commit references in change log]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  6. 31 Dec, 2017 4 commits
    • x86/ldt: Make LDT pgtable free conditional · 7f414195
      Thomas Gleixner authored
      
      
      Andy prefers to be paranoid about the pagetable free in the error path of
      write_ldt(). Make it conditional and warn whenever the installation of a
      secondary LDT fails.

      Requested-by: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86/ldt: Plug memory leak in error path · a62d6985
      Thomas Gleixner authored
      
      
      The error path in write_ldt() tries to free 'old_ldt' instead of the newly
      allocated 'new_ldt', resulting in a memory leak. It also fails to clean up a
      half populated LDT pagetable, which is not a leak as it gets cleaned up
      when the process exits.
      
      Free both the potentially half populated LDT pagetable and the newly
      allocated LDT struct. This can be done unconditionally because once an LDT
      is mapped subsequent maps will succeed, because the PTE page is already
      populated and the two LDTs fit into that single page.
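
      The error-path pattern described above can be sketched in plain userspace
      C. This is only an illustration with hypothetical types and a stand-in
      mapping step, not the write_ldt() implementation: on failure the newly
      allocated object is released and the installed one is left untouched.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the LDT struct and the owning context. */
struct sketch_ldt { size_t nr_entries; };
struct sketch_mm  { struct sketch_ldt *ldt; };

/* Stand-in for the mapping step; fails on request so the error path runs. */
static int sketch_map_ldt(struct sketch_ldt *ldt, int force_fail)
{
    (void)ldt;
    return force_fail ? -1 : 0;
}

int sketch_write_ldt(struct sketch_mm *mm, size_t nr_entries, int force_fail)
{
    assert(mm != NULL);
    struct sketch_ldt *old_ldt = mm->ldt;
    struct sketch_ldt *new_ldt = malloc(sizeof(*new_ldt));

    if (!new_ldt)
        return -1;
    new_ldt->nr_entries = nr_entries;

    if (sketch_map_ldt(new_ldt, force_fail) != 0) {
        /* Free the NEW allocation; old_ldt is still installed and in use. */
        free(new_ldt);
        return -1;
    }

    mm->ldt = new_ldt;  /* install the new table */
    free(old_ldt);      /* only now is the old one unused */
    return 0;
}
```
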
      Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linuxfoundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: f55f0501 ("x86/pti: Put the LDT in its own PGD if PTI is on")
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1712311121340.1899@nanos
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm: Remove preempt_disable/enable() from __native_flush_tlb() · decab088
      Thomas Gleixner authored
      The preempt_disable/enable() pair in __native_flush_tlb() was added in
      commit:
      
        5cf0791d ("x86/mm: Disable preemption during CR3 read+write")
      
      ... to protect the UP variant of flush_tlb_mm_range().
      
      That preempt_disable/enable() pair should have been added to the UP variant
      of flush_tlb_mm_range() instead.
      
      The UP variant was removed with commit:
      
        ce4a4e56 ("x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code")
      
      ... but the preempt_disable/enable() pair stayed around.
      
      The latest change to __native_flush_tlb() in commit:
      
        6fd166aa ("x86/mm: Use/Fix PCID to optimize user/kernel switches")
      
      ... added an access to a per CPU variable outside the preempt disabled
      regions, which makes no sense at all. __native_flush_tlb() must always
      be called with at least preemption disabled.
      
      Remove the preempt_disable/enable() pair and add a WARN_ON_ONCE() to catch
      bad callers independent of the smp_processor_id() debugging.
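
      A toy userspace model of the check described above, assuming a simple
      counter in place of the kernel's preemption state. All names are
      hypothetical. It shows a flush routine that warns once when called while
      "preemptible" instead of disabling preemption itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

int sketch_preempt_count;  /* > 0 means "preemption disabled" in this model */
int sketch_warnings;       /* number of warnings emitted so far */

void sketch_preempt_disable(void) { sketch_preempt_count++; }

void sketch_preempt_enable(void)
{
    assert(sketch_preempt_count > 0);
    sketch_preempt_count--;
}

bool sketch_preemptible(void) { return sketch_preempt_count == 0; }

/* The flush no longer protects itself; it only reports bad callers, once. */
void sketch_flush_tlb(void)
{
    static bool warned;

    if (sketch_preemptible() && !warned) {
        warned = true;
        sketch_warnings++;
        fprintf(stderr, "WARNING: flush called from preemptible context\n");
    }
    /* A real flush would access per-CPU state here. */
}
```

      A correct caller wraps the call in sketch_preempt_disable() and
      sketch_preempt_enable(); repeated bad calls produce only one warning.
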
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linuxfoundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20171230211829.679325424@linutronix.de
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/smpboot: Remove stale TLB flush invocations · 322f8b8b
      Thomas Gleixner authored
      
      
      smpboot_setup_warm_reset_vector() and smpboot_restore_warm_reset_vector()
      invoke local_flush_tlb() for no obvious reason.
      
      Digging in history revealed that the original code in the 2.1 era added
      those because the code manipulated a swapper_pg_dir pagetable entry. The
      pagetable manipulation was removed long ago in the 2.3 timeframe, but the
      TLB flush invocations stayed around forever.
      
      Remove them along with the pointless pr_debug()s which come from the same 2.1
      change.
      Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linuxfoundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20171230211829.586548655@linutronix.de
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  7. 29 Dec, 2017 4 commits
    • genirq/msi, x86/vector: Prevent reservation mode for non maskable MSI · bc976233
      Thomas Gleixner authored
      The new reservation mode for interrupts assigns a dummy vector when the
      interrupt is allocated and assigns a real vector when the interrupt is
      requested. The reservation mode prevents vector pressure when devices with
      a large number of queues/interrupts are initialized, but only a minimal
      subset of those queues/interrupts is actually used.

      This mode has an issue with MSI interrupts which cannot be masked. If the
      driver is not careful or the hardware emits an interrupt before the device
      irq is requested by the driver then the interrupt ends up on the dummy
      vector as a spurious interrupt which can cause malfunction of the device or
      in the worst case a lockup of the machine.
      
      Change the logic for the reservation mode so that the early activation of
      MSI interrupts checks whether:
      
       - the device is a PCI/MSI device
       - the reservation mode of the underlying irqdomain is activated
       - PCI/MSI masking is globally enabled
       - the PCI/MSI device uses either MSI-X, which supports masking, or
         MSI with the maskbit supported.
      
      If one of those conditions is false, then clear the reservation mode flag
      in the irq data of the interrupt and invoke irq_domain_activate_irq() with
      the reserve argument cleared. In the x86 vector code, clear the can_reserve
      flag in the vector allocation data so a subsequent free_irq() won't create
      the same situation again. The interrupt stays assigned to a real vector
      until pci_disable_msi() is invoked and all allocations are undone.
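
      The four conditions listed above combine into a single predicate. The
      sketch below is a hedged userspace illustration with a hypothetical
      struct and field names, not the kernel's data structures: reservation
      mode is kept only if every condition holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the facts the check needs about one interrupt. */
struct sketch_msi_info {
    bool is_pci_msi;              /* the device is a PCI/MSI device */
    bool domain_reservation_mode; /* irqdomain has reservation mode active */
    bool msi_masking_enabled;     /* PCI/MSI masking is globally enabled */
    bool is_msix;                 /* MSI-X always supports masking */
    bool has_maskbit;             /* plain MSI with the mask bit supported */
};

/* Returns true if the interrupt may stay in reservation mode. */
bool sketch_can_reserve(const struct sketch_msi_info *info)
{
    assert(info != NULL);
    return info->is_pci_msi &&
           info->domain_reservation_mode &&
           info->msi_masking_enabled &&
           (info->is_msix || info->has_maskbit);
}
```

      When the predicate is false, the commit's approach is to clear the
      reservation flag and activate the interrupt with a real vector right
      away.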
      
      Fixes: 4900be83 ("x86/vector/msi: Switch to global reservation mode")
      Reported-by: Alexandru Chirvasitu <achirvasub@gmail.com>
      Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com>
      Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: Mikael Pettersson <mikpelinux@gmail.com>
      Cc: Josh Poulson <jopoulso@microsoft.com>
      Cc: Mihai Costache <v-micos@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: linux-pci@vger.kernel.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Simon Xiao <sixiao@microsoft.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Jork Loeser <Jork.Loeser@microsoft.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: devel@linuxdriverproject.org
      Cc: KY Srinivasan <kys@microsoft.com>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Sakari Ailus <sakari.ailus@intel.com>,
      Cc: linux-media@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291406420.1899@nanos
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291409460.1899@nanos
    • genirq/irqdomain: Rename early argument of irq_domain_activate_irq() · 702cb0a0
      Thomas Gleixner authored
      The 'early' argument of irq_domain_activate_irq() is actually used to
      denote reservation mode. To avoid confusion, rename it before abuse
      happens.
      
      No functional change.
      
      Fixes: 72491643 ("genirq/irqdomain: Update irq_domain_ops.activate() signature")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexandru Chirvasitu <achirvasub@gmail.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: Mikael Pettersson <mikpelinux@gmail.com>
      Cc: Josh Poulson <jopoulso@microsoft.com>
      Cc: Mihai Costache <v-micos@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: linux-pci@vger.kernel.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Simon Xiao <sixiao@microsoft.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Jork Loeser <Jork.Loeser@microsoft.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: devel@linuxdriverproject.org
      Cc: KY Srinivasan <kys@microsoft.com>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Sakari Ailus <sakari.ailus@intel.com>,
      Cc: linux-media@vger.kernel.org
    • x86/vector: Use IRQD_CAN_RESERVE flag · 945f50a5
      Thomas Gleixner authored
      
      
      Set the new CAN_RESERVE flag when the initial reservation for an interrupt
      happens. The flag is used in a subsequent patch to disable reservation mode
      for a certain class of MSI devices.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com>
      Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: Mikael Pettersson <mikpelinux@gmail.com>
      Cc: Josh Poulson <jopoulso@microsoft.com>
      Cc: Mihai Costache <v-micos@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: linux-pci@vger.kernel.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Simon Xiao <sixiao@microsoft.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Jork Loeser <Jork.Loeser@microsoft.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: devel@linuxdriverproject.org
      Cc: KY Srinivasan <kys@microsoft.com>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Sakari Ailus <sakari.ailus@intel.com>,
      Cc: linux-media@vger.kernel.org
      
    • x86/apic: Switch all APICs to Fixed delivery mode · a31e58e1
      Thomas Gleixner authored
      Some of the APIC incarnations are operating in lowest priority delivery
      mode. This worked as long as the vector management code allocated the same
      vector on all possible CPUs for each interrupt.
      
      Lowest priority delivery mode does not necessarily respect the affinity
      setting and may redirect to some other online CPU. This was documented
      somewhere in the old code, but the conversion to single target delivery
      failed to update the delivery mode of the affected APIC drivers, which
      results in spurious interrupts on some of the affected CPU/Chipset
      combinations.
      
      Switch the APIC drivers over to Fixed delivery mode and remove all
      leftovers of lowest priority delivery mode.
      
      Switching to Fixed delivery mode is not a problem on these CPUs because the
      kernel already uses Fixed delivery mode for IPIs. The reason for this is
      that the SDM explicitly forbids lowest prio mode for IPIs. The reason is
      obvious: If the irq routing does not honor destination targets in lowest
      prio mode then an IPI targeted at CPU1 might end up on CPU0, which would be
      a fatal problem in many cases.
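
      As a small illustration of what the delivery mode selects, the sketch
      below composes the low bits of an x86 MSI data word. It assumes the
      layout in which bits 7:0 carry the vector and bits 10:8 the delivery
      mode (000 for Fixed, 001 for Lowest Priority). The names are hypothetical
      and this is not the kernel's APIC code.

```c
#include <assert.h>
#include <stdint.h>

/* Delivery mode encodings for bits 10:8 (assumed layout, see lead-in). */
enum sketch_delivery_mode {
    SKETCH_DM_FIXED       = 0x0,
    SKETCH_DM_LOWEST_PRIO = 0x1
};

#define SKETCH_DM_SHIFT 8
#define SKETCH_DM_MASK  (0x7u << SKETCH_DM_SHIFT)

/* Build the vector and delivery-mode part of an MSI data word. */
uint32_t sketch_msi_data(uint8_t vector, enum sketch_delivery_mode mode)
{
    assert((unsigned int)mode <= 0x7u);
    return ((uint32_t)mode << SKETCH_DM_SHIFT) | vector;
}

/* Rewrite an existing data word so that it uses Fixed delivery. */
uint32_t sketch_force_fixed(uint32_t data)
{
    return (data & ~SKETCH_DM_MASK) |
           ((uint32_t)SKETCH_DM_FIXED << SKETCH_DM_SHIFT);
}
```

      With Fixed delivery the message goes to the addressed destination rather
      than to whichever eligible CPU the hardware picks.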
      
      As a consequence of this change, the apic::irq_delivery_mode field is now
      pointless, but this needs to be cleaned up in a separate patch.
      
      Fixes: fdba46ff ("x86/apic: Get rid of multi CPU affinity")
      Reported-by: vcaputo@pengaru.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: vcaputo@pengaru.com
      Cc: Pavel Machek <pavel@ucw.cz>
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712281140440.1688@nanos
  8. 28 Dec, 2017 3 commits
  9. 27 Dec, 2017 1 commit
    • x86-32: Fix kexec with stack canary (CONFIG_CC_STACKPROTECTOR) · ac461122
      Linus Torvalds authored
      Commit e802a51e ("x86/idt: Consolidate IDT invalidation") cleaned up
      and unified the IDT invalidation that existed in a couple of places.  It
      changed no actual real code.
      
      Despite not changing any actual real code, it _did_ change code generation:
      by implementing the common idt_invalidate() function in
      arch/x86/kernel/idt.c, it made the use of the function in
      arch/x86/kernel/machine_kexec_32.c be a real function call rather than an
      (accidental) inlining of the function.
      
      That, in turn, exposed two issues:
      
       - in load_segments(), we had incorrectly reset all the segment
         registers, which then made the stack canary load (which gcc does
         using offset of %gs) cause a trap.  Instead of %gs pointing to the
         stack canary, it will be the normal zero-based kernel segment, and
         the stack canary load will take a page fault at address 0x14.
      
       - to make this even harder to debug, we had invalidated the GDT just
         before calling idt_invalidate(), which meant that the fault happened
         with an invalid GDT, which in turn causes a triple fault and
         immediate reboot.
      
      Fix this by
      
       (a) not reloading the special segments in load_segments(). We currently
           don't do any percpu accesses (which would require %fs on x86-32) in
           this area, but there's no reason to think that we might not want to
           do them, and like %gs, it's pointless to break it.
      
       (b) doing idt_invalidate() before invalidating the GDT, to keep things
           at least _slightly_ more debuggable for a bit longer. Without an
           IDT, traps will not work. Without a GDT, traps also will not work,
           but neither will any segment loads etc. So in a very real sense,
           the GDT is even more core than the IDT.
      
      Fixes: e802a51e ("x86/idt: Consolidate IDT invalidation")
      Reported-and-tested-by: Alexandru Chirvasitu <achirvasub@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.LFD.2.21.1712271143180.8572@i7.lan