  8. Feb 19, 2022
      x86/mce: Work around an erratum on fast string copy instructions · 8ca97812
      Jue Wang authored
      
      A rare kernel panic scenario can happen when the following conditions
      are met due to an erratum on fast string copy instructions:
      
       1) An uncorrected error.
       2) The error must be in the first cache line of a page.
       3) The kernel must execute copy_page from the page immediately before
       that page.
      
      The fast string copy instructions ("REP; MOVS*") could consume an
      uncorrectable memory error in the cache line _right after_ the desired
      region to copy and raise an MCE.
      
      Bit 0 of MSR_IA32_MISC_ENABLE can be cleared to disable fast string
      copy and will avoid such spurious machine checks. However, that is less
      preferable due to the permanent performance impact. Considering memory
      poison is rare, it's desirable to keep fast string copy enabled until an
      MCE is seen.
      
      Intel has confirmed the following:
      1. The CPU erratum of fast string copy only applies to Skylake,
      Cascade Lake and Cooper Lake generations.
      
      Directly return from the MCE handler:
      2. Will result in complete execution of the "REP; MOVS*" with no data
      loss or corruption.
      3. Will not result in another MCE firing on the next poisoned cache line
      due to "REP; MOVS*".
      4. Will resume execution from a correct point in code.
      5. Will result in the same instruction that triggered the MCE firing a
      second MCE immediately for any other software recoverable data fetch
      errors.
      6. Is not safe without disabling the fast string copy, as the next fast
      string copy of the same buffer on the same CPU would result in a PANIC
      MCE.
      
       This should mitigate the erratum completely, the only caveat being that
       fast string copy is disabled on the affected hyperthread, at some cost
       in performance.
      
       This is still better than the OS crashing on MCEs raised in an
       irrelevant process due to "REP; MOVS*" accesses in a kernel context,
       e.g., copy_page.
      
      Tested:
      
      Injected errors on 1st cache line of 8 anonymous pages of process
      'proc1' and observed MCE consumption from 'proc2' with no panic
      (directly returned).
      
      Without the fix, the host panicked within a few minutes on a
      random 'proc2' process due to kernel access from copy_page.
      
        [ bp: Fix comment style + touch ups, zap an unlikely(), improve the
          quirk function's readability. ]
      
       Signed-off-by: Jue Wang <juew@google.com>
       Signed-off-by: Borislav Petkov <bp@suse.de>
       Reviewed-by: Tony Luck <tony.luck@intel.com>
      Link: https://lore.kernel.org/r/20220218013209.2436006-1-juew@google.com
  9. Feb 13, 2022
      x86/mce: Use arch atomic and bit helpers · f11445ba
      Borislav Petkov authored
      
      The arch helpers do not have explicit KASAN instrumentation. Use them in
      noinstr code.
      
      Inline a couple more functions with single call sites, while at it:
      
      mce_severity_amd_smca() has a single call-site which is noinstr so force
      the inlining and fix:
      
        vmlinux.o: warning: objtool: mce_severity_amd.constprop.0()+0xca: call to \
      	  mce_severity_amd_smca() leaves .noinstr.text section
      
      Always inline mca_msr_reg():
      
            text       data       bss        dec      hex  filename
        16065240  128031326  36405368  180501934  ac23dae  vmlinux.before
        16065240  128031294  36405368  180501902  ac23d8e  vmlinux.after
      
      and mce_no_way_out() as the latter one is used only once, to fix:
      
        vmlinux.o: warning: objtool: mce_read_aux()+0x53: call to mca_msr_reg() leaves .noinstr.text section
        vmlinux.o: warning: objtool: do_machine_check()+0xc9: call to mce_no_way_out() leaves .noinstr.text section
      
       Signed-off-by: Borislav Petkov <bp@suse.de>
       Acked-by: Marco Elver <elver@google.com>
      Link: https://lore.kernel.org/r/20220204083015.17317-4-bp@alien8.de
  15. Sep 14, 2021
      x86/mce: Avoid infinite loop for copy from user recovery · 81065b35
      Tony Luck authored
      
      There are two cases for machine check recovery:
      
      1) The machine check was triggered by ring3 (application) code.
         This is the simpler case. The machine check handler simply queues
         work to be executed on return to user. That code unmaps the page
         from all users and arranges to send a SIGBUS to the task that
         triggered the poison.
      
      2) The machine check was triggered in kernel code that is covered by
         an exception table entry. In this case the machine check handler
         still queues a work entry to unmap the page, etc. but this will
         not be called right away because the #MC handler returns to the
          fixup code address in the exception table entry.
      
      Problems occur if the kernel triggers another machine check before the
      return to user processes the first queued work item.
      
      Specifically, the work is queued using the ->mce_kill_me callback
      structure in the task struct for the current thread. Attempting to queue
      a second work item using this same callback results in a loop in the
      linked list of work functions to call. So when the kernel does return to
       user, it enters an infinite loop, processing the same entry forever.
      
      There are some legitimate scenarios where the kernel may take a second
      machine check before returning to the user.
      
      1) Some code (e.g. futex) first tries a get_user() with page faults
         disabled. If this fails, the code retries with page faults enabled
         expecting that this will resolve the page fault.
      
      2) Copy from user code retries a copy in byte-at-time mode to check
         whether any additional bytes can be copied.
      
      On the other side of the fence are some bad drivers that do not check
      the return value from individual get_user() calls and may access
      multiple user addresses without noticing that some/all calls have
      failed.
      
      Fix by adding a counter (current->mce_count) to keep track of repeated
      machine checks before task_work() is called. First machine check saves
      the address information and calls task_work_add(). Subsequent machine
       checks before that task_work callback is executed check that the address
      is in the same page as the first machine check (since the callback will
      offline exactly one page).
      
      Expected worst case is four machine checks before moving on (e.g. one
      user access with page faults disabled, then a repeat to the same address
      with page faults enabled ... repeat in copy tail bytes). Just in case
       there is some code that loops forever, enforce a limit of 10.
      
       [ bp: Massage commit message, drop noinstr, fix typo, extend panic
         messages. ]
      
      Fixes: 5567d11c ("x86/mce: Send #MC singal from task work")
       Signed-off-by: Tony Luck <tony.luck@intel.com>
       Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/YT/IJ9ziLqmtqEPu@agluck-desk2.amr.corp.intel.com
  17. Aug 24, 2021
      x86/mce: Defer processing of early errors · 3bff147b
      Borislav Petkov authored
      
      When a fatal machine check results in a system reset, Linux does not
      clear the error(s) from machine check bank(s) - hardware preserves the
      machine check banks across a warm reset.
      
      During initialization of the kernel after the reboot, Linux reads, logs,
      and clears all machine check banks.
      
      But there is a problem. In:
      
        5de97c9f ("x86/mce: Factor out and deprecate the /dev/mcelog driver")
      
      the call to mce_register_decode_chain() moved later in the boot
      sequence. This means that /dev/mcelog doesn't see those early error
      logs.
      
      This was partially fixed by:
      
        cd9c57ca ("x86/MCE: Dump MCE to dmesg if no consumers")
      
      which made sure that the logs were not lost completely by printing
      to the console. But parsing console logs is error prone. Users of
      /dev/mcelog should expect to find any early errors logged to standard
      places.
      
      Add a new flag MCP_QUEUE_LOG to machine_check_poll() to be used in early
      machine check initialization to indicate that any errors found should
      just be queued to genpool. When mcheck_late_init() is called it will
      call mce_schedule_work() to actually log and flush any errors queued in
      the genpool.
      
       [ Based on an original patch, commit message by and completely
         productized by Tony Luck. ]
      
      Fixes: 5de97c9f ("x86/mce: Factor out and deprecate the /dev/mcelog driver")
       Reported-by: Sumanth Kamatala <skamatala@juniper.net>
       Signed-off-by: Borislav Petkov <bp@suse.de>
       Signed-off-by: Tony Luck <tony.luck@intel.com>
       Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: https://lkml.kernel.org/r/20210824003129.GA1642753@agluck-desk2.amr.corp.intel.com
  19. Mar 18, 2021
      x86: Fix various typos in comments · d9f6e12f
      Ingo Molnar authored
      
      Fix ~144 single-word typos in arch/x86/ code comments.
      
      Doing this in a single commit should reduce the churn.
      
       Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: linux-kernel@vger.kernel.org