1. 15 Jun, 2020 1 commit
    • kvm/svm: disable KCSAN for svm_vcpu_run() · b95273f1
      Qian Cai authored

      For some reason, running a simple qemu-kvm command with KCSAN enabled
      resets AMD hosts. It turns out that svm_vcpu_run() cannot be instrumented.
      Disable KCSAN for it for now.
      
       # /usr/libexec/qemu-kvm -name ubuntu-18.04-server-cloudimg -cpu host
      	-smp 2 -m 2G -hda ubuntu-18.04-server-cloudimg.qcow2
      
      === console output ===
      Kernel 5.6.0-next-20200408+ on an x86_64
      
      hp-dl385g10-05 login:
      
      <...host reset...>
      
      HPE ProLiant System BIOS A40 v1.20 (03/09/2018)
      (C) Copyright 1982-2018 Hewlett Packard Enterprise Development LP
      Early system initialization, please wait...

      Signed-off-by: Qian Cai <cai@lca.pw>
      Message-Id: <20200415153709.1559-1-cai@lca.pw>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  2. 11 Jun, 2020 1 commit
  3. 08 Jun, 2020 1 commit
    • KVM: SVM: fix calls to is_intercept · fb7333df
      Paolo Bonzini authored

      is_intercept takes an INTERCEPT_* constant, not SVM_EXIT_*; because
      of this, the compiler was removing the body of the conditionals,
      as if is_intercept returned 0.
      
      This unveils a latent bug: when clearing the VINTR intercept,
      int_ctl must also be changed in the L1 VMCB (svm->nested.hsave),
      just like the intercept itself is also changed in the L1 VMCB.
      Otherwise V_IRQ remains set and, due to the VINTR intercept being clear,
      we get a spurious injection of a vector 0 interrupt on the next
      L2->L1 vmexit.

      Reported-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
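
The mix-up described in this commit can be illustrated with a minimal, self-contained sketch. It is not the kernel's code: the struct, the `_sketch` helper names, and the explicit range check are assumptions made for illustration, and the two constant values are assumed to match the kernel headers rather than copied from them. The point is only that an intercept test indexes a bit in a 64-bit vector, so an exit code such as SVM_EXIT_VINTR can never select a valid bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values; assumed to mirror the kernel headers, not copied from them. */
enum {
    INTERCEPT_VINTR = 4,      /* bit index into the 64-bit intercept vector */
    SVM_EXIT_VINTR  = 0x064,  /* exit code reported on a VINTR vmexit */
};

/* Hypothetical stand-in for the relevant part of the VMCB control area. */
struct vmcb_control_sketch {
    uint64_t intercept;
};

/*
 * Test one intercept bit. An index of 64 or more cannot name a bit in a
 * 64-bit vector, so this sketch reports "not set" instead of shifting out
 * of range, which would be undefined behavior in C.
 */
static bool is_intercept_sketch(const struct vmcb_control_sketch *c,
                                unsigned int bit)
{
    if (bit >= 64)
        return false;
    return (c->intercept & (UINT64_C(1) << bit)) != 0;
}

/* Usage: the intercept is set, but only the INTERCEPT_* constant finds it. */
static void is_intercept_example(void)
{
    struct vmcb_control_sketch c = {
        .intercept = UINT64_C(1) << INTERCEPT_VINTR,
    };

    assert(is_intercept_sketch(&c, INTERCEPT_VINTR));  /* correct constant */
    assert(!is_intercept_sketch(&c, SVM_EXIT_VINTR));  /* exit code: never matches */
}
```

Without a range check, shifting by an out-of-range constant is undefined, which is consistent with the compiler dropping the conditional bodies as the commit message reports.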
  4. 01 Jun, 2020 12 commits
  5. 28 May, 2020 4 commits
    • KVM: SVM: always update CR3 in VMCB · 978ce583
      Paolo Bonzini authored

      svm_load_mmu_pgd is delaying the write of GUEST_CR3 to prepare_vmcs02 as
      an optimization, but this is only correct before the nested vmentry.
      If userspace is modifying CR3 with KVM_SET_SREGS after the VM has
      already been put in guest mode, the value of CR3 will not be updated.
      Remove the optimization, which almost never triggers anyway.
      This was added in commit 689f3bf2 ("KVM: x86: unify callbacks
      to load paging root", 2020-03-16) just to keep the two vendor-specific
      modules closer, but we'll fix VMX too.
      
      Fixes: 689f3bf2 ("KVM: x86: unify callbacks to load paging root")
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nSVM: remove exit_required · bd279629
      Paolo Bonzini authored

      All events now inject vmexits before vmentry rather than after vmexit.  Therefore,
      exit_required is not set anymore and we can remove it.

      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nSVM: inject exceptions via svm_check_nested_events · 7c86663b
      Paolo Bonzini authored

      This allows exceptions injected by the emulator to be properly delivered
      as vmexits.  The code also becomes simpler, because we can just let all
      L0-intercepted exceptions go through the usual path.  In particular, our
      emulation of the VMX #DB exit qualification is very much simplified,
      because the vmexit injection path can use kvm_deliver_exception_payload
      to update DR6.

      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: enable event window in inject_pending_event · c9d40913
      Paolo Bonzini authored

      In case an interrupt arrives after nested.check_events but before the
      call to kvm_cpu_has_injectable_intr, we could end up enabling the interrupt
      window even if the interrupt is actually going to be a vmexit.  This is
      useless rather than harmful, but it really complicates reasoning about
      SVM's handling of the VINTR intercept.  We'd like to never bother with
      the VINTR intercept if V_INTR_MASKING=1 && INTERCEPT_INTR=1, because in
      that case there is no interrupt window and we can just exit the nested
      guest whenever we want.
      
      This patch moves the opening of the interrupt window inside
      inject_pending_event.  This consolidates the check for pending
      interrupt/NMI/SMI in one place, and makes KVM's usage of immediate
      exits more consistent, extending it beyond just nested virtualization.
      
      There are two functional changes here.  They only affect corner cases,
      but overall they simplify inject_pending_event.
      
      - re-injection of still-pending events will also use req_immediate_exit
      instead of using interrupt-window intercepts.  This should have no impact
      on performance on Intel since it simply replaces an interrupt-window
      or NMI-window exit with a preemption-timer exit.  On AMD, which has no
      equivalent of the preemption timer, it may incur some overhead but an
      actual effect on performance should only be visible in pathological cases.
      
      - kvm_arch_interrupt_allowed and kvm_vcpu_has_events will return true
      if an interrupt, NMI or SMI is blocked by nested_run_pending.  This
      makes sense because entering the VM will allow it to make progress
      and deliver the event.

      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  6. 27 May, 2020 2 commits
  7. 15 May, 2020 4 commits
  8. 13 May, 2020 13 commits
  9. 08 May, 2020 2 commits
    • KVM: x86, SVM: isolate vcpu->arch.dr6 from vmcb->save.dr6 · d67668e9
      Paolo Bonzini authored

      There are two issues with KVM_EXIT_DEBUG on AMD, whose root cause is the
      different handling of DR6 on intercepted #DB exceptions on Intel and AMD.
      
      On Intel, #DB exceptions transmit the DR6 value via the exit qualification
      field of the VMCS, and the exit qualification only contains the description
      of the precise event that caused a vmexit.
      
      On AMD, instead the DR6 field of the VMCB is filled in as if the #DB exception
      was to be injected into the guest.  This has two effects when guest debugging
      is in use:
      
      * the guest DR6 is clobbered
      
      * the kvm_run->debug.arch.dr6 field can accumulate more debug events, rather
      than just the last one that happened (the testcase in the next patch covers
      this issue).
      
      This patch fixes both issues by emulating, so to speak, the Intel behavior
      on AMD processors.  The important observation is that (after the previous
      patches) the VMCB value of DR6 is only ever observable from the guest if
      KVM_DEBUGREG_WONT_EXIT is set.  Therefore we can actually set vmcb->save.dr6
      to any value we want as long as KVM_DEBUGREG_WONT_EXIT is clear, which it
      will be if guest debugging is enabled.
      
      Therefore it is possible to enter the guest with an all-zero DR6,
      reconstruct the #DB payload from the DR6 we get at exit time, and let
      kvm_deliver_exception_payload move the newly set bits into vcpu->arch.dr6.
      Some extra bits may be included in the payload if KVM_DEBUGREG_WONT_EXIT
      is set, but this is harmless.
      
      This may not be the most optimized way to deal with this, but it is
      simple and, being confined within SVM code, it gets rid of the set_dr6
      callback and kvm_update_dr6.

      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
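
The approach in this commit (enter the guest with an all-zero DR6, read the #DB payload back at exit, fold it into vcpu->arch.dr6) can be modeled with a small sketch. Everything here is hypothetical: the struct, the `_sketch` helpers, and the assumption that hardware ORs new event bits into the VMCB's DR6 are a simplified model for illustration, not KVM's implementation. In particular, the real kvm_deliver_exception_payload does more than a plain OR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-vCPU state; field names only echo the commit message. */
struct vcpu_sketch {
    uint64_t arch_dr6;  /* software view, analogous to vcpu->arch.dr6 */
    uint64_t vmcb_dr6;  /* hardware view, analogous to vmcb->save.dr6 */
};

/* Enter the guest with an all-zero DR6 so stale bits cannot accumulate. */
static void enter_guest_sketch(struct vcpu_sketch *v)
{
    v->vmcb_dr6 = 0;
}

/* Model of the hardware: a #DB ORs its event bits into the VMCB's DR6. */
static void hardware_db_sketch(struct vcpu_sketch *v, uint64_t event_bits)
{
    v->vmcb_dr6 |= event_bits;
}

/* At exit, the VMCB value holds only this exit's events: the payload. */
static uint64_t db_exit_payload_sketch(const struct vcpu_sketch *v)
{
    return v->vmcb_dr6;
}

/* Simplified stand-in for payload delivery: fold new bits into arch_dr6. */
static void deliver_payload_sketch(struct vcpu_sketch *v, uint64_t payload)
{
    v->arch_dr6 |= payload;
}

/* Usage: two separate breakpoint hits yield two separate payloads. */
static void dr6_example(void)
{
    struct vcpu_sketch v = { 0, 0 };

    enter_guest_sketch(&v);
    hardware_db_sketch(&v, 0x1);               /* first event: bit 0 */
    uint64_t first = db_exit_payload_sketch(&v);
    deliver_payload_sketch(&v, first);

    enter_guest_sketch(&v);
    hardware_db_sketch(&v, 0x2);               /* second event: bit 1 */
    uint64_t second = db_exit_payload_sketch(&v);
    deliver_payload_sketch(&v, second);

    assert(first == 0x1);
    assert(second == 0x2);      /* only the last event, not 0x3 */
    assert(v.arch_dr6 == 0x3);  /* software view holds both in this model */
}
```

If `enter_guest_sketch` did not zero the hardware copy, the second payload in this model would be 0x3, which corresponds to the accumulation problem the commit message describes.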
    • KVM: SVM: keep DR6 synchronized with vcpu->arch.dr6 · 5679b803
      Paolo Bonzini authored

      kvm_x86_ops.set_dr6 is only ever called with vcpu->arch.dr6 as the
      second argument.  Ensure that the VMCB value is synchronized to
      vcpu->arch.dr6 on #DB (both "normal" and nested) and nested vmentry, so
      that the current value of DR6 is always available in vcpu->arch.dr6.
      The get_dr6 callback can just access vcpu->arch.dr6 and becomes redundant.

      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>