    KVM: VMX: Stop context switching MSR_IA32_UMWAIT_CONTROL · bf09fb6c
    Sean Christopherson authored
    Remove support for context switching between the guest's and host's
    desired UMWAIT_CONTROL.  Propagating the guest's value to hardware isn't
    required for correct functionality, e.g. KVM intercepts reads and writes
    to the MSR, and the latency effects of the settings controlled by the
    MSR are not architecturally visible.
    
    As a general rule, KVM should not allow the guest to control power
    management settings unless explicitly enabled by userspace, e.g. see
    KVM_CAP_X86_DISABLE_EXITS.  E.g. Intel's SDM explicitly states that C0.2
    can improve the performance of SMT siblings.  A devious guest could
    disable C0.2 so as to improve the performance of its workloads to the
    detriment of workloads running in the host or on other VMs.
    
    Wholesale removal of UMWAIT_CONTROL context switching also fixes a race
    condition where updates from the host may cause KVM to enter the guest
    with the incorrect value.  Because updates are propagated to all
    CPUs via IPI (SMP function callback), the value in hardware may be
    stale with respect to the cached value and KVM could enter the guest
    with the wrong value in hardware.  As above, the guest can't observe the
    bad value, but it's a weird and confusing wart in the implementation.
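
The race described above can be illustrated with a toy, userspace-only model. This is a sketch, not kernel code: every name in it (toy_cpu, host_update_cache, ipi_write_hw, toy_vm_enter) is hypothetical, and it assumes the removed logic decided whether to switch the MSR by comparing the guest's value against the host's cached value rather than against the value actually in hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; all names are hypothetical, not KVM symbols. */
struct toy_cpu {
	uint32_t hw_umwait_control;	/* value actually loaded on this CPU */
};

static uint32_t host_cached_value;	/* host's desired value, updated first */

/* Host update, step 1: the cached copy changes immediately. */
static void host_update_cache(uint32_t val)
{
	host_cached_value = val;
}

/* Host update, step 2: the IPI handler writes "hardware" some time later. */
static void ipi_write_hw(struct toy_cpu *cpu)
{
	cpu->hw_umwait_control = host_cached_value;
}

/*
 * Assumed shape of the removed logic: switch on entry only if the guest's
 * value differs from the host's cached value.  Returns the value the guest
 * would actually run with.
 */
static uint32_t toy_vm_enter(const struct toy_cpu *cpu, uint32_t guest_val)
{
	bool need_switch = (guest_val != host_cached_value);

	return need_switch ? guest_val : cpu->hw_umwait_control;
}

static void toy_race_demo(void)
{
	struct toy_cpu cpu = { .hw_umwait_control = 0x100 };

	host_cached_value = 0x100;
	host_update_cache(0x200);	/* cache updated, IPI not yet handled */

	/* Guest wants 0x200, cache says 0x200, so no switch: stale 0x100. */
	assert(toy_vm_enter(&cpu, 0x200) == 0x100);

	ipi_write_hw(&cpu);		/* IPI finally lands */
	assert(toy_vm_enter(&cpu, 0x200) == 0x200);
}
```

In this model the window between host_update_cache() and ipi_write_hw() is where the guest is entered with a value neither side asked for; dropping the context switch removes the comparison and with it the window.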
    
    Removal also fixes the unnecessary usage of VMX's atomic load/store MSR
    lists.  Using the lists is only necessary for MSRs that are required for
    correct functionality immediately upon VM-Enter/VM-Exit, e.g. EFER on
    old hardware, or for MSRs that need to-the-uop precision, e.g. perf
    related MSRs.  For UMWAIT_CONTROL, the effects are only visible in the
    kernel via TPAUSE/delay(), and KVM doesn't do any form of delay in
    vmx_vcpu_run().  Using the atomic lists is undesirable as they are more
    expensive than direct RDMSR/WRMSR.
    
    Furthermore, even if giving the guest control of the MSR is legitimate,
    e.g. in pass-through scenarios, it's not clear that the benefits would
    outweigh the overhead.  E.g. saving and restoring an MSR across a VMX
    roundtrip costs ~250 cycles, and if the guest diverged from the host
    that cost would be paid on every run of the guest.  In other words, if
    there is a legitimate use case then it should be enabled by a new
    per-VM capability.
    
    Note, KVM still needs to emulate MSR_IA32_UMWAIT_CONTROL so that it can
    correctly expose other WAITPKG features to the guest, e.g. TPAUSE,
    UMWAIT and UMONITOR.
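
As a rough illustration of what "emulate" means once the context switch is gone, the sketch below keeps the guest's value purely in per-vCPU software state and never writes it to hardware. The names (toy_vcpu, toy_wrmsr_umwait_control, toy_rdmsr_umwait_control) are hypothetical, and the reserved-bit mask (bit 1 and bits 63:32) is an assumption about the MSR layout that is not stated in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed reserved bits: bit 1 and the upper 32 bits must be zero. */
#define TOY_UMWAIT_RESERVED ((1ULL << 1) | 0xFFFFFFFF00000000ULL)

/* Hypothetical per-vCPU state; only a software copy of the MSR. */
struct toy_vcpu {
	uint32_t msr_ia32_umwait_control;
};

/* Intercepted guest write: validate, then store.  Returns 1 to signal #GP. */
static int toy_wrmsr_umwait_control(struct toy_vcpu *vcpu, uint64_t data)
{
	if (data & TOY_UMWAIT_RESERVED)
		return 1;

	vcpu->msr_ia32_umwait_control = (uint32_t)data;
	return 0;
}

/* Intercepted guest read: return whatever the guest last wrote. */
static uint64_t toy_rdmsr_umwait_control(const struct toy_vcpu *vcpu)
{
	return vcpu->msr_ia32_umwait_control;
}

static void toy_emulation_demo(void)
{
	struct toy_vcpu vcpu = { 0 };

	assert(toy_wrmsr_umwait_control(&vcpu, 0x1) == 0);
	assert(toy_rdmsr_umwait_control(&vcpu) == 0x1);

	/* Writes that set reserved bits fail and leave the value untouched. */
	assert(toy_wrmsr_umwait_control(&vcpu, 0x2) == 1);
	assert(toy_wrmsr_umwait_control(&vcpu, 1ULL << 32) == 1);
	assert(toy_rdmsr_umwait_control(&vcpu) == 0x1);
}
```

Because reads and writes are intercepted, the guest sees a consistent value of its own choosing while the host's setting stays in hardware.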
    
    Fixes: 6e3ba4ab ("KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL")
    Cc: stable@vger.kernel.org
    Cc: Jingqi Liu <jingqi.liu@intel.com>
    Cc: Tao Xu <tao3.xu@intel.com>
    Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
    Message-Id: <20200623005135.10414-1-sean.j.christopherson@intel.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>