1. 16 Mar, 2020 18 commits
  2. 12 Feb, 2020 1 commit
  3. 05 Feb, 2020 2 commits
  4. 27 Jan, 2020 12 commits
  5. 23 Jan, 2020 1 commit
    • KVM: x86: fix overlap between SPTE_MMIO_MASK and generation · 56871d44
      Paolo Bonzini authored
      The SPTE_MMIO_MASK overlaps with the bits used to track MMIO
      generation number.  A high enough generation number would overwrite the
      SPTE_SPECIAL_MASK region and cause the MMIO SPTE to be misinterpreted.
      
      Likewise, setting bits 52 and 53 would also cause an incorrect generation
      number to be read from the PTE, though this was partially mitigated by the
      (useless if it weren't for the bug) removal of SPTE_SPECIAL_MASK from
      the spte in get_mmio_spte_generation.  Drop that removal, and replace
      it with a compile-time assertion.
      
      Fixes: 6eeb4ef0 ("KVM: x86: assign two bits to track SPTE kinds")
      Reported-by: Ben Gardon <bgardon@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
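
      The fix above boils down to a compile-time guarantee that the MMIO
      generation bits and SPTE_SPECIAL_MASK can never overlap.  Below is a
      minimal standalone C sketch of that technique; the bit positions and
      macro names are illustrative stand-ins, not KVM's actual SPTE layout.

        /*
         * Minimal userspace sketch of the compile-time check: the mask values
         * below are illustrative, NOT KVM's actual SPTE layout; only the
         * technique (refuse to build if the generation field and the special
         * bits can overlap) mirrors the fix.
         */
        #include <stdint.h>
        #include <stdio.h>

        #define GENMASK_ULL(h, l)   (((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

        /* Hypothetical layout: two "special" bits and a split generation field. */
        #define SPTE_SPECIAL_MASK   GENMASK_ULL(62, 61)
        #define MMIO_GEN_LOW_MASK   GENMASK_ULL(10, 3)
        #define MMIO_GEN_HIGH_MASK  GENMASK_ULL(60, 54)

        _Static_assert(!((MMIO_GEN_LOW_MASK | MMIO_GEN_HIGH_MASK) & SPTE_SPECIAL_MASK),
                       "MMIO generation bits overlap SPTE_SPECIAL_MASK");

        static uint64_t get_generation(uint64_t spte)
        {
            uint64_t gen;

            /* No need to strip SPTE_SPECIAL_MASK first: the assertion above
             * guarantees the masks are disjoint. */
            gen  = (spte & MMIO_GEN_LOW_MASK) >> 3;
            gen |= ((spte & MMIO_GEN_HIGH_MASK) >> 54) << 8;
            return gen;
        }

        int main(void)
        {
            uint64_t spte = MMIO_GEN_LOW_MASK | SPTE_SPECIAL_MASK;

            /* The special bits do not leak into the generation: prints 255. */
            printf("generation = %llu\n", (unsigned long long)get_generation(spte));
            return 0;
        }
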
  6. 21 Jan, 2020 2 commits
    • KVM: x86/mmu: Apply max PA check for MMIO sptes to 32-bit KVM · e30a7d62
      Sean Christopherson authored
      Remove the bogus 64-bit only condition from the check that disables MMIO
      spte optimization when the system supports the max PA, i.e. doesn't have
      any reserved PA bits.  32-bit KVM always uses PAE paging for the shadow
      MMU, and per Intel's SDM:
      
        PAE paging translates 32-bit linear addresses to 52-bit physical
        addresses.
      
      The kernel's restrictions on max physical addresses are limits on how
      much memory the kernel can reasonably use, not what physical addresses
      are supported by hardware.
      
      Fixes: ce88decf ("KVM: MMU: mmio page fault support")
      Cc: stable@vger.kernel.org
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
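
      A standalone sketch of the reasoning above: whether the MMIO-spte trick
      is usable depends only on whether the CPU leaves any physical address
      bits reserved (MAXPHYADDR < 52), not on whether the host kernel is
      64-bit.  The helper name and mask arithmetic below are hypothetical;
      only the decision mirrors the commit.

        /*
         * Illustrative sketch only: kvm_set_mmio_mask() and the mask arithmetic
         * are hypothetical, not KVM's code.  The point is that the decision
         * depends on MAXPHYADDR alone, never on the host being 64-bit.
         */
        #include <stdint.h>
        #include <stdio.h>

        #define PTE_MAX_PHYS_BITS 52   /* PAE/EPT PTEs address up to 52 bits */

        /*
         * Return a mask with a PA bit the hardware treats as reserved, or 0 when
         * MAXPHYADDR is already 52 and no bit is left to hide MMIO sptes in.
         */
        static uint64_t kvm_set_mmio_mask(unsigned int maxphyaddr)
        {
            /*
             * The pre-fix logic effectively read
             *     if (host_is_64bit && maxphyaddr == 52) disable;
             * wrongly leaving the optimization enabled on 32-bit hosts whose
             * CPUs report MAXPHYADDR == 52.
             */
            if (maxphyaddr == PTE_MAX_PHYS_BITS)
                return 0;

            return 1ULL << maxphyaddr;
        }

        int main(void)
        {
            printf("maxphyaddr 46 -> mask %#llx\n",
                   (unsigned long long)kvm_set_mmio_mask(46));
            printf("maxphyaddr 52 -> mask %#llx\n",
                   (unsigned long long)kvm_set_mmio_mask(52));
            return 0;
        }
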
    • KVM: x86/mmu: Micro-optimize nEPT's bad memtype/XWR checks · b5c3c1b3
      Sean Christopherson authored
      Rework the handling of nEPT's bad memtype/XWR checks to micro-optimize
      the checks as much as possible.  Move the check to a separate helper,
      __is_bad_mt_xwr(), which allows the guest_rsvd_check usage in
      paging_tmpl.h to omit the check entirely for paging32/64 (bad_mt_xwr is
      always zero for non-nEPT) while retaining the bitwise-OR of the current
      code for the shadow_zero_check in walk_shadow_page_get_mmio_spte().
      
      Add a comment for the bitwise-OR usage in the mmio spte walk to avoid
      future attempts to "fix" the code, which is what prompted this
      optimization in the first place[*].
      
      Opportunistically remove the superfluous '!= 0' and parentheses, and
      use BIT_ULL() instead of open coding its equivalent.
      
      The net effect is that code generation is largely unchanged for
      walk_shadow_page_get_mmio_spte(), marginally better for
      ept_prefetch_invalid_gpte(), and significantly improved for
      paging32/64_prefetch_invalid_gpte().
      
      Note, walk_shadow_page_get_mmio_spte() can't use a templated version of
      the memtype/XWR check as it works on the host's shadow PTEs, e.g. checks
      that KVM hasn't borked its EPT tables.  Even if it could be templated,
      the benefits of having a single implementation far outweigh the few uops
      that would be saved for NPT or non-TDP paging, e.g. most compilers
      inline it all the way up to kvm_mmu_page_fault().
      
      [*] https://lkml.kernel.org/r/20200108001859.25254-1-sean.j.christopherson@intel.com

      Cc: Jim Mattson <jmattson@google.com>
      Cc: David Laight <David.Laight@ACULAB.COM>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
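
      The split described above can be modeled in a few lines of standalone C.
      The helper names echo the commit message (__is_bad_mt_xwr(), BIT_ULL()),
      but the structure below is a simplified userspace sketch, not the kernel
      source; note the deliberate bitwise-OR in the combined check.

        /*
         * Standalone model of the split; the helper names echo the commit
         * message but this is a simplified userspace sketch, not kernel code.
         */
        #include <stdbool.h>
        #include <stdint.h>
        #include <stdio.h>

        #define BIT_ULL(n) (1ULL << (n))

        struct rsvd_bits_validate {
            uint64_t rsvd_bits_mask;   /* simplified: one mask, not per-level */
            uint64_t bad_mt_xwr;       /* bitmap indexed by PTE bits 5:0, 0 unless nEPT */
        };

        /* Separate helper: non-EPT paths can skip it since bad_mt_xwr is 0. */
        static bool __is_bad_mt_xwr(const struct rsvd_bits_validate *rsvd_check,
                                    uint64_t pte)
        {
            return rsvd_check->bad_mt_xwr & BIT_ULL(pte & 0x3f);
        }

        static bool __is_rsvd_bits_set(const struct rsvd_bits_validate *rsvd_check,
                                       uint64_t pte)
        {
            return pte & rsvd_check->rsvd_bits_mask;
        }

        /*
         * Bitwise-OR on purpose: both operands are branch-free, so '|' avoids
         * the conditional branch a logical '||' would imply.  This is the form
         * kept for the shadow_zero_check in the MMIO spte walk.
         */
        static bool is_rsvd_spte(const struct rsvd_bits_validate *rsvd_check,
                                 uint64_t pte)
        {
            return __is_bad_mt_xwr(rsvd_check, pte) |
                   __is_rsvd_bits_set(rsvd_check, pte);
        }

        int main(void)
        {
            struct rsvd_bits_validate ept = {
                .rsvd_bits_mask = BIT_ULL(51),
                .bad_mt_xwr     = BIT_ULL(0x02),  /* flag "write-only" XWR = 010b */
            };

            printf("write-only EPT pte bad? %d\n", is_rsvd_spte(&ept, 0x02));
            printf("read/write/exec pte bad? %d\n", is_rsvd_spte(&ept, 0x07));
            return 0;
        }
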
  7. 08 Jan, 2020 4 commits
    • KVM: x86/mmu: WARN if root_hpa is invalid when handling a page fault · 6948199a
      Sean Christopherson authored
      WARN if root_hpa is invalid when handling a page fault.  The check on
      root_hpa exists for historical reasons that no longer apply to the
      current KVM code base.
      
      Remove an equivalent debug-only warning in direct_page_fault(), whose
      existence more or less confirms that root_hpa should always be valid
      when handling a page fault.
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/mmu: WARN on an invalid root_hpa · 0c7a98e3
      Sean Christopherson authored
      WARN on the existing invalid root_hpa checks in __direct_map() and
      FNAME(fetch).  The "legitimate" path that invalidated root_hpa in the
      middle of a page fault is long since gone, i.e. it should no longer be
      possible to invalidate root_hpa in the middle of a page fault[*].
      
      The root_hpa checks were added by two related commits
      
        989c6b34 ("KVM: MMU: handle invalid root_hpa at __direct_map")
        37f6a4e2 ("KVM: x86: handle invalid root_hpa everywhere")
      
      to fix a bug where nested_vmx_vmexit() could be called *in the middle*
      of a page fault.  At the time, vmx_interrupt_allowed(), which was and
      still is used by kvm_can_do_async_pf() via ->interrupt_allowed(),
      directly invoked nested_vmx_vmexit() to switch from L2 to L1 to emulate
      a VM-Exit on a pending interrupt.  Emulating the nested VM-Exit resulted
      in root_hpa being invalidated by kvm_mmu_reset_context() without
      explicitly terminating the page fault.
      
      Now that root_hpa is checked for validity by kvm_mmu_page_fault(), WARN
      on an invalid root_hpa to detect any flows that reset the MMU while
      handling a page fault.  The broken vmx_interrupt_allowed() behavior has
      long since been fixed and resetting the MMU during a page fault should
      not be considered legal behavior.
      
      [*] It's actually technically possible in FNAME(page_fault)() because it
          calls inject_page_fault() when the guest translation is invalid, but
          in that case the page fault handling is immediately terminated.
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/mmu: Move root_hpa validity checks to top of page fault handler · ddce6208
      Sean Christopherson authored
      Add a check on root_hpa at the beginning of the page fault handler to
      consolidate several checks on root_hpa that are scattered throughout the
      page fault code.  This is a preparatory step towards eventually removing
      such checks altogether, or at the very least WARNing if an invalid root
      is encountered.  Remove only the checks that can be easily audited to
      confirm that root_hpa cannot be invalidated between their current
      location and the new check in kvm_mmu_page_fault(), and aren't currently
      protected by mmu_lock, i.e. keep the checks in __direct_map() and
      FNAME(fetch) for the time being.
      
      The root_hpa checks that are consolidated were all added by commit
      
        37f6a4e2 ("KVM: x86: handle invalid root_hpa everywhere")
      
      which was a follow-up to a bug fix for __direct_map(), commit

        989c6b34 ("KVM: MMU: handle invalid root_hpa at __direct_map")
      
      At the time, nested VMX had, in hindsight, crazy handling of nested
      interrupts and would trigger a nested VM-Exit in ->interrupt_allowed(),
      and thus unexpectedly reset the MMU in flows such as can_do_async_pf().
      
      Now that the wonky nested VM-Exit behavior is gone, the root_hpa checks
      are bogus and confusing, e.g. it's not at all obvious what they actually
      protect against, and at first glance they appear to be broken since many
      of them run without holding mmu_lock.
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
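
      The consolidated check described here (and later hardened into a WARN by
      the commits listed above) has a simple shape: validate root_hpa once at
      the fault handler's entry point instead of in every downstream helper.
      The sketch below models that shape in userspace; VALID_PAGE(), RET_PF_*
      and the struct layout are stand-ins for the kernel's own definitions.

        /*
         * Userspace model of the consolidated guard.  VALID_PAGE(), RET_PF_*
         * and the struct layout are stand-ins for the kernel's definitions.
         */
        #include <stdint.h>
        #include <stdio.h>

        #define INVALID_PAGE   (~0ULL)
        #define VALID_PAGE(x)  ((x) != INVALID_PAGE)
        #define RET_PF_RETRY   0
        #define RET_PF_EMULATE 1

        struct kvm_mmu {
            uint64_t root_hpa;
        };

        /* Stand-in for the real fault paths (__direct_map()/FNAME(fetch)),
         * which may now assume a valid root. */
        static int handle_fault(struct kvm_mmu *mmu, uint64_t gpa)
        {
            (void)mmu;
            (void)gpa;
            return RET_PF_EMULATE;
        }

        static int kvm_mmu_page_fault(struct kvm_mmu *mmu, uint64_t gpa)
        {
            /* Single up-front check; warning here flags any flow that resets
             * the MMU while a fault is in flight. */
            if (!VALID_PAGE(mmu->root_hpa)) {
                fprintf(stderr, "WARN: page fault with invalid root_hpa\n");
                return RET_PF_RETRY;
            }

            return handle_fault(mmu, gpa);
        }

        int main(void)
        {
            struct kvm_mmu mmu = { .root_hpa = INVALID_PAGE };

            printf("ret = %d\n", kvm_mmu_page_fault(&mmu, 0x1000));  /* retries */
            mmu.root_hpa = 0x5000;
            printf("ret = %d\n", kvm_mmu_page_fault(&mmu, 0x1000));  /* emulates */
            return 0;
        }
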
    • KVM: x86/mmu: Move calls to thp_adjust() down a level · 4cd071d1
      Sean Christopherson authored
      Move the calls to thp_adjust() down a level from the page fault handlers
      to the map/fetch helpers and remove the page count shuffling done in
      thp_adjust().
      
      Despite holding a reference to the underlying page while processing a
      page fault, the page fault flows don't actually rely on holding a
      reference to the page when thp_adjust() is called.  At that point, the
      fault handlers hold mmu_lock, which prevents mmu_notifier from completing
      any invalidations, and have verified no invalidations from mmu_notifier
      have occurred since the page reference was acquired (which is done prior
      to taking mmu_lock).
      
      The kvm_release_pfn_clean()/kvm_get_pfn() dance in thp_adjust() is a
      quirk that is necessitated because thp_adjust() modifies the pfn that is
      consumed by its caller.  Because the page fault handlers call
      kvm_release_pfn_clean() on said pfn, thp_adjust() needs to transfer the
      reference to the correct pfn purely for correctness when the pfn is
      released.
      
      Calling thp_adjust() from __direct_map() and FNAME(fetch) means the pfn
      adjustment doesn't change the pfn as seen by the page fault handlers,
      i.e. the pfn released by the page fault handlers is the same pfn that
      was returned by gfn_to_pfn().
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
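
      The get/put "dance" removed by this commit is easiest to see side by
      side.  The sketch below models it in userspace with stand-in helpers
      (the real code uses kvm_get_pfn()/kvm_release_pfn_clean() on page
      refcounts): the old shape rewrites the caller's pfn and therefore must
      transfer the reference, while the new shape adjusts a local copy.

        /*
         * Userspace model of the removed refcount shuffle; get_pfn()/release_pfn()
         * are stand-ins for kvm_get_pfn()/kvm_release_pfn_clean().
         */
        #include <stdint.h>
        #include <stdio.h>

        typedef uint64_t kvm_pfn_t;

        static void get_pfn(kvm_pfn_t pfn)     { printf("get %#llx\n", (unsigned long long)pfn); }
        static void release_pfn(kvm_pfn_t pfn) { printf("put %#llx\n", (unsigned long long)pfn); }

        #define HUGEPAGE_MASK ((kvm_pfn_t)0x1ff)   /* 2M page: 512 base pages */

        /* Old shape: the adjusted pfn escapes back to the caller, so the
         * reference has to be transferred to it before the caller's release. */
        static void thp_adjust_old(kvm_pfn_t *pfnp)
        {
            kvm_pfn_t pfn = *pfnp;

            release_pfn(pfn);          /* drop the ref on the original pfn   */
            pfn &= ~HUGEPAGE_MASK;     /* align to the huge-page base        */
            get_pfn(pfn);              /* re-take it on the adjusted pfn     */
            *pfnp = pfn;
        }

        /* New shape: the adjustment stays local to the map/fetch helper, so
         * the caller's pfn (and its reference) never change. */
        static kvm_pfn_t thp_adjust_new(kvm_pfn_t pfn)
        {
            return pfn & ~HUGEPAGE_MASK;
        }

        int main(void)
        {
            kvm_pfn_t pfn;

            /* Old: the caller ends up releasing the *adjusted* pfn. */
            pfn = 0x12345;             /* as returned by gfn_to_pfn() */
            get_pfn(pfn);
            thp_adjust_old(&pfn);
            release_pfn(pfn);

            /* New: the caller releases exactly the pfn it got. */
            pfn = 0x12345;
            get_pfn(pfn);
            printf("map %#llx\n", (unsigned long long)thp_adjust_new(pfn));
            release_pfn(pfn);
            return 0;
        }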