Skip to content
Snippets Groups Projects
  1. Dec 19, 2018
    • Christoffer Dall's avatar
      KVM: arm/arm64: Fixup the kvm_exit tracepoint · 71a7e47f
      Christoffer Dall authored
      
      The kvm_exit tracepoint strangely always reported exits as being IRQs.
      This seems to be because either the __print_symbolic or the tracepoint
      macros use a variable named idx.
      
      Take this chance to update the fields in the tracepoint to reflect the
      concepts in the arm64 architecture that we pass to the tracepoint and
      move the exception type table to the same location and header files as
      the exits code.
      
      We also clear out the exception code to 0 for IRQ exits (which
      translates to UNKNOWN in text) to make it slighyly less confusing to
      parse the trace output.
      
      Signed-off-by: default avatarChristoffer Dall <christoffer.dall@arm.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      71a7e47f
    • Christoffer Dall's avatar
      KVM: arm/arm64: vgic: Consider priority and active state for pending irq · 9009782a
      Christoffer Dall authored
      
      When checking if there are any pending IRQs for the VM, consider the
      active state and priority of the IRQs as well.
      
      Otherwise we could be continuously scheduling a guest hypervisor without
      it seeing an IRQ.
      
      Signed-off-by: default avatarChristoffer Dall <christoffer.dall@arm.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      9009782a
    • Gustavo A. R. Silva's avatar
      KVM: arm/arm64: vgic: Fix off-by-one bug in vgic_get_irq() · c23b2e6f
      Gustavo A. R. Silva authored
      
      When using the nospec API, it should be taken into account that:
      
      "...if the CPU speculates past the bounds check then
       * array_index_nospec() will clamp the index within the range of [0,
       * size)."
      
      The above is part of the header for macro array_index_nospec() in
      linux/nospec.h
      
      Now, in this particular case, if intid evaluates to exactly VGIC_MAX_SPI
      or to exaclty VGIC_MAX_PRIVATE, the array_index_nospec() macro ends up
      returning VGIC_MAX_SPI - 1 or VGIC_MAX_PRIVATE - 1 respectively, instead
      of VGIC_MAX_SPI or VGIC_MAX_PRIVATE, which, based on the original logic:
      
      	/* SGIs and PPIs */
      	if (intid <= VGIC_MAX_PRIVATE)
       		return &vcpu->arch.vgic_cpu.private_irqs[intid];
      
       	/* SPIs */
      	if (intid <= VGIC_MAX_SPI)
       		return &kvm->arch.vgic.spis[intid - VGIC_NR_PRIVATE_IRQS];
      
      are valid values for intid.
      
      Fix this by calling array_index_nospec() macro with VGIC_MAX_PRIVATE + 1
      and VGIC_MAX_SPI + 1 as arguments for its parameter size.
      
      Fixes: 41b87599 ("KVM: arm/arm64: vgic: fix possible spectre-v1 in vgic_get_irq()")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGustavo A. R. Silva <gustavo@embeddedor.com>
      [dropped the SPI part which was fixed separately]
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      c23b2e6f
  2. Dec 18, 2018
  3. Oct 26, 2018
  4. Oct 18, 2018
  5. Oct 17, 2018
    • Mark Rutland's avatar
      KVM: arm64: Fix caching of host MDCR_EL2 value · da5a3ce6
      Mark Rutland authored
      
      At boot time, KVM stashes the host MDCR_EL2 value, but only does this
      when the kernel is not running in hyp mode (i.e. is non-VHE). In these
      cases, the stashed value of MDCR_EL2.HPMN happens to be zero, which can
      lead to CONSTRAINED UNPREDICTABLE behaviour.
      
      Since we use this value to derive the MDCR_EL2 value when switching
      to/from a guest, after a guest have been run, the performance counters
      do not behave as expected. This has been observed to result in accesses
      via PMXEVTYPER_EL0 and PMXEVCNTR_EL0 not affecting the relevant
      counters, resulting in events not being counted. In these cases, only
      the fixed-purpose cycle counter appears to work as expected.
      
      Fix this by always stashing the host MDCR_EL2 value, regardless of VHE.
      
      Cc: Christopher Dall <christoffer.dall@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: stable@vger.kernel.org
      Fixes: 1e947bad ("arm64: KVM: Skip HYP setup when already running in HYP")
      Tested-by: default avatarRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      da5a3ce6
  6. Oct 16, 2018
  7. Oct 03, 2018
  8. Oct 01, 2018
  9. Sep 27, 2018
  10. Sep 18, 2018
  11. Sep 07, 2018
  12. Aug 22, 2018
    • Michal Hocko's avatar
      mm, oom: distinguish blockable mode for mmu notifiers · 93065ac7
      Michal Hocko authored
      There are several blockable mmu notifiers which might sleep in
      mmu_notifier_invalidate_range_start and that is a problem for the
      oom_reaper because it needs to guarantee a forward progress so it cannot
      depend on any sleepable locks.
      
      Currently we simply back off and mark an oom victim with blockable mmu
      notifiers as done after a short sleep.  That can result in selecting a new
      oom victim prematurely because the previous one still hasn't torn its
      memory down yet.
      
      We can do much better though.  Even if mmu notifiers use sleepable locks
      there is no reason to automatically assume those locks are held.  Moreover
      majority of notifiers only care about a portion of the address space and
      there is absolutely zero reason to fail when we are unmapping an unrelated
      range.  Many notifiers do really block and wait for HW which is harder to
      handle and we have to bail out though.
      
      This patch handles the low hanging fruit.
      __mmu_notifier_invalidate_range_start gets a blockable flag and callbacks
      are not allowed to sleep if the flag is set to false.  This is achieved by
      using trylock instead of the sleepable lock for most callbacks and
      continue as long as we do not block down the call chain.
      
      I think we can improve that even further because there is a common pattern
      to do a range lookup first and then do something about that.  The first
      part can be done without a sleeping lock in most cases AFAICS.
      
      The oom_reaper end then simply retries if there is at least one notifier
      which couldn't make any progress in !blockable mode.  A retry loop is
      already implemented to wait for the mmap_sem and this is basically the
      same thing.
      
      The simplest way for driver developers to test this code path is to wrap
      userspace code which uses these notifiers into a memcg and set the hard
      limit to hit the oom.  This can be done e.g.  after the test faults in all
      the mmu notifier managed memory and set the hard limit to something really
      small.  Then we are looking for a proper process tear down.
      
      [akpm@linux-foundation.org: coding style fixes]
      [akpm@linux-foundation.org: minor code simplification]
      Link: http://lkml.kernel.org/r/20180716115058.5559-1-mhocko@kernel.org
      
      
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: Christian König <christian.koenig@amd.com> # AMD notifiers
      Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx and umem_odp
      Reported-by: default avatarDavid Rientjes <rientjes@google.com>
      Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Cc: Sudeep Dutt <sudeep.dutt@intel.com>
      Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Felix Kuehling <felix.kuehling@amd.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      93065ac7
Loading