- Dec 06, 2017
-
-
Christian König authored
The block size only affects the leaf nodes, everything else is fixed. Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Christian König authored
The block size only affects the leaf nodes, everything else is fixed. Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Christian König authored
It's pointless to have the same value twice, just always use max_pfn. Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- Dec 04, 2017
-
-
Monk Liu authored
This member will be used later. It will point to the real variable inside the context, so that CS_SUBMIT and the GPU scheduler can decide whether to skip a job depending on context->guilty or *entity->guilty. Signed-off-by:
Monk Liu <Monk.Liu@amd.com> Reviewed-by:
Chunming Zhou <David1.Zhou@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- Nov 08, 2017
-
-
Dan Carpenter authored
After commit ea09729c ("drm/amdgpu: rework page directory filling v2") it becomes a lot harder to verify that "r" is initialized. My static checker complains and so I've reviewed the code. It does look like it might be buggy... Anyway, it doesn't hurt to set "r" to zero at the start. Reviewed-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
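The pattern behind this fix can be sketched in isolation. This is a minimal illustration, not the driver code: `update_all` and `process_item` are hypothetical names, and the loop stands in for any code path that may run zero times before `r` is returned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-item step: fails with -1 on negative input. */
static int process_item(int item)
{
    return item < 0 ? -1 : 0;
}

/*
 * Returns 0 on success or the first error code.
 * Initializing r matters when count == 0: the loop body never runs,
 * so without "= 0" the function would return an indeterminate value.
 */
int update_all(const int *items, size_t count)
{
    int r = 0;
    size_t i;

    assert(items != NULL || count == 0);

    for (i = 0; i < count; i++) {
        r = process_item(items[i]);
        if (r)
            break;
    }
    return r;
}
```

Setting the return variable at its declaration costs nothing and makes the zero-iteration path easy to verify by inspection, which is the point the commit message makes.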
-
- Oct 19, 2017
-
-
Christian König authored
Otherwise somebody could try to evict it at the same time and try to use half torn down structures. Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-and-Tested-by:
Michel Dänzer <michel.daenzer@amd.com> Reviewed-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- Oct 09, 2017
-
-
Andres Rodriguez authored
Introduce a flag to signal that access to a BO will be synchronized through an external mechanism. Currently all buffers shared between contexts are subject to implicit synchronization. However, this is only required for protocols that currently don't support an explicit synchronization mechanism (DRI2/3). This patch introduces the AMDGPU_GEM_CREATE_EXPLICIT_SYNC flag, so that users can specify when it is safe to disable implicit sync. v2: only disable explicit sync in amdgpu_cs_ioctl Reviewed-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Andres Rodriguez <andresx7@gmail.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Christian König authored
Convert GTT mappings into linear ones for huge page handling. v2: use fragment size as minimum for linear conversion Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Yong Zhao authored
Without the additional bits set in PDEs/PTEs, the ATC memory access would have failed on Raven. Signed-off-by:
Yong Zhao <yong.zhao@amd.com> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Reviewed-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- Oct 06, 2017
-
-
Christian König authored
Fix two minor 80 char issues. Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- Sep 28, 2017
-
-
Felix Kuehling authored
When many wavefronts cause VM faults at the same time, it can overwhelm the interrupt handler and cause IH ring overflows before the driver can notify or kill the faulting application. As a workaround I'm introducing limited per-VM fault credit. After that number of VM faults have occurred, further VM faults are filtered out at the prescreen stage of processing. This depends on the PASID in the interrupt packet, so it currently only works for KFD contexts. Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
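The credit mechanism described above can be sketched as a simple counter. This is an assumption-laden illustration: `struct vm_fault_state` and `vm_fault_credit_take` are hypothetical names, and the sketch is single-threaded, whereas a real interrupt path would need an atomic counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-VM state: how many more faults may be processed. */
struct vm_fault_state {
    unsigned int fault_credit;
};

/*
 * Prescreen helper: returns true if the fault should be processed,
 * false if the VM has used up its credit and the fault is filtered out.
 */
bool vm_fault_credit_take(struct vm_fault_state *vm)
{
    assert(vm != NULL);

    if (vm->fault_credit == 0)
        return false;   /* credit exhausted: drop the fault early */

    vm->fault_credit--;
    return true;
}
```

With a credit of 2, the first two calls return true and every later call returns false, so a storm of faults from one VM stops generating work after a bounded number of interrupts.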
-
- Sep 26, 2017
-
-
Yong Zhao authored
Use it to replace the hard coded value in amdgpu_vm_bo_update_mapping(). Signed-off-by:
Yong Zhao <yong.zhao@amd.com> Reviewed-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Yong Zhao authored
When max_bytes is not 8-byte aligned and the BO size is larger than max_bytes, the last 8 bytes in a ttm node may be left unchanged. For example, on pre-SDMA 4.0 hardware, where max_bytes = 0x1fffff, the problem occurs with a BO size of 0x200000. To fix the problem, we separately store the maximum number of PTEs/PDEs a single operation can set in the amdgpu_vm_pte_funcs structure, rather than inferring it from the byte limit of the SDMA constant fill, i.e. fill_max_bytes. Together with the fix, we replace the hard-coded value "10" in amdgpu_vm_bo_update_mapping() with the corresponding values from the amdgpu_vm_pte_funcs structure. Signed-off-by:
Yong Zhao <yong.zhao@amd.com> Reviewed-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
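The arithmetic in the example above can be checked with a small model. This is a sketch under a stated assumption, not the driver's actual code path: it assumes each operation can only write whole 8-byte entries, so a byte limit that is not a multiple of 8 loses the remainder. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRY_SIZE 8u  /* one PTE/PDE occupies 8 bytes */

/*
 * Byte-based chunking (assumed model of the old behaviour): each
 * operation covers up to max_bytes, but only whole entries inside
 * the chunk are actually written.
 */
uint64_t entries_written_by_bytes(uint64_t total_bytes, uint64_t max_bytes)
{
    uint64_t written = 0;

    assert(max_bytes > 0);

    while (total_bytes > 0) {
        uint64_t chunk = total_bytes < max_bytes ? total_bytes : max_bytes;

        written += chunk / ENTRY_SIZE;  /* partial entries are lost */
        total_bytes -= chunk;
    }
    return written;
}

/*
 * Entry-based chunking (the approach the commit describes): the limit
 * is expressed in entries, so no partial entry can occur.
 */
uint64_t entries_written_by_count(uint64_t total_entries, uint64_t max_entries)
{
    uint64_t written = 0;

    assert(max_entries > 0);

    while (total_entries > 0) {
        uint64_t chunk = total_entries < max_entries ? total_entries
                                                     : max_entries;

        written += chunk;
        total_entries -= chunk;
    }
    return written;
}
```

For a 0x200000-byte BO (0x40000 entries) and max_bytes = 0x1fffff, the byte-based model writes 0x3ffff entries, one short, which matches the "last 8 bytes left unchanged" symptom. Chunking the same 0x40000 entries by an entry limit of 0x3ffff writes all of them.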
-
Felix Kuehling authored
IH tracks pending retry faults in a hash table for fast lookup in interrupt context. Each VM has a short FIFO of pending VM faults for processing in a bottom half. The IH prescreening stage adds retry faults and filters out repeated retry interrupts to minimize the impact of interrupt storms. It's the VM's responsibility to remove pending faults once they are handled. For now this is only done when the VM is destroyed. v2: - Made the hash table smaller and the FIFO longer. I never want the FIFO to fill up, because that would make prescreen take longer. 128 pending page faults should be enough to keep migrations busy. Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> (v1) Reviewed-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
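The prescreen-and-FIFO flow described above can be sketched roughly as follows. Several things here are assumptions: all names are hypothetical, the fault is reduced to a single 64-bit key, and a linear scan of the FIFO stands in for the hash table lookup the commit actually uses. Only the FIFO depth of 128 comes from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAULT_FIFO_SIZE 128u  /* depth mentioned in the commit message */

/* Hypothetical per-VM ring buffer of pending fault keys. */
struct fault_fifo {
    uint64_t keys[FAULT_FIFO_SIZE];
    unsigned int head;   /* index of the oldest pending fault */
    unsigned int count;  /* number of pending faults */
};

/* Linear scan; a stand-in for the hash table used for fast lookup. */
static bool fault_fifo_contains(const struct fault_fifo *f, uint64_t key)
{
    unsigned int i;

    for (i = 0; i < f->count; i++)
        if (f->keys[(f->head + i) % FAULT_FIFO_SIZE] == key)
            return true;
    return false;
}

/*
 * Prescreen: returns true if this is a new fault that was queued for
 * the bottom half, false if it is a repeat or the FIFO is full.
 */
bool fault_prescreen(struct fault_fifo *f, uint64_t key)
{
    assert(f != NULL);

    if (fault_fifo_contains(f, key))
        return false;
    if (f->count == FAULT_FIFO_SIZE)
        return false;

    f->keys[(f->head + f->count) % FAULT_FIFO_SIZE] = key;
    f->count++;
    return true;
}

/* VM side: take the oldest pending fault off the FIFO once handled. */
bool fault_fifo_pop(struct fault_fifo *f, uint64_t *key)
{
    assert(f != NULL && key != NULL);

    if (f->count == 0)
        return false;

    *key = f->keys[f->head];
    f->head = (f->head + 1) % FAULT_FIFO_SIZE;
    f->count--;
    return true;
}
```

A repeated key is rejected while it is still pending and accepted again after it has been popped, which mirrors the rule that the VM must remove faults once they are handled.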
-
Felix Kuehling authored
Allows assigning a PASID to a VM for identifying VMs involved in page faults. The global PASID manager is also exported in the KFD interface so that AMDGPU and KFD can share the PASID space. PASIDs of different sizes can be requested. On APUs, the PASID size is determined by the capabilities of the IOMMU. So KFD must be able to allocate PASIDs in a smaller range. Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Reviewed-by:
Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Felix Kuehling authored
Make sure vm->root.bo is not left reserved if amdgpu_bo_kmap fails. Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- Sep 13, 2017
-
-
Christian König authored
There is no guarantee that the last BO_VA actually needed an update. In addition, all command submissions must wait for moved BOs to be cleared, not just the first one. v2: Don't overwrite any newer fence. Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Chunming Zhou <david1.zhou@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- Sep 12, 2017
-
-
Christian König authored
All users of a VM must always wait for updates with always valid BOs to be completed. v2: remove debugging leftovers, rename struct member Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Roger He <Hongbo.He@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Christian König authored
Use the VM instead of the BO list to find the BO for a virtual address. This fixes UVD/VCE in physical mode with VM local BOs. Signed-off-by:
Christian König <christian.koenig@amd.com> Acked-by:
Leo Liu <leo.liu@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Bas Nieuwenhuizen authored
When amdgpu_vm_frag_ptes calls amdgpu_vm_update_ptes and the PT has a shadow PT, we mirror all the writes to the shadow PT too, which results in twice the commands. Signed-off-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- Sep 09, 2017
-
-
Davidlohr Bueso authored
Allow interval trees to quickly check for overlaps to avoid unnecessary tree lookups in interval_tree_iter_first(). As of this patch, all interval tree flavors will require using a 'rb_root_cached' such that we can have the leftmost node easily available. While most users will make use of this feature, those with special functions (in addition to the generic insert, delete, search calls) will avoid using the cached option as they can do funky things with insertions -- for example, vma_interval_tree_insert_after(). [jglisse@redhat.com: fix deadlock from typo vm_lock_anon_vma()] Link: http://lkml.kernel.org/r/20170808225719.20723-1-jglisse@redhat.com Link: http://lkml.kernel.org/r/20170719014603.19029-12-dave@stgolabs.net Signed-off-by:
Davidlohr Bueso <dbueso@suse.de> Signed-off-by:
Jérôme Glisse <jglisse@redhat.com> Acked-by:
Christian König <christian.koenig@amd.com> Acked-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by:
Doug Ledford <dledford@redhat.com> Acked-by:
Michael S. Tsirkin <mst@redhat.com> Cc: David Airlie <airlied@linux.ie> Cc: Jason Wang <jasowang@redhat.com> Cc: Christian Benvenuti <benve@cisco.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
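The quick overlap check made possible by caching the leftmost node can be sketched without a full tree. This is an illustration only: `struct itree_summary` and `itree_may_overlap` are hypothetical names, and the two cached bounds stand in for what a real tree would read from its leftmost node and from the root's subtree maximum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of an interval tree's extent. */
struct itree_summary {
    bool empty;
    uint64_t leftmost_start;  /* start of the cached leftmost node */
    uint64_t max_last;        /* largest end point stored in the tree */
};

/*
 * Returns false when the query [start, last] cannot overlap anything
 * in the tree, so the caller can skip the tree walk entirely.
 * A true result only means a walk is needed, not that a match exists.
 */
bool itree_may_overlap(const struct itree_summary *t,
                       uint64_t start, uint64_t last)
{
    assert(start <= last);

    if (t->empty)
        return false;
    if (t->max_last < start)
        return false;  /* query lies entirely to the right */
    if (t->leftmost_start > last)
        return false;  /* query lies entirely to the left */
    return true;
}
```

Both comparisons are constant time, which is why having the leftmost node at hand pays off for queries that fall outside the tree's range.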
-
- Sep 01, 2017
-
-
Christian König authored
Only move BOs to the moved/relocated list when they aren't already on a list. This prevents accidental removal from the evicted list. Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Roger He authored
This can improve performance for some cases. v2 (chk): handle all sizes, simplify the patch quite a bit v3 (chk): adjust dw estimation as well v4 (chk): use single loop, make end mask 64bit Signed-off-by:
Roger He <Hongbo.He@amd.com> Signed-off-by:
Christian König <christian.koenig@amd.com> Tested-by:
Roger He <Hongbo.He@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Chunming Zhou <david1.zhou@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- Aug 31, 2017
-
-
Christian König authored
Per VM BOs are handled like VM PDs and PTs. They are always valid and don't need to be specified in the BO lists. v2: validate PDs/PTs first Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Christian König authored
We need to refer to the parent instead of the root BO for multi level page tables on Vega10. Also don't set the PDE_PTE bit. v2: Don't set the PDE_PTE bit either. Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-and-Tested-by:
Roger He <Hongbo.He@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Christian König authored
The src isn't used any more after GART hack removal. Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Chunming Zhou <david1.zhou@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Christian König authored
Keep track of relocated PDs/PTs instead of walking and checking all PDs. v2: fix root PD handling Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1) Reviewed-by:
Chunming Zhou <david1.zhou@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- Aug 29, 2017
-
-
Christian König authored
Instead of validating all page tables when one was evicted, track which one needs a validation. v2: simplify amdgpu_vm_ready as well Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1) Reviewed-by:
Chunming Zhou <david1.zhou@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Christian König authored
We changed this to use an extra list a while back, but for the next series I need a separate flag again. v2: reorder to avoid unlocked list access Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Chunming Zhou <david1.zhou@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Christian König authored
Instead of using the vm_state use a separate flag to note that the BO was moved. v2: reorder patches to avoid temporary lockless access Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Chunming Zhou <david1.zhou@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Christian König authored
Stop checking the mapped BO itself, because that one is certainly not a page table. In addition, move the code into amdgpu_vm.c. Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Chunming Zhou <david1.zhou@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Christian König authored
That somehow got lost. Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Chunming Zhou <david1.zhou@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Christian König authored
This isn't used since we don't map evicted BOs to GART any more. Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Alex Deucher <alexander.deucher@amd.com> Reviewed-by:
Roger He <Hongbo.He@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Christian König authored
Set the shadow flag on the shadow and not the parent, always bind shadow BOs during allocation instead of manually, use the reservation_object wrappers to grab the lock. This fixes a couple of issues with binding the shadow BOs as well as correctly evicting them when memory becomes tight. Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Chunming Zhou <david1.zhou@amd.com> Reviewed-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Felix Kuehling authored
Correctly detect system memory mappings when using CPU and don't use huge pages for them. Avoid incorrectly translating a physical page table GPU address when splitting a huge page while mapping system memory. Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- Aug 24, 2017
-
-
Christian König authored
This isn't used since we don't map evicted BOs to GART any more. Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Alex Deucher <alexander.deucher@amd.com> Reviewed-by:
Roger He <Hongbo.He@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Christian König authored
Set the shadow flag on the shadow and not the parent, always bind shadow BOs during allocation instead of manually, use the reservation_object wrappers to grab the lock. This fixes a couple of issues with binding the shadow BOs as well as correctly evicting them when memory becomes tight. Signed-off-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Chunming Zhou <david1.zhou@amd.com> Reviewed-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- Aug 23, 2017
-
-
Felix Kuehling authored
Correctly detect system memory mappings when using CPU and don't use huge pages for them. Avoid incorrectly translating a physical page table GPU address when splitting a huge page while mapping system memory. Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- Aug 17, 2017
-
-
Roger He authored
Allow overrides on the command line. v2: agd: squash in spelling fix and bogus default value warning Reviewed-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Roger He <Hongbo.He@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Roger He authored
Add fragment_size to the vm_manager structure and implement hardware setup for it. Reviewed-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Roger He <Hongbo.He@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-