- Mar 26, 2024
-
-
Enable the Mali GPU in the RK3588 EVB1. This marks the GPU regulators as always-on, because the generic coupler regulator logic from the kernel can only handle them when they are marked as always-on. Technically it's okay to disable the regulators, when the GPU is not used. Considering the RK3588 EVB1 is not battery powered, the slightly increased power consumption for keeping the regulator always enabled is not a big deal. Thus it's better to enable GPU support than wait for a better solution. Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by:
Sebastian Reichel <sebastian.reichel@collabora.com>
-
Enable the Mali GPU in the Rock 5B. Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by:
Sebastian Reichel <sebastian.reichel@collabora.com>
-
- Mar 25, 2024
-
-
Add Mali GPU Node to the RK3588 SoC DT including GPU clock operating points Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by:
Sebastian Reichel <sebastian.reichel@collabora.com>
-
Sebastian Reichel authored
Enable support for Mali CSF-based GPUs, which is found on recent ARM SoCs, such as Rockchip or Mediatek. Signed-off-by:
Sebastian Reichel <sebastian.reichel@collabora.com>
-
Add an entry for the Panthor driver to the MAINTAINERS file. v6: - Add Maxime's and Heiko's acks v4: - Add Steve's R-b v3: - Add bindings document as an 'F:' line. - Add Steven and Liviu as co-maintainers. Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Steven Price <steven.price@arm.com> Acked-by:
Maxime Ripard <mripard@kernel.org> Acked-by:
Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240229162230.2634044-15-boris.brezillon@collabora.com Signed-off-by:
Sebastian Reichel <sre@kernel.org>
-
Arm has introduced a new v10 GPU architecture that replaces the Job Manager interface with a new Command Stream Frontend. It adds firmware driven command stream queues that can be used by kernel and user space to submit jobs to the GPU. Add the initial schema for the device tree that is based on support for RK3588 SoC. The minimum number of clocks is one for the IP, but on Rockchip platforms they will tend to expose the semi-independent clocks for better power management. v6: - Add Maxime's and Heiko's acks v5: - Move the opp-table node under the gpu node v4: - Fix formatting issue v3: - Cleanup commit message to remove redundant text - Added opp-table property and re-ordered entries - Clarified power-domains and power-domain-names requirements for RK3588. - Cleaned up example Note: power-domains and power-domain-names requirements for other platforms are still work in progress, hence the bindings are left incomplete here. v2: - New commit Signed-off-by:
Liviu Dudau <liviu.dudau@arm.com> Cc: Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: Conor Dooley <conor+dt@kernel.org> Cc: devicetree@vger.kernel.org Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Rob Herring <robh@kernel.org> Acked-by:
Maxime Ripard <mripard@kernel.org> Acked-by:
Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240229162230.2634044-14-boris.brezillon@collabora.com Signed-off-by:
Sebastian Reichel <sre@kernel.org>
-
Now that all blocks are available, we can add/update Kconfig/Makefile files to allow compilation. v6: - Add Maxime's and Heiko's acks - Keep source files alphabetically ordered in the Makefile v4: - Add Steve's R-b v3: - Add a dep on DRM_GPUVM - Fix dependencies in Kconfig - Expand help text to (hopefully) describe which GPUs are to be supported by this driver and which are for panfrost. Co-developed-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora Reviewed-by:
Steven Price <steven.price@arm.com> Acked-by:
Maxime Ripard <mripard@kernel.org> Acked-by:
Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240229162230.2634044-13-boris.brezillon@collabora.com Signed-off-by:
Sebastian Reichel <sre@kernel.org>
-
This is the last piece missing to expose the driver to the outside world. This is basically a wrapper between the ioctls and the other logical blocks. v6: - Add Maxime's and Heiko's acks - Return a page-aligned BO size to userspace - Keep header inclusion alphabetically ordered v5: - Account for the drm_exec_init() prototype change - Include platform_device.h v4: - Add an ioctl to let the UMD query the VM state - Fix kernel doc - Let panthor_device_init() call panthor_device_init() - Fix cleanup ordering in the panthor_init() error path - Add Steve's and Liviu's R-b v3: - Add acks for the MIT/GPL2 relicensing - Fix 32-bit support - Account for panthor_vm and panthor_sched changes - Simplify the resv preparation/update logic - Use a linked list rather than xarray for list of signals. - Simplify panthor_get_uobj_array by returning the newly allocated array. - Drop the "DOC" for job submission helpers and move the relevant comments to panthor_ioctl_group_submit(). - Add helpers sync_op_is_signal()/sync_op_is_wait(). - Simplify return type of panthor_submit_ctx_add_sync_signal() and panthor_submit_ctx_get_sync_signal(). - Drop WARN_ON from panthor_submit_ctx_add_job(). - Fix typos in comments. Co-developed-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora Reviewed-by:
Steven Price <steven.price@arm.com> Reviewed-by:
Liviu Dudau <liviu.dudau@arm.com> Acked-by:
Maxime Ripard <mripard@kernel.org> Acked-by:
Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240229162230.2634044-12-boris.brezillon@collabora.com Signed-off-by:
Sebastian Reichel <sre@kernel.org>
-
This is the piece of software interacting with the FW scheduler, and taking care of some scheduling aspects when the FW comes short of slots scheduling slots. Indeed, the FW only expose a few slots, and the kernel has to give all submission contexts, a chance to execute their jobs. The kernel-side scheduler is timeslice-based, with a round-robin queue per priority level. Job submission is handled with a 1:1 drm_sched_entity:drm_gpu_scheduler, allowing us to delegate the dependency tracking to the core. All the gory details should be documented inline. v6: - Add Maxime's and Heiko's acks - Make sure the scheduler is initialized before queueing the tick work in the MMU fault handler - Keep header inclusion alphabetically ordered v5: - Fix typos - Call panthor_kernel_bo_destroy(group->syncobjs) unconditionally - Don't move the group to the waiting list tail when it was already waiting for a different syncobj - Fix fatal_queues flagging in the tiler OOM path - Don't warn when more than one job timesout on a group - Add a warning message when we fail to allocate a heap chunk - Add Steve's R-b v4: - Check drmm_mutex_init() return code - s/drm_gem_vmap_unlocked/drm_gem_vunmap_unlocked/ in panthor_queue_put_syncwait_obj() - Drop unneeded WARN_ON() in cs_slot_sync_queue_state_locked() - Use atomic_xchg() instead of atomic_fetch_and(0) - Fix typos - Let panthor_kernel_bo_destroy() check for IS_ERR_OR_NULL() BOs - Defer TILER_OOM event handling to a separate workqueue to prevent deadlocks when the heap chunk allocation is blocked on mem-reclaim. This is just a temporary solution, until we add support for non-blocking/failable allocations - Pass the scheduler workqueue to drm_sched instead of instantiating a separate one (no longer needed now that heap chunk allocation happens on a dedicated wq) - Set WQ_MEM_RECLAIM on the scheduler workqueue, so we can handle job timeouts when the system is under mem pressure, and hopefully free up some memory retained by these jobs v3: - Rework the FW event handling logic to avoid races - Make sure MMU faults kill the group immediately - Use the panthor_kernel_bo abstraction for group/queue buffers - Make in_progress an atomic_t, so we can check it without the reset lock held - Don't limit the number of groups per context to the FW scheduler capacity. Fix the limit to 128 for now. - Add a panthor_job_vm() helper - Account for panthor_vm changes - Add our job fence as DMA_RESV_USAGE_WRITE to all external objects (was previously DMA_RESV_USAGE_BOOKKEEP). I don't get why, given we're supposed to be fully-explicit, but other drivers do that, so there must be a good reason - Account for drm_sched changes - Provide a panthor_queue_put_syncwait_obj() - Unconditionally return groups to their idle list in panthor_sched_suspend() - Condition of sched_queue_{,delayed_}work fixed to be only when a reset isn't pending or in progress. - Several typos in comments fixed. Co-developed-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Steven Price <steven.price@arm.com> Acked-by:
Maxime Ripard <mripard@kernel.org> Acked-by:
Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240229162230.2634044-11-boris.brezillon@collabora.com Signed-off-by:
Sebastian Reichel <sre@kernel.org>
-
Tiler heap growing requires some kernel driver involvement: when the tiler runs out of heap memory, it will raise an exception which is either directly handled by the firmware if some free heap chunks are available in the heap context, or passed back to the kernel otherwise. The heap helpers will be used by the scheduler logic to allocate more heap chunks to a heap context, when such a situation happens. Heap context creation is explicitly requested by userspace (using the TILER_HEAP_CREATE ioctl), and the returned context is attached to a queue through some command stream instruction. All the kernel does is keep the list of heap chunks allocated to a context, so they can be freed when TILER_HEAP_DESTROY is called, or extended when the FW requests a new chunk. v6: - Add Maxime's and Heiko's acks v5: - Fix FIXME comment - Add Steve's R-b v4: - Rework locking to allow concurrent calls to panthor_heap_grow() - Add a helper to return a heap chunk if we couldn't pass it to the FW because the group was scheduled out v3: - Add a FIXME for the heap OOM deadlock - Use the panthor_kernel_bo abstraction for the heap context and heap chunks - Drop the panthor_heap_gpu_ctx struct as it is opaque to the driver - Ensure that the heap context is aligned to the GPU cache line size - Minor code tidy ups Co-developed-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Steven Price <steven.price@arm.com> Acked-by:
Maxime Ripard <mripard@kernel.org> Acked-by:
Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240229162230.2634044-10-boris.brezillon@collabora.com Signed-off-by:
Sebastian Reichel <sre@kernel.org>
-
Contains everything that's FW related, that includes the code dealing with the microcontroller unit (MCU) that's running the FW, and anything related to allocating memory shared between the FW and the CPU. A few global FW events are processed in the IRQ handler, the rest is forwarded to the scheduler, since scheduling is the primary reason for the FW existence, and also the main source of FW <-> kernel interactions. v6: - Add Maxime's and Heiko's acks - Keep header inclusion alphabetically ordered v5: - Fix typo in GLB_PERFCNT_SAMPLE definition - Fix unbalanced panthor_vm_idle/active() calls - Fallback to a slow reset when the fast reset fails - Add extra information when reporting a FW boot failure v4: - Add a MODULE_FIRMWARE() entry for gen 10.8 - Fix a wrong return ERR_PTR() in panthor_fw_load_section_entry() - Fix typos - Add Steve's R-b v3: - Make the FW path more future-proof (Liviu) - Use one waitqueue for all FW events - Simplify propagation of FW events to the scheduler logic - Drop the panthor_fw_mem abstraction and use panthor_kernel_bo instead - Account for the panthor_vm changes - Replace magic number with 0x7fffffff with ~0 to better signify that it's the maximum permitted value. - More accurate rounding when computing the firmware timeout. - Add a 'sub iterator' helper function. This also adds a check that a firmware entry doesn't overflow the firmware image. - Drop __packed from FW structures, natural alignment is good enough. - Other minor code improvements. Co-developed-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Steven Price <steven.price@arm.com> Acked-by:
Maxime Ripard <mripard@kernel.org> Acked-by:
Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240229162230.2634044-9-boris.brezillon@collabora.com Signed-off-by:
Sebastian Reichel <sre@kernel.org>
-
MMU and VM management is related and placed in the same source file. Page table updates are delegated to the io-pgtable-arm driver that's in the iommu subsystem. The VM management logic is based on drm_gpuva_mgr, and is assuming the VA space is mostly managed by the usermode driver, except for a reserved portion of this VA-space that's used for kernel objects (like the heap contexts/chunks). Both asynchronous and synchronous VM operations are supported, and internal helpers are exposed to allow other logical blocks to map their buffers in the GPU VA space. There's one VM_BIND queue per-VM (meaning the Vulkan driver can only expose one sparse-binding queue), and this bind queue is managed with a 1:1 drm_sched_entity:drm_gpu_scheduler, such that each VM gets its own independent execution queue, avoiding VM operation serialization at the device level (things are still serialized at the VM level). The rest is just implementation details that are hopefully well explained in the documentation. v6: - Add Maxime's and Heiko's acks - Add Steve's R-b - Adjust the TRANSCFG value to account for SW VA space limitation on 32-bit systems - Keep header inclusion alphabetically ordered v5: - Fix a double panthor_vm_cleanup_op_ctx() call - Fix a race between panthor_vm_prepare_map_op_ctx() and panthor_vm_bo_put() - Fix panthor_vm_pool_destroy_vm() kernel doc - Fix paddr adjustment in panthor_vm_map_pages() - Fix bo_offset calculation in panthor_vm_get_bo_for_va() v4: - Add an helper to return the VM state - Check drmm_mutex_init() return code - Remove the VM from the AS reclaim list when panthor_vm_active() is called - Count the number of active VM users instead of considering there's at most one user (several scheduling groups can point to the same vM) - Pre-allocate a VMA object for unmap operations (unmaps can trigger a sm_step_remap() call) - Check vm->root_page_table instead of vm->pgtbl_ops to detect if the io-pgtable is trying to allocate the root page table - Don't memset() the va_node in panthor_vm_alloc_va(), make it a caller requirement - Fix the kernel doc in a few places - Drop the panthor_vm::base offset constraint and modify panthor_vm_put() to explicitly check for a NULL value - Fix unbalanced vm_bo refcount in panthor_gpuva_sm_step_remap() - Drop stale comments about the shared_bos list - Patch mmu_features::va_bits on 32-bit builds to reflect the io_pgtable limitation and let the UMD know about it v3: - Add acks for the MIT/GPL2 relicensing - Propagate MMU faults to the scheduler - Move pages pinning/unpinning out of the dma_signalling path - Fix 32-bit support - Rework the user/kernel VA range calculation - Make the auto-VA range explicit (auto-VA range doesn't cover the full kernel-VA range on the MCU VM) - Let callers of panthor_vm_alloc_va() allocate the drm_mm_node (embedded in panthor_kernel_bo now) - Adjust things to match the latest drm_gpuvm changes (extobj tracking, resv prep and more) - Drop the per-AS lock and use slots_lock (fixes a race on vm->as.id) - Set as.id to -1 when reusing an address space from the LRU list - Drop misleading comment about page faults - Remove check for irq being assigned in panthor_mmu_unplug() Co-developed-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora Reviewed-by:
Steven Price <steven.price@arm.com> Acked-by:
Maxime Ripard <mripard@kernel.org> Acked-by:
Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240229162230.2634044-8-boris.brezillon@collabora.com Signed-off-by:
Sebastian Reichel <sre@kernel.org>
-
Every thing related to devfreq in placed in panthor_devfreq.c, and helpers that can be called by other logical blocks are exposed through panthor_devfreq.h. This implementation is loosely based on the panfrost implementation, the only difference being that we don't count device users, because the idle/active state will be managed by the scheduler logic. v6: - Add Maxime's and Heiko's acks - Keep header inclusion alphabetically ordered v4: - Add Clément's A-b for the relicensing v3: - Add acks for the MIT/GPL2 relicensing v2: - Added in v2 Cc: Clément Péron <peron.clem@gmail.com> # MIT+GPL2 relicensing Reviewed-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora Acked-by: Clément Péron <peron.clem@gmail.com> # MIT+GPL2 relicensing Acked-by:
Maxime Ripard <mripard@kernel.org> Acked-by:
Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240229162230.2634044-7-boris.brezillon@collabora.com Signed-off-by:
Sebastian Reichel <sre@kernel.org>
-
Anything relating to GEM object management is placed here. Nothing particularly interesting here, given the implementation is based on drm_gem_shmem_object, which is doing most of the work. v6: - Add Maxime's and Heiko's acks - Return a page-aligned BO size to userspace when creating a BO - Keep header inclusion alphabetically ordered v5: - Add Liviu's and Steve's R-b v4: - Force kernel BOs to be GPU mapped - Make panthor_kernel_bo_destroy() robust against ERR/NULL BO pointers to simplify the call sites v3: - Add acks for the MIT/GPL2 relicensing - Provide a panthor_kernel_bo abstraction for buffer objects managed by the kernel (will replace panthor_fw_mem and be used everywhere we were using panthor_gem_create_and_map() before) - Adjust things to match drm_gpuvm changes - Change return of panthor_gem_create_with_handle() to int Co-developed-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora Reviewed-by:
Liviu Dudau <liviu.dudau@arm.com> Reviewed-by:
Steven Price <steven.price@arm.com> Acked-by:
Maxime Ripard <mripard@kernel.org> Acked-by:
Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240229162230.2634044-6-boris.brezillon@collabora.com Signed-off-by:
Sebastian Reichel <sre@kernel.org>
-
Handles everything that's not related to the FW, the MMU or the scheduler. This is the block dealing with the GPU property retrieval, the GPU block power on/off logic, and some global operations, like global cache flushing. v6: - Add Maxime's and Heiko's acks v5: - Fix GPU_MODEL() kernel doc - Fix test in panthor_gpu_block_power_off() - Add Steve's R-b v4: - Expose CORE_FEATURES through DEV_QUERY v3: - Add acks for the MIT/GPL2 relicensing - Use macros to extract GPU ID info - Make sure we reset clear pending_reqs bits when wait_event_timeout() times out but the corresponding bit is cleared in GPU_INT_RAWSTAT (can happen if the IRQ is masked or HW takes to long to call the IRQ handler) - GPU_MODEL now takes separate arch and product majors to be more readable. - Drop GPU_IRQ_MCU_STATUS_CHANGED from interrupt mask. - Handle GPU_IRQ_PROTM_FAULT correctly (don't output registers that are not updated for protected interrupts). - Minor code tidy ups Cc: Alexey Sheplyakov <asheplyakov@basealt.ru> # MIT+GPL2 relicensing Co-developed-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora Reviewed-by:
Steven Price <steven.price@arm.com> Acked-by:
Maxime Ripard <mripard@kernel.org> Acked-by:
Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240229162230.2634044-5-boris.brezillon@collabora.com Signed-off-by:
Sebastian Reichel <sre@kernel.org>
-
The panthor driver is designed in a modular way, where each logical block is dealing with a specific HW-block or software feature. In order for those blocks to communicate with each other, we need a central panthor_device collecting all the blocks, and exposing some common features, like interrupt handling, power management, reset, ... This what this panthor_device logical block is about. v6: - Add Maxime's and Heiko's acks - Keep header inclusion alphabetically ordered v5: - Suspend the MMU/GPU blocks if panthor_fw_resume() fails in panthor_device_resume() - Move the pm_runtime_use_autosuspend() call before drm_dev_register() - Add Liviu's R-b v4: - Check drmm_mutex_init() return code - Fix panthor_device_reset_work() out path - Fix the race in the unplug logic - Fix typos - Unplug blocks when something fails in panthor_device_init() - Add Steve's R-b v3: - Add acks for the MIT+GPL2 relicensing - Fix 32-bit support - Shorten the sections protected by panthor_device::pm::mmio_lock to fix lock ordering issues. - Rename panthor_device::pm::lock into panthor_device::pm::mmio_lock to better reflect what this lock is protecting - Use dev_err_probe() - Make sure we call drm_dev_exit() when something fails half-way in panthor_device_reset_work() - Replace CSF_GPU_LATEST_FLUSH_ID_DEFAULT with a constant '1' and a comment to explain. Also remove setting the dummy flush ID on suspend. - Remove drm_WARN_ON() in panthor_exception_name() - Check pirq->suspended in panthor_xxx_irq_raw_handler() Co-developed-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora Reviewed-by:
Steven Price <steven.price@arm.com> Reviewed-by:
Liviu Dudau <liviu.dudau@arm.com> Acked-by:
Maxime Ripard <mripard@kernel.org> Acked-by:
Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240229162230.2634044-4-boris.brezillon@collabora.com Signed-off-by:
Sebastian Reichel <sre@kernel.org>
-
Those are the registers directly accessible through the MMIO range. FW registers are exposed in panthor_fw.h. v6: - Add Maxime's and Heiko's acks v4: - Add the CORE_FEATURES register (needed for GPU variants) - Add Steve's R-b v3: - Add macros to extract GPU ID info - Formatting changes - Remove AS_TRANSCFG_ADRMODE_LEGACY - it doesn't exist post-CSF - Remove CSF_GPU_LATEST_FLUSH_ID_DEFAULT - Add GPU_L2_FEATURES_LINE_SIZE for extracting the GPU cache line size Co-developed-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora Reviewed-by:
Steven Price <steven.price@arm.com> Acked-by:
Maxime Ripard <mripard@kernel.org> Acked-by:
Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240229162230.2634044-3-boris.brezillon@collabora.com Signed-off-by:
Sebastian Reichel <sre@kernel.org>
-
Panthor follows the lead of other recently submitted drivers with ioctls allowing us to support modern Vulkan features, like sparse memory binding: - Pretty standard GEM management ioctls (BO_CREATE and BO_MMAP_OFFSET), with the 'exclusive-VM' bit to speed-up BO reservation on job submission - VM management ioctls (VM_CREATE, VM_DESTROY and VM_BIND). The VM_BIND ioctl is loosely based on the Xe model, and can handle both asynchronous and synchronous requests - GPU execution context creation/destruction, tiler heap context creation and job submission. Those ioctls reflect how the hardware/scheduler works and are thus driver specific. We also have a way to expose IO regions, such that the usermode driver can directly access specific/well-isolate registers, like the LATEST_FLUSH register used to implement cache-flush reduction. This uAPI intentionally keeps usermode queues out of the scope, which explains why doorbell registers and command stream ring-buffers are not directly exposed to userspace. v6: - Add Maxime's and Heiko's acks v5: - Fix typo - Add Liviu's R-b v4: - Add a VM_GET_STATE ioctl - Fix doc - Expose the CORE_FEATURES register so we can deal with variants in the UMD - Add Steve's R-b v3: - Add the concept of sync-only VM operation - Fix support for 32-bit userspace - Rework drm_panthor_vm_create to pass the user VA size instead of the kernel VA size (suggested by Robin Murphy) - Typo fixes - Explicitly cast enums with top bit set to avoid compiler warnings in -pedantic mode. - Drop property core_group_count as it can be easily calculated by the number of bits set in l2_present. Co-developed-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Steven Price <steven.price@arm.com> Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Steven Price <steven.price@arm.com> Reviewed-by:
Liviu Dudau <liviu.dudau@arm.com> Acked-by:
Maxime Ripard <mripard@kernel.org> Acked-by:
Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240229162230.2634044-2-boris.brezillon@collabora.com Signed-off-by:
Sebastian Reichel <sre@kernel.org>
-
Add CI support. This will do the following: 1. Run dt_binding_check to make sure no major flaws were introduced in the DT bindings 2. Run dtbs_check, for Rock 5A, Rock 5B and EVB1. If warnings are generated the CI will report that as warning 3. Build a Kernel .deb package 4. Generate a test job for LAVA and run it Co-developed-by:
Sebastian Reichel <sebastian.reichel@collabora.com> Co-developed-by:
Sjoerd Simons <sjoerd@collabora.com> Signed-off-by:
Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by:
Sjoerd Simons <sjoerd@collabora.com> Signed-off-by:
Christopher Obbard <chris.obbard@collabora.com>
-
- Mar 24, 2024
-
-
Heiko Stuebner authored
-
Heiko Stuebner authored
-
Andy Yan authored
According to the hardware design, the i2c address of audio codec es8316 on Cool Pi CM5 is 0x10. This fix the read/write error like bellow: es8316 7-0011: ASoC: error at soc_component_write_no_lock on es8316.7-0011 for register: [0x0000000c] -6 es8316 7-0011: ASoC: error at soc_component_write_no_lock on es8316.7-0011 for register: [0x00000003] -6 es8316 7-0011: ASoC: error at soc_component_read_no_lock on es8316.7-0011 for register: [0x00000016] -6 es8316 7-0011: ASoC: error at soc_component_read_no_lock on es8316.7-0011 for register: [0x00000016] -6 Fixes: 791c154c ("arm64: dts: rockchip: Add support for rk3588 based board Cool Pi CM5 EVB") Signed-off-by:
Andy Yan <andyshrk@163.com> Link: https://lore.kernel.org/r/20240324112833.2181961-1-andyshrk@163.com [also adapted the node name to audio-codec@10] Signed-off-by:
Heiko Stuebner <heiko@sntech.de>
-
Quentin Schulz authored
The PCIe PHY requires two regulators and are present on the SoM directly, while the PCIe connector also exposes 3V3 and 12V power rails which are available on the baseboard. Considering that 3/4 regulators are always-on on HW level and that the last one depends on a regulator from the PMIC that is specified as always on, this commit should be purely cosmetic and no change in behavior is expected. Let's add all regulators for PCIe on RK3399 Puma Haikou. Reviewed-by:
Dragan Simic <dsimic@manjaro.org> Signed-off-by:
Quentin Schulz <quentin.schulz@theobroma-systems.com> Link: https://lore.kernel.org/r/20240308-puma-diode-pu-v2-3-309f83da110a@theobroma-systems.com Signed-off-by:
Heiko Stuebner <heiko@sntech.de>
-
Quentin Schulz authored
The PCIE_WAKE# has a diode used as a level-shifter, and is used as an input pin. While the SoC default is to enable the pull-up, the core rk3399 pinconf for this pin opted for pull-none. So as to not disturb the behaviour of other boards which may rely on pull-none instead of pull-up, set the needed pull-up only for RK3399 Puma. Fixes: 60fd9f72 ("arm64: dts: rockchip: add Haikou baseboard with RK3399-Q7 SoM") Signed-off-by:
Quentin Schulz <quentin.schulz@theobroma-systems.com> Link: https://lore.kernel.org/r/20240308-puma-diode-pu-v2-2-309f83da110a@theobroma-systems.com Signed-off-by:
Heiko Stuebner <heiko@sntech.de>
-
Quentin Schulz authored
The Q7_USB_ID has a diode used as a level-shifter, and is used as an input pin. The SoC default for this pin is a pull-up, which is correct but the pinconf in the introducing commit missed that, so let's fix this oversight. Fixes: ed2c66a9 ("arm64: dts: rockchip: fix rk3399-puma-haikou USB OTG mode") Signed-off-by:
Quentin Schulz <quentin.schulz@theobroma-systems.com> Link: https://lore.kernel.org/r/20240308-puma-diode-pu-v2-1-309f83da110a@theobroma-systems.com Signed-off-by:
Heiko Stuebner <heiko@sntech.de>
-
Iskander Amara authored
Nodes overridden by their reference should be ordered alphabetically to make it easier to read the DTS. pinctrl node is defined in the wrong location so let's reorder it. Signed-off-by:
Iskander Amara <iskander.amara@theobroma-systems.com> Reviewed-by:
Quentin Schulz <quentin.schulz@theobroma-systems.com> Link: https://lore.kernel.org/r/20240308085243.69903-2-iskander.amara@theobroma-systems.com Signed-off-by:
Heiko Stuebner <heiko@sntech.de>
-
Iskander Amara authored
Q7_THRM# pin is connected to a diode on the module which is used as a level shifter, and the pin have a pull-down enabled by default. We need to configure it to internal pull-up, other- wise whenever the pin is configured as INPUT and we try to control it externally the value will always remain zero. Signed-off-by:
Iskander Amara <iskander.amara@theobroma-systems.com> Fixes: 2c66fc34 ("arm64: dts: rockchip: add RK3399-Q7 (Puma) SoM") Reviewed-by:
Quentin Schulz <quentin.schulz@theobroma-systems.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240308085243.69903-1-iskander.amara@theobroma-systems.com Signed-off-by:
Heiko Stuebner <heiko@sntech.de>
-
Dragan Simic authored
Add missing cache information to the Rockchip RK356x SoC dtsi, to allow the userspace, which includes lscpu(1) that uses the virtual files provided by the kernel under the /sys/devices/system/cpu directory, to display the proper RK3566 and RK3568 cache information. Adding the cache information to the RK356x SoC dtsi also makes the following warning message in the kernel log go away: cacheinfo: Unable to detect cache hierarchy for CPU 0 The cache parameters for the RK356x dtsi were obtained and partially derived by hand from the cache size and layout specifications found in the following datasheets and technical reference manuals: - Rockchip RK3566 datasheet, version 1.1 - Rockchip RK3568 datasheet, version 1.3 - ARM Cortex-A55 revision r1p0 TRM, version 0100-00 - ARM DynamIQ Shared Unit revision r4p0 TRM, version 0400-02 For future reference, here's a rather detailed summary of the documentation, which applies to both Rockchip RK3566 and RK3568 SoCs: - All caches employ the 64-byte cache line length - Each Cortex-A55 core has 32 KB of L1 4-way, set-associative instruction cache and 32 KB of L1 4-way, set-associative data cache - There are no L2 caches, which are per-core and private in Cortex-A55, because it belongs to the ARM DynamIQ IP core lineup - The entire SoC has 512 KB of unified L3 16-way, set-associative cache, which is shared among all four Cortex-A55 CPU cores - Cortex-A55 cores can be configured without private per-core L2 caches, in which case the shared L3 cache appears to them as an L2 cache; this is the case for the RK356x SoCs, so let's use "cache-level = <2>" to prevent the "huh, no L2 caches, but an L3 cache?" confusion among the users viewing the data presented to the userspace; another option could be to have additional 0 KB L2 caches defined, which may be technically correct, but would probably be even more confusing Helped-by:
Anand Moon <linux.amoon@gmail.com> Tested-By:
Diederik de Haas <didi.debian@cknow.org> Reviewed-by:
Anand Moon <linux.amoon@gmail.com> Signed-off-by:
Dragan Simic <dsimic@manjaro.org> Link: https://lore.kernel.org/r/2dee6dad8460b0c5f3b5da53cf55f735840efef1.1709957777.git.dsimic@manjaro.org Signed-off-by:
Heiko Stuebner <heiko@sntech.de>
-
Dragan Simic authored
Add missing cache information to the Rockchip RK3328 SoC dtsi, to allow the userspace, which includes lscpu(1) that uses the virtual files provided by the kernel under the /sys/devices/system/cpu directory, to display the proper RK3328 cache information. While there, use a more self-descriptive label for the L2 cache node, which also makes it more consistent with other SoC dtsi files. The cache parameters for the RK3328 dtsi were obtained and partially derived by hand from the cache size and layout specifications found in the following datasheets, official vendor websites, and technical reference manuals: - Rockchip RK3328 datasheet, version 1.4 - https://opensource.rock-chips.com/wiki_RK3328 , accessed on 2024-02-28 - ARM Cortex-A53 revision r0p3 TRM, version E For future reference, here's a brief summary of the documentation: - All caches employ the 64-byte cache line length - Each Cortex-A53 core has 32 KB of L1 2-way, set-associative instruction cache and 32 KB of L1 4-way, set-associative data cache - The entire SoC has 256 KB of unified L2 16-way, set-associative cache The RK3328 SoC dtsi is also used for the single RK3318-based supported board. Unfortunately, no datasheet is available for the RK3318, but some unofficial sources state that its L2 cache size is the same as RK3328's, so it's perhaps safe to assume the same for the L1 instruction and data cache sizes. Reviewed-by:
Anand Moon <linux.amoon@gmail.com> Signed-off-by:
Dragan Simic <dsimic@manjaro.org> Link: https://lore.kernel.org/r/a681b3c6dbf7b25b1527b11cea5ae0d6d1733714.1709958234.git.dsimic@manjaro.org Signed-off-by:
Heiko Stuebner <heiko@sntech.de>
-
Arınç ÜNAL authored
The MT7531 switch listens on PHY address 0x1f on an MDIO bus. I've got two findings that support this. There's no bootstrapping option to change the PHY address of the switch. The Linux driver hardcodes 0x1f as the PHY address of the switch. So the reg property on the device tree is currently ignored by the Linux driver. Therefore, describe the correct PHY address on Banana Pi BPI-R2 Pro that has this switch. Signed-off-by:
Arınç ÜNAL <arinc.unal@arinc9.com> Fixes: c1804463 ("arm64: dts: rockchip: Add mt7531 dsa node to BPI-R2-Pro board") Link: https://lore.kernel.org/r/20240314-for-rockchip-mt7531-phy-address-v1-1-743b5873358f@arinc9.com Signed-off-by:
Heiko Stuebner <heiko@sntech.de>
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efiLinus Torvalds authored
Pull EFI fixes from Ard Biesheuvel: - Fix logic that is supposed to prevent placement of the kernel image below LOAD_PHYSICAL_ADDR - Use the firmware stack in the EFI stub when running in mixed mode - Clear BSS only once when using mixed mode - Check efi.get_variable() function pointer for NULL before trying to call it * tag 'efi-fixes-for-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: fix panic in kdump kernel x86/efistub: Don't clear BSS twice in mixed mode x86/efistub: Call mixed mode boot services on the firmware's stack efi/libstub: fix efi_random_alloc() to allocate memory at alloc_min or higher address
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fixes from Thomas Gleixner: - Ensure that the encryption mask at boot is properly propagated on 5-level page tables, otherwise the PGD entry is incorrectly set to non-encrypted, which causes system crashes during boot. - Undo the deferred 5-level page table setup as it cannot work with memory encryption enabled. - Prevent inconsistent XFD state on CPU hotplug, where the MSR is reset to the default value but the cached variable is not, so subsequent comparisons might yield the wrong result and as a consequence the result prevents updating the MSR. - Register the local APIC address only once in the MPPARSE enumeration to prevent triggering the related WARN_ONs() in the APIC and topology code. - Handle the case where no APIC is found gracefully by registering a fake APIC in the topology code. That makes all related topology functions work correctly and does not affect the actual APIC driver code at all. - Don't evaluate logical IDs during early boot as the local APIC IDs are not yet enumerated and the invoked function returns an error code. Nothing requires the logical IDs before the final CPUID enumeration takes place, which happens after the enumeration. - Cure the fallout of the per CPU rework on UP which misplaced the copying of boot_cpu_data to per CPU data so that the final update to boot_cpu_data got lost which caused inconsistent state and boot crashes. - Use copy_from_kernel_nofault() in the kprobes setup as there is no guarantee that the address can be safely accessed. - Reorder struct members in struct saved_context to work around another kmemleak false positive - Remove the buggy code which tries to update the E820 kexec table for setup_data as that is never passed to the kexec kernel. - Update the resource control documentation to use the proper units. - Fix a Kconfig warning observed with tinyconfig * tag 'x86-urgent-2024-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot/64: Move 5-level paging global variable assignments back x86/boot/64: Apply encryption mask to 5-level pagetable update x86/cpu: Add model number for another Intel Arrow Lake mobile processor x86/fpu: Keep xfd_state in sync with MSR_IA32_XFD Documentation/x86: Document that resctrl bandwidth control units are MiB x86/mpparse: Register APIC address only once x86/topology: Handle the !APIC case gracefully x86/topology: Don't evaluate logical IDs during early boot x86/cpu: Ensure that CPU info updates are propagated on UP kprobes/x86: Use copy_from_kernel_nofault() to read from unsafe address x86/pm: Work around false positive kmemleak report in msr_build_context() x86/kexec: Do not update E820 kexec table for setup_data x86/config: Fix warning for 'make ARCH=x86_64 tinyconfig'
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull scheduler doc clarification from Thomas Gleixner: "A single update for the documentation of the base_slice_ns tunable to clarify that any value which is less than the tick slice has no effect because the scheduler tick is not guaranteed to happen within the set time slice" * tag 'sched-urgent-2024-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/doc: Update documentation for base_slice_ns and CONFIG_HZ relation
-
git://git.infradead.org/users/hch/dma-mappingLinus Torvalds authored
Pull dma-mapping fixes from Christoph Hellwig: "This has a set of swiotlb alignment fixes for sometimes very long standing bugs from Will. We've been discussion them for a while and they should be solid now" * tag 'dma-mapping-6.9-2024-03-24' of git://git.infradead.org/users/hch/dma-mapping: swiotlb: Reinstate page-alignment for mappings >= PAGE_SIZE iommu/dma: Force swiotlb_max_mapping_size on an untrusted device swiotlb: Fix alignment checks when both allocation and DMA masks are present swiotlb: Honour dma_alloc_coherent() alignment in swiotlb_alloc() swiotlb: Enforce page alignment in swiotlb_alloc() swiotlb: Fix double-allocation of slots due to broken alignment handling
-
Oleksandr Tymoshenko authored
Check if get_next_variable() is actually valid pointer before calling it. In kdump kernel this method is set to NULL that causes panic during the kexec-ed kernel boot. Tested with QEMU and OVMF firmware. Fixes: bad267f9 ("efi: verify that variable services are supported") Signed-off-by:
Oleksandr Tymoshenko <ovt@google.com> Signed-off-by:
Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Clearing BSS should only be done once, at the very beginning. efi_pe_entry() is the entrypoint from the firmware, which may not clear BSS and so it is done explicitly. However, efi_pe_entry() is also used as an entrypoint by the mixed mode startup code, in which case BSS will already have been cleared, and doing it again at this point will corrupt global variables holding the firmware's GDT/IDT and segment selectors. So make the memset() conditional on whether the EFI stub is running in native mode. Fixes: b3810c5a ("x86/efistub: Clear decompressor BSS in native EFI entrypoint") Signed-off-by:
Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Normally, the EFI stub calls into the EFI boot services using the stack that was live when the stub was entered. According to the UEFI spec, this stack needs to be at least 128k in size - this might seem large but all asynchronous processing and event handling in EFI runs from the same stack and so quite a lot of space may be used in practice. In mixed mode, the situation is a bit different: the bootloader calls the 32-bit EFI stub entry point, which calls the decompressor's 32-bit entry point, where the boot stack is set up, using a fixed allocation of 16k. This stack is still in use when the EFI stub is started in 64-bit mode, and so all calls back into the EFI firmware will be using the decompressor's limited boot stack. Due to the placement of the boot stack right after the boot heap, any stack overruns have gone unnoticed. However, commit 5c4feadb0011983b ("x86/decompressor: Move global symbol references to C code") moved the definition of the boot heap into C code, and now the boot stack is placed right at the base of BSS, where any overruns will corrupt the end of the .data section. While it would be possible to work around this by increasing the size of the boot stack, doing so would affect all x86 systems, and mixed mode systems are a tiny (and shrinking) fraction of the x86 installed base. So instead, record the firmware stack pointer value when entering from the 32-bit firmware, and switch to this stack every time a EFI boot service call is made. Cc: <stable@kernel.org> # v6.1+ Signed-off-by:
Ard Biesheuvel <ardb@kernel.org>
-
Tom Lendacky authored
Commit 63bed966 ("x86/startup_64: Defer assignment of 5-level paging global variables") moved assignment of 5-level global variables to later in the boot in order to avoid having to use RIP relative addressing in order to set them. However, when running with 5-level paging and SME active (mem_encrypt=on), the variables are needed as part of the page table setup needed to encrypt the kernel (using pgd_none(), p4d_offset(), etc.). Since the variables haven't been set, the page table manipulation is done as if 4-level paging is active, causing the system to crash on boot. While only a subset of the assignments that were moved need to be set early, move all of the assignments back into check_la57_support() so that these assignments aren't spread between two locations. Instead of just reverting the fix, this uses the new RIP_REL_REF() macro when assigning the variables. Fixes: 63bed966 ("x86/startup_64: Defer assignment of 5-level paging global variables") Signed-off-by:
Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Reviewed-by:
Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/2ca419f4d0de719926fd82353f6751f717590a86.1711122067.git.thomas.lendacky@amd.com
-
Tom Lendacky authored
When running with 5-level page tables, the kernel mapping PGD entry is updated to point to the P4D table. The assignment uses _PAGE_TABLE_NOENC, which, when SME is active (mem_encrypt=on), results in a page table entry without the encryption mask set, causing the system to crash on boot. Change the assignment to use _PAGE_TABLE instead of _PAGE_TABLE_NOENC so that the encryption mask is set for the PGD entry. Fixes: 533568e0 ("x86/boot/64: Use RIP_REL_REF() to access early_top_pgt[]") Signed-off-by:
Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Reviewed-by:
Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/8f20345cda7dbba2cf748b286e1bc00816fe649a.1711122067.git.thomas.lendacky@amd.com
-