panthor_sched.c
Steven Price authored

If a queue is already assigned to the hardware, then a newly submitted job can start straight away without waiting for the tick. However in this case the devfreq infrastructure isn't notified that the GPU is busy. By the time the tick happens the job might well have finished and no time will be accounted for the GPU being busy.

Fix this by recording the GPU as busy directly in queue_run_job() in the case where there is a CSG assigned and therefore we just ring the doorbell.

Fixes: de854881 ("drm/panthor: Add the scheduler logical block")
Signed-off-by: Steven Price <steven.price@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240703155646.80928-1-steven.price@arm.com
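
The accounting gap described in the commit message can be illustrated with a small user-space model. This is only a sketch, not the driver code: `struct busy_stats`, `record_busy()`, `record_idle()` and `simulate()` are hypothetical names invented for the illustration, and the model assumes that idleness is only noticed at the next scheduler tick.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for busy/idle time accounting. */
struct busy_stats {
	bool busy;           /* is the GPU currently accounted as busy? */
	uint64_t last_event; /* time of the last busy/idle transition */
	uint64_t busy_time;  /* accumulated busy time */
};

/* Mark the GPU busy from 'now' on; no-op if it is already busy. */
static void record_busy(struct busy_stats *s, uint64_t now)
{
	if (!s->busy) {
		s->busy = true;
		s->last_event = now;
	}
}

/* Mark the GPU idle and account the busy period that just ended. */
static void record_idle(struct busy_stats *s, uint64_t now)
{
	assert(now >= s->last_event);
	if (s->busy) {
		s->busy_time += now - s->last_event;
		s->busy = false;
		s->last_event = now;
	}
}

/*
 * Model one job submitted at t_submit to a queue whose group already
 * holds a hardware slot, so only the doorbell is rung. The job is
 * assumed to finish before the tick at t_tick, where the scheduler
 * finds nothing running and records the GPU as idle.
 */
static uint64_t simulate(bool notify_on_submit, uint64_t t_submit,
			 uint64_t t_tick)
{
	struct busy_stats s = { 0 };

	/* The fix: account the GPU as busy when the doorbell is rung. */
	if (notify_on_submit)
		record_busy(&s, t_submit);

	/* Tick: the job has already completed, so the GPU is idle. */
	record_idle(&s, t_tick);

	return s.busy_time;
}
```

In this model, `simulate(false, 2, 10)` accounts no busy time at all, because the GPU was never recorded as busy, whereas `simulate(true, 2, 10)` accounts the 8 time units between submission and the tick.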
// SPDX-License-Identifier: GPL-2.0 or MIT
/* Copyright 2023 Collabora ltd. */
#include <drm/drm_drv.h>
#include <drm/drm_exec.h>
#include <drm/drm_gem_shmem_helper.h>
#include <drm/drm_managed.h>
#include <drm/gpu_scheduler.h>
#include <drm/panthor_drm.h>
#include <linux/build_bug.h>
#include <linux/clk.h>
#include <linux/delay.h>
#include <linux/dma-mapping.h>
#include <linux/dma-resv.h>
#include <linux/firmware.h>
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/iopoll.h>
#include <linux/iosys-map.h>
#include <linux/module.h>
#include <linux/platform_device.h>
#include <linux/pm_runtime.h>
#include "panthor_devfreq.h"
#include "panthor_device.h"
#include "panthor_fw.h"
#include "panthor_gem.h"
#include "panthor_gpu.h"
#include "panthor_heap.h"
#include "panthor_mmu.h"
#include "panthor_regs.h"
#include "panthor_sched.h"
/**
* DOC: Scheduler
*
* Mali CSF hardware adopts a firmware-assisted scheduling model, where
* the firmware takes care of scheduling aspects, to some extent.
*
* Scheduling happens at the scheduling group level. Each group
* contains 1 to N queues (N is FW/hardware dependent, and exposed
* through the firmware interface). Each queue is assigned a command
* stream ring buffer, which serves as a way to get jobs submitted to
* the GPU, among other things.
*
* The firmware can schedule a maximum of M groups (M is FW/hardware
* dependent, and exposed through the firmware interface). Past
* this maximum number of groups, the kernel must take care of
* rotating the groups passed to the firmware so every group gets
* a chance to have its queues scheduled for execution.
*
* The current implementation only supports kernel-mode queues.
* In other words, userspace doesn't have access to the ring-buffer.
* Instead, userspace passes indirect command stream buffers that are
* called from the queue ring-buffer by the kernel using a pre-defined
* sequence of command stream instructions to ensure the userspace driver
* always gets consistent results (cache maintenance,
* synchronization, ...).
*
* We rely on the drm_gpu_scheduler framework to deal with job
* dependencies and submission. Like any other driver dealing with a
* FW-scheduler, we use the 1:1 entity:scheduler mode, such that each
* entity has its own job scheduler. When a job is ready to be executed
* (all its dependencies are met), it is pushed to the appropriate
* queue ring-buffer, and the group is scheduled for execution if it
* wasn't already active.
*
* Kernel-side group scheduling is timeslice-based. When we have fewer
* groups than there are slots, the periodic tick is disabled and we