- Jan 18, 2019
-
-
Tomeu Vizoso authored
-
- Nov 06, 2018
-
-
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Fixes 3 issues: -missing nested condition logic -negative flag doesn't work on constant regs (?) -GE/LT conditions were flipped on equality (this makes a big difference because of how || is implemented by gallium)
-
-
-
on a20x the GPU will hang if this register is zero Signed-off-by: Jonathan Marek <jonathan@marek.ca>
-
a20x can only draw 65535 vertices at once. this fix only applies to triangles. Signed-off-by: Jonathan Marek <jonathan@marek.ca>
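A minimal sketch of how a triangle-list draw could be split to stay within the 65535-vertex limit described above. The helper name `split_triangle_draw` and the `draw_chunk` struct are hypothetical and not taken from the driver. The chunk size is rounded down to a multiple of 3 so that no triangle straddles two sub-draws.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Limit stated in the commit message: at most 65535 vertices per draw. */
#define A20X_MAX_VERTICES 65535u

struct draw_chunk {
    uint32_t start; /* first vertex of this sub-draw */
    uint32_t count; /* number of vertices in this sub-draw */
};

/*
 * Hypothetical helper: split a triangle-list draw of `count` vertices
 * starting at `start` into sub-draws that each respect the limit.
 * Returns the number of chunks written, or 0 if there is nothing to
 * draw or `max_chunks` is too small.
 */
static size_t
split_triangle_draw(uint32_t start, uint32_t count,
                    struct draw_chunk *chunks, size_t max_chunks)
{
    /* Largest multiple of 3 that does not exceed the limit. */
    const uint32_t step = A20X_MAX_VERTICES - (A20X_MAX_VERTICES % 3u);
    size_t n = 0;

    assert(chunks != NULL);

    while (count > 0) {
        uint32_t this_count = count < step ? count : step;

        if (n == max_chunks)
            return 0;

        chunks[n].start = start;
        chunks[n].count = this_count;
        n++;

        start += this_count;
        count -= this_count;
    }
    return n;
}
```

A caller would emit one draw command per returned chunk. Other primitive types (strips, fans) would need overlapping vertices between chunks, which this sketch does not attempt.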
-
adds all the required logic for a20x hw binning to work Signed-off-by: Jonathan Marek <jonathan@marek.ca>
-
emulated fragcoord. a2xx has *some* hw support but it is not practical Signed-off-by: Jonathan Marek <jonathan@marek.ca>
-
writes to position export are mapped to a temp reg, code inserted at the end of vertex shaders to export the position and compute the memory exports for hw binning on a20x. C64 is the offset in the binning data, C65/C66 are viewport parameters, C67+i/C68+i are binning view parameters. C3+i is the binning data "pointer" - relative_addr=1 (in ir-a2xx) makes it not interfere with the other shader constants Signed-off-by: Jonathan Marek <jonathan@marek.ca>
-
this is for a2xx specific semantics (vertex id) and a basic SSA form Signed-off-by: Jonathan Marek <jonathan@marek.ca>
-
the two a20x GPUs tested are a200 in the imx51 and the imx53 (not a205). the 201 id is used for the imx51 (it only has 128kb gmem as opposed to the typical 256kb for a200) Signed-off-by: Jonathan Marek <jonathan@marek.ca>
-
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
-
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
-
this also adds a num_vsc_pipe which represents the number of pipes to use: this value is useful because using more pipes has a higher cost (on a20x) Signed-off-by: Jonathan Marek <jonathan@marek.ca>
-
this introduces some tracking of the number of vertices drawn in the current batch: the draw command needs an offset to the start of the binning data Signed-off-by: Jonathan Marek <jonathan@marek.ca>
-
this patch brings a number of changes to ir2: -ir2 now generates CF clauses as necessary during assembly. this simplifies fd2_program/fd2_compiler and is necessary to implement optimization passes -ir2 now has separate vector/scalar instructions. this will make it easier to implement scheduling of scalar+vector instructions together. dst_reg is also now separate from src registers instead of a single list -ir2 now implements register allocation. this makes it possible to compile shaders which have more than 64 TGSI registers -ir2 now implements the following optimizations: removal of IN/OUT MOV instructions generated by TGSI and removal of unused instructions when some exports are disabled -ir2 now allows full 8-bit index for constants -ir2_alloc no longer allocates 4 times too many bytes Signed-off-by: Jonathan Marek <jonathan@marek.ca>
-
Add load percentage graphs for GPU and V4L2 processing units. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
-
Etnaviv on GC3000 supports ETC2, but there is currently no way to advertise it without going the full way to expose GLES 3.0 support. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
The _Current reference is said to point to the highest priority, complete and enabled texture object. Once the same texture object is unbound and removed from CurrentTex[target], update _Current as well. This is necessary to allow releasing all references on imported textures without having to issue another draw call first, by just deleting the textures. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
-
Releasing all fragment sampler views on glFinish makes it possible to release all references on imported textures by calling glDeleteTextures and eglDestroyImage, without having to issue another draw call first. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
-
If the driver provides native support for YUV textures we can skip adding additional samplers and re-writing the shaders. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Unconditionally requesting both bindings can lead to premature failure to create a valid image. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
If the driver supports multi-planar formats natively we don't want to re-write the format of the planes on import. Split this out in a separate function for clarity. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Each time I have to touch the buffer import/export functions in the dri state tracker I get lost in the maze of functions converting between DRI_IMAGE_FOURCC, DRI_IMAGE_FORMAT, DRI_IMAGE_COMPONENTS and pipe format. Rip it out and replace it with a single table, which defines the correspondence between the different representations. Also this now stores all the known representations in the __DRIimageRec, to avoid the loss of information we currently have when importing a buffer with a fourcc which doesn't have a corresponding dri format. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
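The single-table idea described above can be sketched roughly as follows. All enum, struct and function names here are hypothetical stand-ins, not the actual dri state tracker definitions; only the fourcc character codes (XR24, AR24, NV12) are real DRM fourcc values. The NV12 row illustrates a fourcc with no corresponding dri format, and the import helper stores every known representation in the image so nothing is lost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Packs four characters into a fourcc code, least significant byte first. */
#define SKETCH_FOURCC(a, b, c, d)                      \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) |            \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Hypothetical stand-ins for the DRI format, components and pipe format. */
enum sketch_dri_format { SKETCH_DRI_NONE, SKETCH_DRI_XRGB8888, SKETCH_DRI_ARGB8888 };
enum sketch_components { SKETCH_COMP_RGB, SKETCH_COMP_RGBA, SKETCH_COMP_Y_UV };
enum sketch_pipe_format { SKETCH_PIPE_BGRX8888, SKETCH_PIPE_BGRA8888, SKETCH_PIPE_NV12 };

/* One row ties together all representations of a single format. */
struct format_entry {
    uint32_t fourcc;
    enum sketch_dri_format dri_format;
    enum sketch_components components;
    enum sketch_pipe_format pipe_format;
};

static const struct format_entry format_table[] = {
    { SKETCH_FOURCC('X', 'R', '2', '4'), SKETCH_DRI_XRGB8888, SKETCH_COMP_RGB,  SKETCH_PIPE_BGRX8888 },
    { SKETCH_FOURCC('A', 'R', '2', '4'), SKETCH_DRI_ARGB8888, SKETCH_COMP_RGBA, SKETCH_PIPE_BGRA8888 },
    /* Illustrative row: a fourcc without a matching dri format. */
    { SKETCH_FOURCC('N', 'V', '1', '2'), SKETCH_DRI_NONE,     SKETCH_COMP_Y_UV, SKETCH_PIPE_NV12 },
};

/* Returns the table row for `fourcc`, or NULL if it is unknown. */
static const struct format_entry *
lookup_by_fourcc(uint32_t fourcc)
{
    for (size_t i = 0; i < sizeof(format_table) / sizeof(format_table[0]); i++) {
        if (format_table[i].fourcc == fourcc)
            return &format_table[i];
    }
    return NULL;
}

/* Hypothetical image record that keeps every known representation. */
struct sketch_image {
    uint32_t fourcc;
    enum sketch_dri_format dri_format;
    enum sketch_components components;
    enum sketch_pipe_format pipe_format;
};

/* Fills `img` from the table; returns 1 on success, 0 for an unknown fourcc. */
static int
import_with_fourcc(struct sketch_image *img, uint32_t fourcc)
{
    assert(img != NULL);

    const struct format_entry *e = lookup_by_fourcc(fourcc);
    if (e == NULL)
        return 0;

    img->fourcc = e->fourcc;
    img->dri_format = e->dri_format;
    img->components = e->components;
    img->pipe_format = e->pipe_format;
    return 1;
}
```

With one table, adding a format means adding one row rather than touching several conversion functions, and lookups keyed on any other column can be written the same way as `lookup_by_fourcc`.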
-
Currently all the EGL APIs are missing a way to specify how an imported dma-buf is intended to be used. Demanding the format to be both usable for sampling and rendering artificially restricts the list of formats a driver is able to import. Looking at how the Intel driver implements those DRI2 image APIs it doesn't distinguish between render or sampler compatible formats. So this patch aligns behavior between Intel and Gallium based drivers. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Currently we dispose of any unneeded color buffers immediately if we detect that there are more unlocked buffers than we need. This can lead to feedback loops between the compositor and the application, causing rapid toggling between double and triple buffering. Scenario: 2 buffers already queued to the compositor, egl/wayland allocates a new back buffer to avoid throttling, slowing down the frame, this allows the compositor to catch up and unlock both buffers, then EGL detects that there are more buffers than currently needed, freeing the buffer, restarting the loop shortly after. To avoid wasting CPU time on rapidly freeing and reallocating color buffers, break those feedback loops by letting the unneeded buffers sit around for a short while before disposing of them. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
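One way to picture the delayed disposal described above is an age counter per buffer. This is a minimal sketch, not the egl/wayland code: the struct, the function names and the grace period `TRIM_AGE` of 20 frames are assumptions for illustration (the message only says "a short while").

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed grace period, in frames, before an idle buffer is freed. */
#define TRIM_AGE 20

struct color_buffer {
    bool allocated; /* backing storage exists */
    bool locked;    /* currently held by the compositor */
    int age;        /* frames since the buffer was last used */
};

/* Hypothetical helper: call when a buffer is picked as the back buffer. */
static void
mark_buffer_used(struct color_buffer *buf)
{
    assert(buf != NULL);
    buf->age = 0;
}

/*
 * Hypothetical helper: call once per frame. Ages every unlocked buffer
 * and frees only those that stayed idle for more than TRIM_AGE frames.
 * Returns the number of buffers freed.
 */
static int
trim_unused_buffers(struct color_buffer *bufs, size_t n)
{
    int freed = 0;

    assert(bufs != NULL);

    for (size_t i = 0; i < n; i++) {
        if (!bufs[i].allocated || bufs[i].locked)
            continue;

        if (++bufs[i].age > TRIM_AGE) {
            bufs[i].allocated = false; /* real code would release storage here */
            bufs[i].age = 0;
            freed++;
        }
    }
    return freed;
}
```

Because a buffer that gets reused within the grace period has its age reset, a brief drop back to double buffering no longer triggers a free followed by an immediate reallocation.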
-
We try to avoid sharing all resources with KMS side of renderonly, as this adds some overhead that isn't really needed for most resources. If someone tries to validate a resource for scanout, this is a good indication that the sharing with the KMS side is actually needed. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
We don't have a fallback to get them into a render/sampler compatible layout (yet). Rejecting the import at least forwards this issue to the client application, which might have a way to deal with this. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
The GC320 without the 2D tiling feature doesn't support regular blits with YUV input, nor tiled output. So on those cores we need to do a filter blit for the YUV->RGB conversion to a temporary linear buffer and then do a tiling blit into the texture buffer using the RS engine on the 3D core. Not the most efficient path, but it at least gives us the same level of functionality as on the newer GC320 cores and looks the same to the application. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
The new 2D YUV blit needs this in some cases, so make it available. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
This allows color space conversion and tiling in a single step, as well as handling multi-planar formats like NV12, which are really useful when dealing with hardware video decoders. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
We weren't handling this flag at all, which broke some assumptions made by the users of the resource_create interface. As we can't render to a linear surface and the usefulness of yet another layout transition to handle this case seems limited, we only respect the flag when the resource isn't used for rendering. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
This adds a blit path using the 2D GPU for a linear YUV to tiled RGB blit. This makes it possible to import planar YUV textures with a single copy. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
Imported resources might not start at offset 0 into the buffer object. Make sure to remember the offset that is provided with the handle on import. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
We copy the template resource content into the newly allocated resource. If the template derived from a planar resource this leads to a non reference counted copy of the next resource pointer. Make sure to clear this out when allocating a new resource. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
The 2D pipe is useful to implement fast planar and interleaved YUV buffer imports. Not all systems have a 2D capable core, so this is strictly optional. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-
For the blitter's passthrough shaders use the TEXCOORD semantic instead of GENERIC. This works around a flat shading issue in the etnaviv driver, so that mipmap generation works again. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
-
This fixes point sprite texture coordinates while keeping flat-shading working. Suggested-by: Wladimir J. van der Laan <laanwj@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
-