Commits · master · Robert Foss / mesa

Apr 20, 2016

anv: fix build without Wayland platform · 3caf2e89

Marcin Ślusarz authored 9 years ago

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

3caf2e89

anv: fix building on i686 with -mcpu=generic · 6c952d8a

Laurent Carlier authored 9 years ago


-mcpu=generic doesn't enable SSE2, and anvil definitely needs it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

6c952d8a

spirv: Trivially handle the NonWriteable decoration · 2ef7aef3

Jason Ekstrand authored 9 years ago

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>

2ef7aef3

nir: rename nir_foreach_block*() to nir_foreach_block*_call() · b6dc940e

Connor Abbott authored 9 years ago

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

b6dc940e

nvc0: avoid tex read fault from compute shaders on GK110 · 71430682

Samuel Pitoiset authored 9 years ago


After some investigation, it seems that disabling the UNK02C4
command avoids a read fault with texelFetch() from a compute shader.

I have no clue what this method actually does, but disabling it keeps
the GPU from hanging with basic-texelFetch.shader_test without
introducing any compute-related regressions.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>

71430682

i965/vec4: Always split uniforms in array_access_to_pull_constants · 87a4fb51

Jason Ekstrand authored 9 years ago

Normally, we split uniforms at the end but in Vulkan, we bail because we
don't want pull constants. However, we still need them split because
pack_uniforms relies on it.

I really don't like this patch, not because it doesn't work (it does), but
because now that we're using MOV_INDIRECT, uniform numbers and sizes don't
really matter anymore. In the FS backend, uniform splitting and packing are
handled all at once (actual re-assignment of locations happens later) and
we really should do it that way in vec4 eventually as well.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001

87a4fb51

i965/vec4: Use the correct offset for the swizzle shift in push constants · b3f43822

Jason Ekstrand authored 9 years ago


This was actually caught by Ken in review the first time around but somehow
didn't get fixed before the patches were pushed. :-(

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001

b3f43822

i965/vec4: Use nir_intrinsic_base in the load_uniform implementation · 9f16e170

Jason Ekstrand authored 9 years ago


We shouldn't be reading the const_index directly

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001

9f16e170

anv/apply_dynamic_offsets: Provide a range on the load_uniform · f63a9508

Jason Ekstrand authored 9 years ago


Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001

f63a9508

anv/lower_push_constants: Stop treating scalar specially · 35b758c3

Jason Ekstrand authored 9 years ago


All of the code that did something special based on vec4 vs. scalar is
bogus.  In the backend, everything is now in units of bytes and the vec4
backend can handle full std140 packing so we don't need to do anything
special anymore.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998

35b758c3

swr: fix resource backed constant buffers · 3bbe8a09

Tim Rowley authored 9 years ago

Code was using an incorrect address for the base pointer.

v2: use swr_resource_data() utility function.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94979


Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tested-by: Markus Wick <markus@selfnet.de>

3bbe8a09

nouveau: codegen: Add support for OpenCL global memory buffers · 2ac2ecdd

Hans de Goede authored 9 years ago


Add support for OpenCL global memory buffers. Note that this has only
been tested with regular loads and stores and likely needs more work
for e.g. atomic ops.

Tested with piglit on a gf119 and a gk107:
./piglit run -o shader -t '.*arb_shader_storage_buffer_object.*' results/shader
[9/9] pass: 9 /
./piglit run -o shader -t '.*arb_compute_shader.*' results/shader
[20/20] skip: 4, pass: 16 |

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

2ac2ecdd

nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers · 61d52a5f

Hans de Goede authored 9 years ago


Some of the lowering steps we currently do for FILE_MEMORY_GLOBAL only
apply to buffers, making it impossible to use FILE_MEMORY_GLOBAL for
OpenCL global buffers.

This commit changes the buffer code to use FILE_MEMORY_BUFFER at the
ir_from_tgsi and lowering steps, freeing FILE_MEMORY_GLOBAL
for use with OpenCL global buffers.

Note that after lowering, buffer accesses use the FILE_MEMORY_GLOBAL
register file.

Tested with piglit on a gf119 and a gk107:
./piglit run -o shader -t '.*arb_shader_storage_buffer_object.*' results/shader
[9/9] pass: 9 /
./piglit run -o shader -t '.*arb_compute_shader.*' results/shader
[20/20] skip: 4, pass: 16 |

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

61d52a5f

scons: Build dri_common_interop.c. · f02f4d09

José Fonseca authored 9 years ago

f02f4d09

st/dri: implement the GL interop DRI extension (v2.2) · 4fa3d35c

Marek Olšák authored 9 years ago

v2: - set interop_version
    - simplify the offset_after macro
v2.1: - use version numbers, remove offset_after
      - set "out_driver_data_written"
v2.2: - set buf_offset & buf_size for GL_ARRAY_BUFFER too
      - add whandle.offset to buf_offset
      - disable the minmax cache for GL_TEXTURE_BUFFER

4fa3d35c

glx: implement GLX part of interop interface (v2) · 37d3a26b

Marek Olšák authored 9 years ago

v2: - use const

37d3a26b

egl: implement EGL part of interop interface (v2) · b6eda708

Marek Olšák authored 9 years ago

v2: - use const

b6eda708

dri_interface: add interface for GL interop with other APIs (v2) · 5e9ed261

Marek Olšák authored 9 years ago

v2: - use const

5e9ed261

include/GL: add mesa_glinterop.h for OpenGL-OpenCL interop (v4.2) · 6eeb7294

Marek Olšák authored 9 years ago

v2: - use "enum" to define stuff
v3: - more comments, define MESA_GLINTEROP_UNSUPPORTED
v4: - add mesa_glinterop_device_info::interop_version
    - more comments
    - remove #define MESA_GLINTEROP_VERSION
    - use const for "in"
v4.1: - use version numbers for structures
      - add "out_driver_data_written"
v4.2: - buf_offset & buf_size affect GL_ARRAY_BUFFER too, this is required
        for sharing suballocations within a larger buffer

6eeb7294

st/dri: Fix RGB565 EGLImage creation · 8093990e

Nicolas Dufresne authored 9 years ago


When creating EGL images we do a bytes-to-pixels conversion by dividing
by 4 regardless of the pixel format. This does not work for RGB565. This
patch avoids the unnecessary conversion and uses the proper API when the
conversion cannot be avoided.

Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

8093990e
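To illustrate why a fixed divide-by-4 breaks for RGB565, here is a minimal sketch. The helper names are hypothetical and this is not the st/dri code; it only assumes that RGB565 uses 2 bytes per pixel and that 32-bit formats use 4.

```c
/* Hypothetical helpers for illustration only; not taken from st/dri. */

/* The pattern the commit message describes: assume 4 bytes per pixel. */
static unsigned stride_pixels_fixed4(unsigned stride_bytes)
{
    return stride_bytes / 4;
}

/* Format-aware conversion: divide by the actual bytes per pixel. */
static unsigned stride_pixels(unsigned stride_bytes, unsigned bytes_per_pixel)
{
    return stride_bytes / bytes_per_pixel;
}
```

For a 640-pixel-wide RGB565 row (1280 bytes), the fixed variant reports 320 pixels, while the format-aware variant reports 640.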

st/dri: Factor out DRI2 to PIPE_FORMAT conversion · 4463f387

Nicolas Dufresne authored 9 years ago


This code is already duplicated twice and will be useful again. This
will also help when adding formats.

Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

4463f387

Apr 19, 2016

freedreno/a4xx: lower srgb in shader for astc textures · 899bd63a

Rob Clark authored 9 years ago


This *seems* like a hw bug, and maybe only applies to certain a4xx
variants/revisions.  But setting the SRGB bit in sampler view state
(texconst0) causes invalid alpha for ASTC textures.  Work around this
by doing the srgb->linear conversion in the shader instead.

This fixes 392 dEQP tests: dEQP-GLES3.functional.texture.*astc*srgb*

(The remaining fails seem to be a bug w/ ASTC + linear filtering, also
possibly a420.0 specific.)

Signed-off-by: Rob Clark <robclark@freedesktop.org>

899bd63a

nir/lower-tex: add srgb->linear lowering · eddfc977

Rob Clark authored 9 years ago


Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

eddfc977

nir/builder: const'ify swiz param · eb00a0fc

Rob Clark authored 9 years ago


There is no need for it to be non-const, and making it const lets the
caller declare its swizzle array const if desired.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>

eb00a0fc
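As a rough illustration of what a const swizzle parameter allows, the sketch below uses a hypothetical helper, `pack_swizzle`, which is not part of nir_builder. Because the parameter is const-qualified, the caller can pass a `static const` table without a cast.

```c
/* Hypothetical stand-in for a builder helper that only reads a swizzle. */
static unsigned pack_swizzle(const unsigned swiz[4], unsigned num_components)
{
    unsigned packed = 0;
    /* Two bits per component: 0 = x, 1 = y, 2 = z, 3 = w. */
    for (unsigned i = 0; i < num_components; i++)
        packed |= (swiz[i] & 0x3u) << (2 * i);
    return packed;
}

/* The caller is free to keep its swizzle in read-only storage. */
static const unsigned SWIZ_YZX[4] = { 1, 2, 0, 0 };
```

Calling `pack_swizzle(SWIZ_YZX, 3)` compiles cleanly only because the parameter is declared const.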

nir/lower-tex: make options a local var · 52ccc634

Rob Clark authored 9 years ago

Signed-off-by: Rob Clark <robclark@freedesktop.org>

52ccc634

freedreno: cleanup fd_set_sampler_views · d4ff42bd

Rob Clark authored 9 years ago


The separate FS/VS entrypoints are no longer used since a3ed98f7.  So
just inline them.

Signed-off-by: Rob Clark <robclark@freedesktop.org>

d4ff42bd

tgsi/lowering: improved lowering for LRP · fadfaa82

Russell King authored 9 years ago


Provide an improved lowering for LRP, which can be implemented in two
MAD instructions with a bit of rearranging of the equation, rather
than the literal implementation of two multiplies, an add and a
subtract.

Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>

fadfaa82
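The rearrangement behind the LRP lowering above can be sketched in scalar C. This assumes the usual LRP definition, `t*x + (1 - t)*y`, and models MAD as `a*b + c`; the function names are illustrative and not taken from the lowering pass.

```c
/* MAD modelled as a single multiply-add: a * b + c. */
static float mad(float a, float b, float c)
{
    return a * b + c;
}

/* Literal form: two multiplies, a subtract and an add. */
static float lrp_literal(float t, float x, float y)
{
    return t * x + (1.0f - t) * y;
}

/* Rearranged form: t*x + (y - t*y), which needs only two MADs. */
static float lrp_two_mads(float t, float x, float y)
{
    float tmp = mad(-t, y, y);  /* y - t*y == (1 - t) * y */
    return mad(t, x, tmp);      /* t*x + tmp */
}
```

Both forms agree for values such as `t = 0.25, x = 8, y = 4`, where each returns 5.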

tgsi/lowering: improved lowering for XPD · 67da7dd9

Russell King authored 9 years ago


Improve XPD lowering to consume fewer instructions by using the
MAD instruction to perform the multiply and subtraction together.

Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>

67da7dd9
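A cross product built from one MUL and one MAD could look like the sketch below. The swizzle choice shown in the comments is an assumption based on the standard cross-product formula, not a copy of the lowering code, and `vec3` is a local illustrative type.

```c
typedef struct { float x, y, z; } vec3;

/* Cross product as MUL followed by MAD (assumed swizzles in comments). */
static vec3 xpd_mul_mad(vec3 a, vec3 b)
{
    /* MUL tmp, a.zxy, b.yzx */
    vec3 tmp = { a.z * b.y, a.x * b.z, a.y * b.x };
    /* MAD dst, a.yzx, b.zxy, -tmp */
    vec3 dst = { a.y * b.z - tmp.x,
                 a.z * b.x - tmp.y,
                 a.x * b.y - tmp.z };
    return dst;
}
```

With `a = (1, 0, 0)` and `b = (0, 1, 0)` the result is `(0, 0, 1)`, as expected for x cross y.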

tgsi/lowering: add support for lowering TRUNC · 65460cf4

Russell King authored 9 years ago


Add support for lowering TRUNC using the following sequence:

	FRC tmpA, |src|
	SUB tmpA, |src|, tmpA
	CMP dst, -tmpA, tmpA

Note that this is incompatible with FRC lowering.

Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>

65460cf4
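The TRUNC sequence above can be modelled in scalar C as a rough sketch. Two assumptions are made here: `frc` is a stand-in for a native FRC instruction (`x - floor(x)`, valid only for inputs that fit in a `long`), and the CMP selects on the sign of the original `src`, which the listing above does not show explicitly.

```c
/* Stand-in for a native FRC: x - floor(x), for inputs that fit in a long. */
static float frc(float x)
{
    float t = (float)(long)x;            /* rounds toward zero */
    float fl = (t > x) ? t - 1.0f : t;   /* step down for negative fractions */
    return x - fl;
}

static float trunc_lowered(float src)
{
    float a = (src < 0.0f) ? -src : src; /* |src| source modifier */
    float tmp = frc(a);                  /* FRC tmpA, |src| */
    tmp = a - tmp;                       /* SUB tmpA, |src|, tmpA */
    /* CMP: pick -tmpA for negative src, tmpA otherwise (selector assumed). */
    return (src < 0.0f) ? -tmp : tmp;
}
```

For example, `trunc_lowered(2.75f)` gives 2 and `trunc_lowered(-2.75f)` gives -2.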

tgsi/lowering: add support for lowering FLR and CEIL · 23e870a8

Russell King authored 9 years ago


Add support for lowering FLR and CEIL to FRC/SUB and FRC/ADD
instructions for GPUs that support FRC but not FLR or CEIL.  Since
both lowerings use FRC, it is invalid to ask for FLR or CEIL to be lowered
along with FRC, so add an assert to catch this invalid configuration.

We also need to deal with FLR instructions emitted by the lowering
code.  Fix these up with the FRC+SUB equivalent when FLR lowering is
enabled.

Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>

23e870a8
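A scalar sketch of the FRC/SUB and FRC/ADD forms follows. As assumptions: `frc` stands in for a native FRC (`x - floor(x)`, valid only for inputs that fit in a `long`), and the CEIL form negates the FRC source, which is one way to get a ceiling from FRC plus ADD but is not confirmed by the commit message.

```c
/* Stand-in for a native FRC: x - floor(x), for inputs that fit in a long. */
static float frc(float x)
{
    float t = (float)(long)x;            /* rounds toward zero */
    float fl = (t > x) ? t - 1.0f : t;   /* step down for negative fractions */
    return x - fl;
}

/* FLR via FRC + SUB: floor(x) = x - frc(x). */
static float flr_lowered(float src)
{
    return src - frc(src);
}

/* CEIL via FRC + ADD: ceil(x) = x + frc(-x) (negated source is assumed). */
static float ceil_lowered(float src)
{
    return src + frc(-src);
}
```

For instance, `flr_lowered(-2.25f)` gives -3 and `ceil_lowered(2.25f)` gives 3.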

radeonsi: enable TGSI support cap for compute shaders · 464cef5b

Bas Nieuwenhuizen authored 9 years ago


v2: Use chip_class instead of family.

v3: Check kernel version for SI.

v4: Preemptively allow amdgpu winsys for SI.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

464cef5b

radeonsi: Consider input SGPR count for compute shader SGPR count. · 1f32d5d5

Bas Nieuwenhuizen authored 9 years ago


si_shader_create corrects the SGPR count with si_fix_num_sgprs. We then
recompute the rsrc1 register to use the new SGPR count.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

1f32d5d5

radeonsi: Add CE synchronization for compute dispatches. · 6c833ba1

Bas Nieuwenhuizen authored 9 years ago


Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

6c833ba1

mesa/st: enable compute shaders if images are also supported · e0b729c5

Bas Nieuwenhuizen authored 9 years ago


v2: Also depend on atomic counters.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

e0b729c5

radeonsi: clean up compute flush · 41d79bcb

Bas Nieuwenhuizen authored 9 years ago


Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

41d79bcb

radeonsi: do not do two full flushes on every compute dispatch · 7a92c084

Bas Nieuwenhuizen authored 9 years ago


v2: Add more CS_PARTIAL_FLUSH events.

Essentially every place that waits for pixel shaders to finish
also has a write-after-read hazard with compute shaders.

Invalidating L2 waits implicitly on pixel and compute shaders,
so, we don't need a CS_PARTIAL_FLUSH for switching FBO.

v3: Add CS_PARTIAL_FLUSH events even if we already have INV_GLOBAL_L2.

According to Marek the INV_GLOBAL_L2 events don't wait for compute
shaders to finish, so wait for them explicitly.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>

7a92c084

radeonsi: split setting graphics and compute descriptors · e764ee13

Bas Nieuwenhuizen authored 9 years ago


Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

e764ee13

radeonsi: split texture decompression for compute shaders · 061ce939

Bas Nieuwenhuizen authored 9 years ago


Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

061ce939

radeonsi: update predicate condition for compute dispatches · e56514f6

Bas Nieuwenhuizen authored 9 years ago


Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>

e56514f6

radeonsi: implement TGSI compute dispatch · c3083d84

Bas Nieuwenhuizen authored 9 years ago


v2: - Use radeon_set_sh_reg_seq.
    - Set predicate bit for conditional rendering.

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>

c3083d84