 03 Dec, 2019 21 commits


Rohan Garg authored

Rohan Garg authored

Samuel Pitoiset authored
Fixes some CTS regressions.
Fixes: e61a826f ("ac/llvm: fix pointer type for global atomics")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Neil Armstrong authored
Tomeu:
- Small rebase fixups
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

Dave Airlie authored
This wires up the front-facing value as a sysval. I'd like to remove the other facing code, but I'd need to confirm VMware doesn't use it first.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Dave Airlie authored
Just return 0 for unbound atomic operations.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Alyssa Rosenzweig authored
This is no longer used.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Tomeu Vizoso authored
Now that the Mali T720 GPU is supported at the same level as the T760, test it on PINE64 H64 boards.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Alyssa Rosenzweig authored
Over the past months, Panfrost has matured considerably, and several tests are no longer flaky or failing.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Tomeu Vizoso authored
Support for this GPU is now equal to that of the T760, so whitelist it.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Alyssa Rosenzweig authored
Make sure that the fragment is complete when writing it out.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

Tomeu Vizoso authored
We need to always upload anyway.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Alyssa Rosenzweig authored
Fixes dEQP-GLES3.functional.primitive_restart.*. Note that the 0x18000 value is somehow accidentally enabling primitive restart. I'm not sure where this value came from, but let's not.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

Alyssa Rosenzweig authored
The algorithm is as described. Nothing fancy here, just need to add some new code paths depending on which model we're running on.
Tomeu:
- Also disable tiling when !hierarchy and !vertex_count
- Avoid creating polygon lists smaller than the minimum when vertex_count > 0 but tile size smaller than 16 bytes
- Take into account tile size when calculating polygon list size for !hierarchy
- Allow 0-sized tiles in a single dimension
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

Alyssa Rosenzweig authored
We've figured out most of the big pieces, and though it looks faintly like other Midgards, it's much simpler.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Tomeu Vizoso authored
Similarly to how it's already done in the compiler, add a way to express differences between GPU models that need to be taken into account when assembling the cmdstream.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
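One common way to express per-model differences is a table of quirk flags keyed by GPU model. The sketch below only illustrates that idea; the flag names, the model keys, and the helper functions are all hypothetical and are not taken from the Panfrost sources.

```python
from enum import IntFlag


class Quirk(IntFlag):
    # Hypothetical quirk bits, invented for this illustration.
    NONE = 0
    NO_HIERARCHICAL_TILING = 1 << 0
    SIMPLE_FRAMEBUFFER_DESC = 1 << 1


# Hypothetical table: placeholder model names mapped to their quirks.
QUIRKS_BY_MODEL = {
    "model_a": Quirk.NO_HIERARCHICAL_TILING | Quirk.SIMPLE_FRAMEBUFFER_DESC,
    "model_b": Quirk.NONE,
}


def quirks_for(model: str) -> Quirk:
    """Return the quirk set for a model, or no quirks if it is unknown."""
    return QUIRKS_BY_MODEL.get(model, Quirk.NONE)


def uses_hierarchical_tiling(model: str) -> bool:
    """Command-stream code can branch on a quirk instead of on a model ID."""
    return not (quirks_for(model) & Quirk.NO_HIERARCHICAL_TILING)
```

With a table like this, code that assembles the command stream tests a quirk bit rather than comparing model numbers in many places.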

Ian Romanick authored
Reviewed-by: Matt Turner <mattst88@gmail.com>
All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 14660366 -> 14653437 (0.05%)
instructions in affected programs: 316166 -> 309237 (2.19%)
helped: 905
HURT: 10
helped stats (abs) min: 1 max: 36 x̄: 7.67 x̃: 6
helped stats (rel) min: 0.13% max: 18.75% x̄: 4.28% x̃: 3.60%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.10% max: 1.33% x̄: 0.70% x̃: 0.97%
95% mean confidence interval for instructions value: 7.91 7.23
95% mean confidence interval for instructions %change: 4.46% 3.99%
Instructions are helped.
total cycles in shared programs: 228571646 -> 228549759 (<.01%)
cycles in affected programs: 56239919 -> 56218032 (0.04%)
helped: 681
HURT: 216
helped stats (abs) min: 1 max: 5156 x̄: 45.49 x̃: 10
helped stats (rel) min: <.01% max: 10.45% x̄: 1.29% x̃: 0.65%
HURT stats (abs) min: 1 max: 320 x̄: 42.09 x̃: 14
HURT stats (rel) min: <.01% max: 37.04% x̄: 1.38% x̃: 0.49%
95% mean confidence interval for cycles value: 41.51 7.29
95% mean confidence interval for cycles %change: 0.80% 0.49%
Cycles are helped.
LOST: 1
GAINED: 0
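The percentages in reports like the one above can be checked by hand as a relative change between the before and after totals. The following is a minimal sketch of that arithmetic, assuming the report uses (after - before) / before; the function name `pct_change` is our own and is not part of any reporting tool.

```python
def pct_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent.

    Negative values mean the count went down (the shader was helped).
    """
    return (after - before) / before * 100.0


# Instruction totals quoted in the commit message above (Ice Lake).
total = pct_change(14660366, 14653437)
affected = pct_change(316166, 309237)

print(f"shared programs:   {total:.2f}%")     # -0.05%
print(f"affected programs: {affected:.2f}%")  # -2.19%
```

Both results match the magnitudes quoted in the message, and the negative sign reflects that the instruction count dropped.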

Ian Romanick authored
Since a is non-negative, neither fsqrt nor frsq should return NaN. frsq should only return Inf when fsqrt returns 0. The changes are pretty small, but this turns a few hundred hurt shaders in the next patch into helped shaders.
An alternative to the intBitsToFloat is to import numpy and do np.finfo(np.float32).max. That's more explicit, but we may also want to have specific bit encodings of float values later. I could be convinced either way, but intBitsToFloat(0x7f7fffff) was what I implemented first.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 14661140 -> 14661104 (<.01%)
instructions in affected programs: 7520 -> 7484 (0.48%)
helped: 36
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.32% max: 0.61% x̄: 0.49% x̃: 0.52%
95% mean confidence interval for instructions value: 1.00 1.00
95% mean confidence interval for instructions %change: 0.52% 0.47%
Instructions are helped.
total cycles in shared programs: 228585416 -> 228584806 (<.01%)
cycles in affected programs: 56321 -> 55711 (1.08%)
helped: 32
HURT: 0
helped stats (abs) min: 2 max: 98 x̄: 19.06 x̃: 10
helped stats (rel) min: 0.08% max: 6.41% x̄: 1.09% x̃: 0.65%
95% mean confidence interval for cycles value: 28.32 9.80
95% mean confidence interval for cycles %change: 1.63% 0.54%
Cycles are helped.
Sandy Bridge
total cycles in shared programs: 152991077 -> 152991075 (<.01%)
cycles in affected programs: 11525 -> 11523 (0.02%)
helped: 2
HURT: 2
helped stats (abs) min: 2 max: 4 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.07% max: 0.11% x̄: 0.09% x̃: 0.09%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.08% max: 0.08% x̄: 0.08% x̃: 0.08%
95% mean confidence interval for cycles value: 5.27 4.27
95% mean confidence interval for cycles %change: 0.16% 0.15%
Inconclusive result (value mean confidence interval includes 0).
No changes on Iron Lake or GM45.
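The bit pattern 0x7f7fffff mentioned above is the largest finite 32-bit float. A minimal sketch of reinterpreting an integer bit pattern as a float using only the Python standard library is shown below; the helper name `int_bits_to_float` is our own and is not necessarily how the Mesa scripts implement intBitsToFloat.

```python
import struct


def int_bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit unsigned integer as an IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# Largest finite float32: all mantissa bits set, exponent just below the
# Inf/NaN encoding.
FLT_MAX = int_bits_to_float(0x7F7FFFFF)

# The same value written out from the format definition.
assert FLT_MAX == (2 - 2**-23) * 2.0**127
```

This avoids the numpy dependency while still allowing arbitrary bit encodings, which is the trade-off the message describes.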

Ian Romanick authored
I tried 2, 4, 6, 8, and 10. 8 seemed to be the sweet spot across all Intel platforms.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 14736141 -> 14661140 (0.51%)
instructions in affected programs: 2272413 -> 2197412 (3.30%)
helped: 8416
HURT: 140
helped stats (abs) min: 1 max: 1152 x̄: 8.99 x̃: 6
helped stats (rel) min: 0.13% max: 42.55% x̄: 4.15% x̃: 3.20%
HURT stats (abs) min: 1 max: 140 x̄: 4.73 x̃: 1
HURT stats (rel) min: 0.03% max: 3.44% x̄: 0.87% x̃: 0.60%
95% mean confidence interval for instructions value: 9.36 8.17
95% mean confidence interval for instructions %change: 4.14% 3.99%
Instructions are helped.
total cycles in shared programs: 231560416 -> 228585416 (1.28%)
cycles in affected programs: 126536021 -> 123561021 (2.35%)
helped: 7092
HURT: 1898
helped stats (abs) min: 1 max: 419320 x̄: 519.02 x̃: 159
helped stats (rel) min: <.01% max: 77.25% x̄: 13.52% x̃: 11.77%
HURT stats (abs) min: 1 max: 14518 x̄: 371.91 x̃: 36
HURT stats (rel) min: <.01% max: 103.23% x̄: 5.92% x̃: 2.55%
95% mean confidence interval for cycles value: 514.34 147.50
95% mean confidence interval for cycles %change: 9.69% 9.14%
Cycles are helped.
total spills in shared programs: 5763 -> 5848 (1.47%)
spills in affected programs: 1797 -> 1882 (4.73%)
helped: 13
HURT: 13
total fills in shared programs: 17163 -> 16931 (1.35%)
fills in affected programs: 7214 -> 6982 (3.22%)
helped: 22
HURT: 19
total sends in shared programs: 730410 -> 730246 (0.02%)
sends in affected programs: 2705 -> 2541 (6.06%)
helped: 114
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 1.44 x̃: 1
helped stats (rel) min: 0.60% max: 20.00% x̄: 7.26% x̃: 5.88%
95% mean confidence interval for sends value: 1.55 1.33
95% mean confidence interval for sends %change: 7.90% 6.62%
Sends are helped.
LOST: 4
GAINED: 0
Sandy Bridge
total instructions in shared programs: 10760511 -> 10724637 (0.33%)
instructions in affected programs: 961305 -> 925431 (3.73%)
helped: 3734
HURT: 110
helped stats (abs) min: 1 max: 151 x̄: 9.66 x̃: 8
helped stats (rel) min: 0.14% max: 41.21% x̄: 4.93% x̃: 3.95%
HURT stats (abs) min: 1 max: 20 x̄: 1.68 x̃: 1
HURT stats (rel) min: 0.12% max: 5.41% x̄: 0.88% x̃: 0.52%
95% mean confidence interval for instructions value: 9.76 8.91
95% mean confidence interval for instructions %change: 4.90% 4.63%
Instructions are helped.
total cycles in shared programs: 153359411 -> 152991077 (0.24%)
cycles in affected programs: 11615401 -> 11247067 (3.17%)
helped: 2725
HURT: 1138
helped stats (abs) min: 1 max: 2844 x̄: 164.27 x̃: 80
helped stats (rel) min: 0.02% max: 48.60% x̄: 7.47% x̃: 3.91%
HURT stats (abs) min: 1 max: 4351 x̄: 69.69 x̃: 25
HURT stats (rel) min: 0.02% max: 40.00% x̄: 3.39% x̃: 1.47%
95% mean confidence interval for cycles value: 103.18 87.52
95% mean confidence interval for cycles %change: 4.57% 3.97%
Cycles are helped.
total sends in shared programs: 584038 -> 583855 (0.03%)
sends in affected programs: 3512 -> 3329 (5.21%)
helped: 157
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 1.17 x̃: 1
helped stats (rel) min: 2.38% max: 25.00% x̄: 6.52% x̃: 6.06%
95% mean confidence interval for sends value: 1.26 1.07
95% mean confidence interval for sends %change: 7.17% 5.87%
Sends are helped.
LOST: 23
GAINED: 0
Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8122617 -> 8111592 (0.14%)
instructions in affected programs: 380503 -> 369478 (2.90%)
helped: 912
HURT: 86
helped stats (abs) min: 1 max: 129 x̄: 12.19 x̃: 9
helped stats (rel) min: 0.30% max: 39.21% x̄: 3.69% x̃: 2.57%
HURT stats (abs) min: 1 max: 2 x̄: 1.05 x̃: 1
HURT stats (rel) min: 0.12% max: 3.64% x̄: 0.54% x̃: 0.36%
95% mean confidence interval for instructions value: 12.00 10.10
95% mean confidence interval for instructions %change: 3.56% 3.10%
Instructions are helped.
total cycles in shared programs: 188509780 -> 188534398 (0.01%)
cycles in affected programs: 7211542 -> 7236160 (0.34%)
helped: 859
HURT: 132
helped stats (abs) min: 2 max: 690 x̄: 46.59 x̃: 16
helped stats (rel) min: 0.01% max: 26.76% x̄: 1.53% x̃: 0.33%
HURT stats (abs) min: 2 max: 1592 x̄: 489.67 x̃: 618
HURT stats (rel) min: 0.03% max: 185.92% x̄: 23.35% x̃: 6.26%
95% mean confidence interval for cycles value: 9.58 40.10
95% mean confidence interval for cycles %change: 0.65% 2.93%
Cycles are HURT.

Ian Romanick authored
In many cases, fsat, fneg, fabs, ineg, and iabs will get folded into another instruction as either source or destination modifiers. Counting them as instructions means that some if-statements won't get converted to selects. For example,
    vec1 32 ssa_25 = flt32 ssa_0, ssa_23.x
    /* succs: block_1 block_2 */
    if ssa_25 {
        block block_1:
        /* preds: block_0 */
        vec1 32 ssa_26 = fabs ssa_24
        vec1 32 ssa_27 = fneg ssa_26
        vec1 32 ssa_28 = fabs ssa_20
        vec1 32 ssa_29 = fneg ssa_28
        vec1 32 ssa_30 = fmul ssa_27, ssa_29
        vec1 32 ssa_31 = fsat ssa_30
        /* succs: block_3 */
    } else {
        block block_2:
        /* preds: block_0 */
        /* succs: block_3 */
    }
    block block_3:
    /* preds: block_1 block_2 */
block_1 isn't really 6 instructions, but it will be counted that way. Most callers of the peephole_select pass use either 1 or 8. It's very easy to blow way past either of these limits with things that are really only one or two actual instructions.
I also tried some fancier things like making sure the fsat was of another SSA def from the same block, but the simple test was actually better.
The i965 backend SEL peephole pass still helps ~700 shaders in shader-db with this change.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
All Gen6+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 14743694 -> 14738910 (0.03%)
instructions in affected programs: 156575 -> 151791 (3.06%)
helped: 1204
HURT: 0
helped stats (abs) min: 1 max: 27 x̄: 3.97 x̃: 3
helped stats (rel) min: 0.15% max: 19.57% x̄: 5.15% x̃: 4.55%
95% mean confidence interval for instructions value: 4.12 3.82
95% mean confidence interval for instructions %change: 5.35% 4.95%
Instructions are helped.
total cycles in shared programs: 231749141 -> 231602916 (0.06%)
cycles in affected programs: 2818975 -> 2672750 (5.19%)
helped: 876
HURT: 322
helped stats (abs) min: 2 max: 788 x̄: 180.99 x̃: 220
helped stats (rel) min: <.01% max: 43.82% x̄: 20.75% x̃: 19.44%
HURT stats (abs) min: 1 max: 1188 x̄: 38.27 x̃: 20
HURT stats (rel) min: 0.09% max: 102.67% x̄: 5.17% x̃: 1.70%
95% mean confidence interval for cycles value: 130.47 113.64
95% mean confidence interval for cycles %change: 14.85% 12.72%
Cycles are helped.
total sends in shared programs: 730495 -> 730491 (<.01%)
sends in affected programs: 46 -> 42 (8.70%)
helped: 2
HURT: 0
Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8122757 -> 8122617 (<.01%)
instructions in affected programs: 14716 -> 14576 (0.95%)
helped: 46
HURT: 1
helped stats (abs) min: 1 max: 8 x̄: 3.07 x̃: 3
helped stats (rel) min: 0.36% max: 10.00% x̄: 2.54% x̃: 1.06%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 1.59% max: 1.59% x̄: 1.59% x̃: 1.59%
95% mean confidence interval for instructions value: 3.42 2.54
95% mean confidence interval for instructions %change: 3.28% 1.62%
Instructions are helped.
total cycles in shared programs: 188510100 -> 188509780 (<.01%)
cycles in affected programs: 58994 -> 58674 (0.54%)
helped: 32
HURT: 1
helped stats (abs) min: 2 max: 96 x̄: 10.06 x̃: 6
helped stats (rel) min: 0.05% max: 15.29% x̄: 1.37% x̃: 0.31%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.68% max: 0.68% x̄: 0.68% x̃: 0.68%
95% mean confidence interval for cycles value: 16.34 3.06
95% mean confidence interval for cycles %change: 2.46% 0.15%
Cycles are helped.
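The counting rule described in this commit can be illustrated with a toy model. The sketch below treats a block as a plain list of opcode names and skips the ops the message lists as foldable modifiers; the function names and the list representation are assumptions made for illustration, not the actual peephole_select code.

```python
# Ops that the commit message says usually fold into source or
# destination modifiers of another instruction.
FREE_OPS = {"fsat", "fneg", "fabs", "ineg", "iabs"}


def count_instructions(ops, skip_modifiers=True):
    """Count the ops in a block, optionally ignoring modifier-like ops."""
    if not skip_modifiers:
        return len(ops)
    return sum(1 for op in ops if op not in FREE_OPS)


def can_flatten(ops, limit, skip_modifiers=True):
    """A block is a candidate for conversion to selects if it fits the limit."""
    return count_instructions(ops, skip_modifiers) <= limit


# block_1 from the example in the commit message.
block_1 = ["fabs", "fneg", "fabs", "fneg", "fmul", "fsat"]

print(count_instructions(block_1, skip_modifiers=False))  # 6
print(count_instructions(block_1))                        # 1
```

With a limit of 1, the naive count rejects block_1 while the modifier-aware count accepts it, which is the case the message describes.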

Jordan Justen authored
Reworks:
* Adjust comment to list the state packets that curro found to be affected.
Fixes: 8125d796 ("intel/dev: Add preliminary device info for Tigerlake")
Cc: 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>

 02 Dec, 2019 16 commits


Marek Olšák authored
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>

Marek Olšák authored
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>

Jonathan Gray authored
pthread_getcpuclockid() and clock_gettime() are also available on at least OpenBSD, FreeBSD, NetBSD, DragonFly, Cygwin.
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>

Jonathan Gray authored
Make use of the futex syscall added in OpenBSD 6.2.
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>

Kenneth Graunke authored
Enabling this option makes Intel Gen8-11 hardware load the 'iris' driver by default instead of the older 'i965' driver.
Regardless of how this option is set, users can still override which driver the loader selects via two methods. The first is to create a ~/.drirc or /etc/drirc file with the following snippet:
    <driconf>
        <device driver="loader" kernel_driver="i915">
            <option name="dri_driver" value="i965" />
        </device>
    </driconf>
The other option is to set an environment variable:
    export MESA_LOADER_DRIVER_OVERRIDE=i965
For now, "prefer_iris" defaults to i965 (the historical choice). A separate future patch will change the default driver to iris.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1893
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
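The relationship between the build-time default and the environment override can be sketched as follows. This is a simplified model, not the loader's code: it covers only the MESA_LOADER_DRIVER_OVERRIDE variable and the build default, leaves the drirc method out, and uses a hypothetical function name and parameter.

```python
import os


def pick_driver(build_prefers_iris: bool, environ=None) -> str:
    """Return the driver name this simplified model would select.

    `build_prefers_iris` stands in for the build option described above;
    the parameter name is hypothetical.
    """
    env = os.environ if environ is None else environ
    override = env.get("MESA_LOADER_DRIVER_OVERRIDE")
    if override:
        # An explicit user override wins regardless of the build default.
        return override
    return "iris" if build_prefers_iris else "i965"


# Passing an explicit mapping keeps the example independent of the
# caller's real environment.
print(pick_driver(True, {"MESA_LOADER_DRIVER_OVERRIDE": "i965"}))  # i965
```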

Jonathan Marek authored
Fixes: df9f2adf ("turnip: add display wsi")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>

Rhys Perry authored
Some backends require that there are no array varyings. If there were no arrays in the input shader, the pass shouldn't have to create new ones.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2103
Fixes: bcd14756 ('nir/lower_io_to_vector: add flat mode')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

Rhys Perry authored
Improves generated code of dEQP-VK.graphicsfuzz.disc-and-add-in-func-in-loop because a loop exit phi can then be fixed to exec, removing copies and improving jump threading.
No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>

Rhys Perry authored
ACO considers discards to be jumps and creates edges in the CFG for them, but NIR does neither of these. This can be fixed instead by keeping track of whether a side of an IF had a break/discard, but this doesn't solve the issue with discards affecting loop exit phis. So this reworks phi handling a bit.
Fixes these tests:
dEQP-VK.graphicsfuzz.disc-and-add-in-func-in-loop
dEQP-VK.graphicsfuzz.loop-call-discard
dEQP-VK.graphicsfuzz.complex-nested-loops-and-call
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>

Rhys Perry authored
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>

Alejandro Piñeiro authored
Right now there are two copies of mm:
* mesa/main/mm.[ch]
* gallium/auxiliary/util/u_mm.[ch]
At some point they were split, and the commit message did not make clear why it was not possible to keep a single copy in a common place. Given that this was several years ago, I'm assuming that it was not possible then.
This change allows having one copy of the code and using it outside of mesa/main or gallium, if needed.
This commit moves u_mm and removes mm, as u_mm has slightly more changes.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

Rhys Perry authored
Fixes: 13ab63bb ('radv: Implement VK_EXT_buffer_device_address.')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Rhys Perry authored
Stronger ordering is implemented in SPIR-V->NIR with barriers.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

Rhys Perry authored
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

Kenneth Graunke authored
This exposes GL_TDFX_texture_compression_FXT1 support. It's ancient, only Intel GPUs appear to support it, and I seriously doubt anybody uses it. But i965 supports it, and it's trivial to do, so we may as well support it in the new iris driver as well.
Reviewed-by: Eric Anholt <eric@anholt.net>

Kenneth Graunke authored
Eric recently added PIPE_FORMAT_FXT1_RGB[A] as part of his format unification work. This was really most of the work of implementing the extension. We just need to handle it in a couple of places and expose the extension.
v2: Reject the new formats in llvmpipe_is_format_supported to prevent crashes because it doesn't know how to handle the new formats.
Reviewed-by: Marek Olšák <marek.olsak@amd.com> [v1]
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]

 01 Dec, 2019 1 commit


Dave Airlie authored
This allows this pass to be run multiple times, with the results simply or'ed together. It fixes one test on llvmpipe nir, and regresses none.
Suggested by Kenneth
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
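The idea of or'ing results across runs can be shown with a small accumulator. The sketch below is generic: the bit names, the op-to-bit table, and the function are hypothetical and do not describe the actual pass or its data structures.

```python
# Hypothetical usage bits, invented for this illustration.
USES_FRONT_FACE = 1 << 0
USES_VERTEX_ID = 1 << 1

# Hypothetical mapping from an op name to the bit it sets.
BIT_FOR_OP = {
    "load_front_face": USES_FRONT_FACE,
    "load_vertex_id": USES_VERTEX_ID,
}


def gather_info(ops, previous=0):
    """Return `previous` with a bit set for every recognised op in `ops`.

    Starting from `previous` instead of 0 means a later run adds to
    earlier results rather than replacing them.
    """
    bits = previous
    for op in ops:
        bits |= BIT_FOR_OP.get(op, 0)
    return bits


first = gather_info(["load_front_face"])
second = gather_info(["load_vertex_id"], previous=first)
assert second == USES_FRONT_FACE | USES_VERTEX_ID
```

Because the accumulation only ever sets bits, running it again on the same input leaves the result unchanged.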

 29 Nov, 2019 2 commits


Samuel Pitoiset authored
Moved to RADV.
No pipeline-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>

Samuel Pitoiset authored
This shouldn't introduce any functional changes for RadeonSI when NIR is enabled because these operations are already lowered.
pipeline-db (NAVI10/LLVM):
SGPRS: 9043 -> 9051 (0.09 %)
VGPRS: 7272 -> 7292 (0.28 %)
Code Size: 638892 -> 621628 (2.70 %) bytes
LDS: 1333 -> 1331 (0.15 %) blocks
Max Waves: 1614 -> 1608 (0.37 %)
Found this while glancing at some F12019 shaders.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
