Fixes some CTS regressions.
Fixes: e61a826f ("ac/llvm: fix pointer type for global atomics")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Tomeu: Small rebase fixups
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

This wires up the front-facing value as a sysval. I'd like to remove the other facing code, but I'd need to confirm that VMware doesn't use it first.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Just return 0 for unbound atomic operations.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

This is no longer used.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Now that the Mali T720 GPU is supported at the same level as the T760, test it on PINE64 H64 boards.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Over the past months, Panfrost has matured considerably, and several tests have stopped being flaky or no longer fail at all.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Support for this GPU is now equal to that of the T760, so whitelist it.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Make sure that the fragment is complete when writing it out.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

We need to always upload anyway.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Fixes dEQP-GLES3.functional.primitive_restart.*. Note that the 0x18000 value was somehow enabling primitive restart by accident. I'm not sure where this value came from, but let's not.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

The algorithm is as described. Nothing fancy here, just need to add some new code paths depending on which model we're running on.
Tomeu:
 - Also disable tiling when !hierarchy and !vertex_count
 - Avoid creating polygon lists smaller than the minimum when vertex_count > 0 but the tile size is smaller than 16 bytes
 - Take into account tile size when calculating polygon list size for !hierarchy
 - Allow 0-sized tiles in a single dimension
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

We've figured out most of the big pieces, and though it looks faintly like other Midgards, it's much simpler.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Similarly to how it's already done in the compiler, add a way to express differences between GPU models that need to be taken into account when assembling the cmdstream.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

Reviewed-by: Matt Turner <mattst88@gmail.com>
All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 14660366 -> 14653437 (-0.05%)
instructions in affected programs: 316166 -> 309237 (-2.19%)
helped: 905
HURT: 10
helped stats (abs) min: 1 max: 36 x̄: 7.67 x̃: 6
helped stats (rel) min: 0.13% max: 18.75% x̄: 4.28% x̃: 3.60%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.10% max: 1.33% x̄: 0.70% x̃: 0.97%
95% mean confidence interval for instructions value: -7.91 -7.23
95% mean confidence interval for instructions %change: -4.46% -3.99%
Instructions are helped.
total cycles in shared programs: 228571646 -> 228549759 (<.01%)
cycles in affected programs: 56239919 -> 56218032 (-0.04%)
helped: 681
HURT: 216
helped stats (abs) min: 1 max: 5156 x̄: 45.49 x̃: 10
helped stats (rel) min: <.01% max: 10.45% x̄: 1.29% x̃: 0.65%
HURT stats (abs)   min: 1 max: 320 x̄: 42.09 x̃: 14
HURT stats (rel)   min: <.01% max: 37.04% x̄: 1.38% x̃: 0.49%
95% mean confidence interval for cycles value: -41.51 -7.29
95% mean confidence interval for cycles %change: -0.80% -0.49%
Cycles are helped.
LOST: 1
GAINED: 0

Since a is nonnegative, neither fsqrt nor frsq should return NaN. frsq should only return Inf when fsqrt returns 0. The changes are pretty small, but this turns a few hundred hurt shaders in the next patch into helped shaders. An alternative to the intBitsToFloat is to import numpy and do np.finfo(np.float32).max. That's more explicit, but we may also want to have specific bit encodings of float values later. I could be convinced either way, but intBitsToFloat(0x7f7fffff) was what I implemented first.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 14661140 -> 14661104 (<.01%)
instructions in affected programs: 7520 -> 7484 (-0.48%)
helped: 36
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.32% max: 0.61% x̄: 0.49% x̃: 0.52%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %change: -0.52% -0.47%
Instructions are helped.
total cycles in shared programs: 228585416 -> 228584806 (<.01%)
cycles in affected programs: 56321 -> 55711 (-1.08%)
helped: 32
HURT: 0
helped stats (abs) min: 2 max: 98 x̄: 19.06 x̃: 10
helped stats (rel) min: 0.08% max: 6.41% x̄: 1.09% x̃: 0.65%
95% mean confidence interval for cycles value: -28.32 -9.80
95% mean confidence interval for cycles %change: -1.63% -0.54%
Cycles are helped.
Sandy Bridge
total cycles in shared programs: 152991077 -> 152991075 (<.01%)
cycles in affected programs: 11525 -> 11523 (-0.02%)
helped: 2
HURT: 2
helped stats (abs) min: 2 max: 4 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.07% max: 0.11% x̄: 0.09% x̃: 0.09%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.08% max: 0.08% x̄: 0.08% x̃: 0.08%
95% mean confidence interval for cycles value: -5.27 4.27
95% mean confidence interval for cycles %change: -0.16% 0.15%
Inconclusive result (value mean confidence interval includes 0).
No changes on Iron Lake or GM45.
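The bit-pattern approach mentioned above can be sketched in plain Python with the struct module. The helper name int_bits_to_float is chosen here for illustration and is not necessarily how the algebraic optimization script defines it; the point is only that 0x7f7fffff encodes the largest finite 32-bit float, the same value np.finfo(np.float32).max would give, without needing numpy.

```python
import struct


def int_bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit unsigned integer as an IEEE 754 binary32 value.

    Illustrative helper (name is an assumption, not taken from the source).
    """
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# 0x7f7fffff: exponent 0xfe, mantissa all ones, i.e. the largest finite float32.
FLT_MAX = int_bits_to_float(0x7F7FFFFF)

# The next bit pattern up, 0x7f800000, is +Inf.
POS_INF = int_bits_to_float(0x7F800000)

print(FLT_MAX, POS_INF)
```

Spelling constants as bit patterns also makes it straightforward to write other specific encodings later, which is the trade-off the message describes.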

I tried 2, 4, 6, 8, and 10. 8 seemed to be the sweet spot across all Intel platforms.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 14736141 -> 14661140 (-0.51%)
instructions in affected programs: 2272413 -> 2197412 (-3.30%)
helped: 8416
HURT: 140
helped stats (abs) min: 1 max: 1152 x̄: 8.99 x̃: 6
helped stats (rel) min: 0.13% max: 42.55% x̄: 4.15% x̃: 3.20%
HURT stats (abs)   min: 1 max: 140 x̄: 4.73 x̃: 1
HURT stats (rel)   min: 0.03% max: 3.44% x̄: 0.87% x̃: 0.60%
95% mean confidence interval for instructions value: -9.36 -8.17
95% mean confidence interval for instructions %change: -4.14% -3.99%
Instructions are helped.
total cycles in shared programs: 231560416 -> 228585416 (-1.28%)
cycles in affected programs: 126536021 -> 123561021 (-2.35%)
helped: 7092
HURT: 1898
helped stats (abs) min: 1 max: 419320 x̄: 519.02 x̃: 159
helped stats (rel) min: <.01% max: 77.25% x̄: 13.52% x̃: 11.77%
HURT stats (abs)   min: 1 max: 14518 x̄: 371.91 x̃: 36
HURT stats (rel)   min: <.01% max: 103.23% x̄: 5.92% x̃: 2.55%
95% mean confidence interval for cycles value: -514.34 -147.50
95% mean confidence interval for cycles %change: -9.69% -9.14%
Cycles are helped.
total spills in shared programs: 5763 -> 5848 (1.47%)
spills in affected programs: 1797 -> 1882 (4.73%)
helped: 13
HURT: 13
total fills in shared programs: 17163 -> 16931 (-1.35%)
fills in affected programs: 7214 -> 6982 (-3.22%)
helped: 22
HURT: 19
total sends in shared programs: 730410 -> 730246 (-0.02%)
sends in affected programs: 2705 -> 2541 (-6.06%)
helped: 114
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 1.44 x̃: 1
helped stats (rel) min: 0.60% max: 20.00% x̄: 7.26% x̃: 5.88%
95% mean confidence interval for sends value: -1.55 -1.33
95% mean confidence interval for sends %change: -7.90% -6.62%
Sends are helped.
LOST: 4
GAINED: 0
Sandy Bridge
total instructions in shared programs: 10760511 -> 10724637 (-0.33%)
instructions in affected programs: 961305 -> 925431 (-3.73%)
helped: 3734
HURT: 110
helped stats (abs) min: 1 max: 151 x̄: 9.66 x̃: 8
helped stats (rel) min: 0.14% max: 41.21% x̄: 4.93% x̃: 3.95%
HURT stats (abs)   min: 1 max: 20 x̄: 1.68 x̃: 1
HURT stats (rel)   min: 0.12% max: 5.41% x̄: 0.88% x̃: 0.52%
95% mean confidence interval for instructions value: -9.76 -8.91
95% mean confidence interval for instructions %change: -4.90% -4.63%
Instructions are helped.
total cycles in shared programs: 153359411 -> 152991077 (-0.24%)
cycles in affected programs: 11615401 -> 11247067 (-3.17%)
helped: 2725
HURT: 1138
helped stats (abs) min: 1 max: 2844 x̄: 164.27 x̃: 80
helped stats (rel) min: 0.02% max: 48.60% x̄: 7.47% x̃: 3.91%
HURT stats (abs)   min: 1 max: 4351 x̄: 69.69 x̃: 25
HURT stats (rel)   min: 0.02% max: 40.00% x̄: 3.39% x̃: 1.47%
95% mean confidence interval for cycles value: -103.18 -87.52
95% mean confidence interval for cycles %change: -4.57% -3.97%
Cycles are helped.
total sends in shared programs: 584038 -> 583855 (-0.03%)
sends in affected programs: 3512 -> 3329 (-5.21%)
helped: 157
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 1.17 x̃: 1
helped stats (rel) min: 2.38% max: 25.00% x̄: 6.52% x̃: 6.06%
95% mean confidence interval for sends value: -1.26 -1.07
95% mean confidence interval for sends %change: -7.17% -5.87%
Sends are helped.
LOST: 23
GAINED: 0
Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8122617 -> 8111592 (-0.14%)
instructions in affected programs: 380503 -> 369478 (-2.90%)
helped: 912
HURT: 86
helped stats (abs) min: 1 max: 129 x̄: 12.19 x̃: 9
helped stats (rel) min: 0.30% max: 39.21% x̄: 3.69% x̃: 2.57%
HURT stats (abs)   min: 1 max: 2 x̄: 1.05 x̃: 1
HURT stats (rel)   min: 0.12% max: 3.64% x̄: 0.54% x̃: 0.36%
95% mean confidence interval for instructions value: -12.00 -10.10
95% mean confidence interval for instructions %change: -3.56% -3.10%
Instructions are helped.
total cycles in shared programs: 188509780 -> 188534398 (0.01%)
cycles in affected programs: 7211542 -> 7236160 (0.34%)
helped: 859
HURT: 132
helped stats (abs) min: 2 max: 690 x̄: 46.59 x̃: 16
helped stats (rel) min: 0.01% max: 26.76% x̄: 1.53% x̃: 0.33%
HURT stats (abs)   min: 2 max: 1592 x̄: 489.67 x̃: 618
HURT stats (rel)   min: 0.03% max: 185.92% x̄: 23.35% x̃: 6.26%
95% mean confidence interval for cycles value: 9.58 40.10
95% mean confidence interval for cycles %change: 0.65% 2.93%
Cycles are HURT.

In many cases, fsat, fneg, fabs, ineg, and iabs will get folded into another instruction as either source or destination modifiers. Counting them as instructions means that some if-statements won't get converted to selects. For example,
    vec1 32 ssa_25 = flt32 ssa_0, ssa_23.x
    /* succs: block_1 block_2 */
    if ssa_25 {
        block block_1:
        /* preds: block_0 */
        vec1 32 ssa_26 = fabs ssa_24
        vec1 32 ssa_27 = fneg ssa_26
        vec1 32 ssa_28 = fabs ssa_20
        vec1 32 ssa_29 = fneg ssa_28
        vec1 32 ssa_30 = fmul ssa_27, ssa_29
        vec1 32 ssa_31 = fsat ssa_30
        /* succs: block_3 */
    } else {
        block block_2:
        /* preds: block_0 */
        /* succs: block_3 */
    }
    block block_3:
    /* preds: block_1 block_2 */
block_1 isn't really 6 instructions, but it will be counted that way. Most callers of the peephole_select pass use either 1 or 8. It's very easy to blow way past either of these limits with things that are really only one or two actual instructions. I also tried some fancier things like making sure the fsat was of another SSA def from the same block, but the simple test was actually better. The i965 backend SEL peephole pass still helps ~700 shaders in shader-db with this change.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
All Gen6+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 14743694 -> 14738910 (-0.03%)
instructions in affected programs: 156575 -> 151791 (-3.06%)
helped: 1204
HURT: 0
helped stats (abs) min: 1 max: 27 x̄: 3.97 x̃: 3
helped stats (rel) min: 0.15% max: 19.57% x̄: 5.15% x̃: 4.55%
95% mean confidence interval for instructions value: -4.12 -3.82
95% mean confidence interval for instructions %change: -5.35% -4.95%
Instructions are helped.
total cycles in shared programs: 231749141 -> 231602916 (-0.06%)
cycles in affected programs: 2818975 -> 2672750 (-5.19%)
helped: 876
HURT: 322
helped stats (abs) min: 2 max: 788 x̄: 180.99 x̃: 220
helped stats (rel) min: <.01% max: 43.82% x̄: 20.75% x̃: 19.44%
HURT stats (abs)   min: 1 max: 1188 x̄: 38.27 x̃: 20
HURT stats (rel)   min: 0.09% max: 102.67% x̄: 5.17% x̃: 1.70%
95% mean confidence interval for cycles value: -130.47 -113.64
95% mean confidence interval for cycles %change: -14.85% -12.72%
Cycles are helped.
total sends in shared programs: 730495 -> 730491 (<.01%)
sends in affected programs: 46 -> 42 (-8.70%)
helped: 2
HURT: 0
Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8122757 -> 8122617 (<.01%)
instructions in affected programs: 14716 -> 14576 (-0.95%)
helped: 46
HURT: 1
helped stats (abs) min: 1 max: 8 x̄: 3.07 x̃: 3
helped stats (rel) min: 0.36% max: 10.00% x̄: 2.54% x̃: 1.06%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.59% max: 1.59% x̄: 1.59% x̃: 1.59%
95% mean confidence interval for instructions value: -3.42 -2.54
95% mean confidence interval for instructions %change: -3.28% -1.62%
Instructions are helped.
total cycles in shared programs: 188510100 -> 188509780 (<.01%)
cycles in affected programs: 58994 -> 58674 (-0.54%)
helped: 32
HURT: 1
helped stats (abs) min: 2 max: 96 x̄: 10.06 x̃: 6
helped stats (rel) min: 0.05% max: 15.29% x̄: 1.37% x̃: 0.31%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.68% max: 0.68% x̄: 0.68% x̃: 0.68%
95% mean confidence interval for cycles value: -16.34 -3.06
95% mean confidence interval for cycles %change: -2.46% -0.15%
Cycles are helped.
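The counting rule described in this message can be illustrated with a small Python sketch. It is only a model of the idea: the real pass is written in C and checks more than opcode names, and the names FOLDABLE_OPS, count_block_cost, and can_flatten are hypothetical. Opcodes are represented as plain strings.

```python
# Opcodes the message says usually fold into source/destination modifiers.
FOLDABLE_OPS = frozenset({"fsat", "fneg", "fabs", "ineg", "iabs"})


def count_block_cost(opcodes, skip_foldable=True):
    """Count the instructions in a block, optionally ignoring foldable ops."""
    return sum(
        1 for op in opcodes if not (skip_foldable and op in FOLDABLE_OPS)
    )


def can_flatten(opcodes, limit, skip_foldable=True):
    """Return True if the block is small enough to turn into selects."""
    return count_block_cost(opcodes, skip_foldable) <= limit


# block_1 from the example in the message: six NIR instructions, one real ALU op.
block_1 = ["fabs", "fneg", "fabs", "fneg", "fmul", "fsat"]

print(count_block_cost(block_1, skip_foldable=False))  # naive count
print(count_block_cost(block_1))                       # modifiers ignored
print(can_flatten(block_1, limit=1))
```

With a limit of 1, the naive count rejects block_1 while the modifier-aware count accepts it, which is the behavior change the message argues for.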

Reworks:
 * Adjust comment to list the state packets that curro found to be affected.
Fixes: 8125d796 ("intel/dev: Add preliminary device info for Tigerlake")
Cc: 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>

pthread_getcpuclockid() and clock_gettime() are also available on at least OpenBSD, FreeBSD, NetBSD, DragonFly, and Cygwin.
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
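As a rough illustration of what the two calls provide together, here is a Python sketch that reads the CPU time consumed by the calling thread. It uses Python's thin wrappers time.pthread_getcpuclockid() and time.clock_gettime(), which are only available on Unix-like systems; the function name thread_cpu_time_seconds is made up for this example and does not correspond to any Mesa helper.

```python
import threading
import time


def thread_cpu_time_seconds() -> float:
    """Return the CPU time used so far by the calling thread, in seconds.

    Assumes a Unix-like platform where both wrappers are available.
    """
    # Ask for the CPU-time clock that belongs to this thread...
    clock_id = time.pthread_getcpuclockid(threading.get_ident())
    # ...and read it like any other POSIX clock.
    return time.clock_gettime(clock_id)


before = thread_cpu_time_seconds()
sum(i * i for i in range(200_000))  # burn a little CPU in this thread
after = thread_cpu_time_seconds()
print(f"thread CPU time used: {after - before:.6f} s")
```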

Make use of the futex syscall added in OpenBSD 6.2.
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>

Enabling this option makes Intel Gen8-11 hardware load the 'iris' driver by default instead of the older 'i965' driver. Regardless of how this option is set, users can still override which driver the loader selects via two methods. The first is to create a ~/.drirc or /etc/drirc file with the following snippet:
    <driconf>
        <device driver="loader" kernel_driver="i915">
            <option name="dri_driver" value="i965" />
        </device>
    </driconf>
The other option is to set an environment variable:
    export MESA_LOADER_DRIVER_OVERRIDE=i965
For now, "prefer_iris" defaults to i965 (the historical choice). A separate future patch will change the default driver to iris.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1893
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
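The selection behavior described above can be modeled as a short Python sketch. This is not the loader's code: the function name pick_intel_driver and its parameters are hypothetical, and the precedence between the environment variable and the drirc entry is an assumption made for illustration, since the message only says that both overrides work regardless of the build option.

```python
def pick_intel_driver(env, drirc_driver=None, prefer_iris=False):
    """Pick a driver name for Gen8-11 hardware (illustrative model only).

    env          -- mapping of environment variables
    drirc_driver -- value of the "dri_driver" drirc option, if any
    prefer_iris  -- the build-time option discussed in the message
    """
    # Assumed precedence: environment variable first, then drirc.
    override = env.get("MESA_LOADER_DRIVER_OVERRIDE")
    if override:
        return override
    if drirc_driver:
        return drirc_driver
    # No override: fall back to the build-time default.
    return "iris" if prefer_iris else "i965"


print(pick_intel_driver({}))                                        # default build
print(pick_intel_driver({}, prefer_iris=True))                      # option enabled
print(pick_intel_driver({"MESA_LOADER_DRIVER_OVERRIDE": "i965"},
                        prefer_iris=True))                          # user override
```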

Fixes: df9f2adf ("turnip: add display wsi")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>

Some backends require that there are no array varyings. If there were no arrays in the input shader, the pass shouldn't have to create new ones.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2103
Fixes: bcd14756 ('nir/lower_io_to_vector: add flat mode')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

Improves generated code of dEQP-VK.graphicsfuzz.disc-and-add-in-func-in-loop because a loop exit phi can then be fixed to exec, removing copies and improving jump threading. No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>

ACO treats discards as jumps and creates edges in the CFG for them, but NIR does neither of these. This could instead be fixed by keeping track of whether a side of an IF had a break/discard, but that doesn't solve the issue with discards affecting loop exit phis. So this reworks phi handling a bit.
Fixes these tests:
dEQP-VK.graphicsfuzz.disc-and-add-in-func-in-loop
dEQP-VK.graphicsfuzz.loop-call-discard
dEQP-VK.graphicsfuzz.complex-nested-loops-and-call
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>

Right now there are two copies of mm:
 * mesa/main/mm.[ch]
 * gallium/auxiliary/util/u_mm.[ch]
At some point they split, and from the commit message it was not clear why it was not possible to have only one copy in a common place. Given that this was several years ago, I'm assuming that it was not possible then. This change allows having a single copy of the code, and also makes it usable outside of mesa/main or gallium, if needed. This commit moves u_mm and removes mm, as u_mm has slightly more changes.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

Fixes: 13ab63bb ('radv: Implement VK_EXT_buffer_device_address.')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Stronger ordering is implemented in SPIR-V->NIR with barriers.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

This exposes GL_TDFX_texture_compression_FXT1 support. It's ancient, only Intel GPUs appear to support it, and I seriously doubt anybody uses it. But i965 supports it, and it's trivial to do, so we may as well support it in the new iris driver as well.
Reviewed-by: Eric Anholt <eric@anholt.net>

Eric recently added PIPE_FORMAT_FXT1_RGB[A] as part of his format unification work. This was really most of the work of implementing the extension. We just need to handle it in a couple of places and expose the extension.
v2: Reject the new formats in llvmpipe_is_format_supported to prevent crashes because it doesn't know how to handle the new formats.
Reviewed-by: Marek Olšák <marek.olsak@amd.com> [v1]
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]

This allows this pass to be run multiple times, with the results simply or'ed together. It fixes one test on llvmpipe nir, and regresses none.
Suggested by Kenneth
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Moved to RADV. No pipeline-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>

This shouldn't introduce any functional changes for RadeonSI when NIR is enabled because these operations are already lowered.
pipeline-db (NAVI10/LLVM):
SGPRS: 9043 -> 9051 (0.09 %)
VGPRS: 7272 -> 7292 (0.28 %)
Code Size: 638892 -> 621628 (-2.70 %) bytes
LDS: 1333 -> 1331 (-0.15 %) blocks
Max Waves: 1614 -> 1608 (-0.37 %)
Found this while glancing at some F12019 shaders.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
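The percentages in the summary above look like plain relative changes between the before and after totals. As a sanity check, the following Python sketch recomputes them under that assumption, taking (new - old) / old as the formula; the function name percent_change is illustrative and not part of any pipeline-db tooling.

```python
def percent_change(old: int, new: int) -> float:
    """Relative change from old to new, in percent (assumed formula)."""
    return (new - old) / old * 100.0


# Before/after totals copied from the summary in the message.
stats = {
    "SGPRS": (9043, 9051),
    "VGPRS": (7272, 7292),
    "Code Size": (638892, 621628),
    "LDS": (1333, 1331),
    "Max Waves": (1614, 1608),
}

for name, (old, new) in stats.items():
    print(f"{name}: {old} -> {new} ({percent_change(old, new):.2f} %)")
```

Under this reading, register usage goes up slightly while code size shrinks by about 2.7 %, which matches the magnitudes reported in the message.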
