1. 06 Jun, 2013 3 commits
  2. 05 Jun, 2013 3 commits
  3. 04 Jun, 2013 10 commits
    • llvmpipe: improve alignment calculation for fetching/storing pixels · 008fd036
      Roland Scheidegger authored
      This was always doing per-pixel alignment, which isn't necessary except
      for the buffer case (due to the per-element offset). The disabled code
      for calculating it was incorrect because it assumed that the full block
      would always be fetched, which may not be the case, so fix this up.
      The original code failed, for instance, for r10g10b10a2: the alignment
      would have been calculated as 4 (block_width) * 4 (bytes), so 16, but the
      actual fetch may have only fetched 2 values at a time, hence only
      alignment 8. It is unclear what exactly would happen in this case
      (alignment larger than size to fetch).
      So just use the (already calculated) fetch size instead and get alignment
      from that, which should always work, no matter if fetching 1, 2 or 4 pixels.
      Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
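The arithmetic in the commit message above can be illustrated with a small sketch. It is not the llvmpipe code: the helper name `fetch_alignment` and its parameters are hypothetical, and rounding down to a power of two is an assumption made here so that the result is always a valid alignment.

```c
#include <assert.h>

/* Hypothetical helper (not from llvmpipe): derive an alignment in bytes
 * from the size of a single fetch rather than from the full block width. */
static unsigned
fetch_alignment(unsigned bytes_per_pixel, unsigned pixels_per_fetch)
{
   unsigned fetch_size = bytes_per_pixel * pixels_per_fetch;
   unsigned align = 1;

   assert(fetch_size > 0);

   /* Assumption: use the largest power of two not exceeding the fetch size. */
   while (align * 2 <= fetch_size)
      align *= 2;

   return align;
}
```

For the r10g10b10a2 case from the message (4 bytes per pixel), fetching 2 pixels at a time gives an alignment of 8 under this sketch, and only a 4-pixel fetch gives 16.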
    • llvmpipe: reduce alignment requirement for 1d resources from 4x4 to 4x1 · ffe2a1ca
      Roland Scheidegger authored
      For rendering to buffers, we cannot have any y alignment.
      So make sure that tile clear commands only clear up to the fb width/height,
      not more (do this for all resources actually as clearing more seems
      pointless for other resources too). For the jit fs function, skip execution
      of the lower half of the fragment shader for the 4x4 stamp completely;
      for depth/stencil, only load/store the values from the first row
      (replace the other row with undef).
      For the blend function, also only load half the values from fs output and
      replace the rest with undefs, so that everything still operates on the
      full 4x4 block and the code stays the same between 4x1 and 4x4 (except for
      load/store, of course, which also needs to skip these values (store) or
      replace them with undefs (load)), at the cost of slightly less optimal
      code being produced in some cases.
      Also reduce 1d and 1d array alignment, because they can be handled the
      same as buffers, so there is no need to waste memory.
      v2: don't try to run special blend code for 4x1, (very) slightly less
      complexity if we just use the same code as for 4x4 which may or may not
      make it easier to optimize in the future (as we care a lot more about 4x4
      performance than 1d).
      v2: don't use undef values for unused fs src outputs with llvm 3.1 as it
      apparently can trigger a bug in llvm.
      Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
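A minimal sketch of the two ideas in the commit above: clamping a tile clear to the framebuffer size, and using a height alignment of 1 instead of 4 for buffer-like resources. All names here (`TILE_SIZE`, `clear_extent`, `aligned_height`) and the tile size of 64 are assumptions for illustration, not identifiers taken from llvmpipe.

```c
#include <stdbool.h>

#define TILE_SIZE 64u   /* assumed tile size, for illustration only */

/* Round value up to a multiple of alignment (alignment must be a power of two). */
static unsigned
align_up(unsigned value, unsigned alignment)
{
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical: rows of storage needed. Buffers, 1d and 1d array resources
 * get no y alignment (4x1); everything else keeps the 4x4 block. */
static unsigned
aligned_height(unsigned height, bool is_1d_like)
{
   return align_up(height, is_1d_like ? 1u : 4u);
}

/* Hypothetical: number of pixels a tile clear may touch along one axis,
 * clamped so that it never goes past the framebuffer size. */
static unsigned
clear_extent(unsigned tile_start, unsigned fb_size)
{
   unsigned end = tile_start + TILE_SIZE;

   if (end > fb_size)
      end = fb_size;

   return end > tile_start ? end - tile_start : 0;
}
```

With a framebuffer height of 1 (a buffer render target), `clear_extent(0, 1)` yields a single row, so the clear never touches rows that were not allocated.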
    • llvmpipe: cleanup of generate_unswizzled_blend · ef3e8870
      Roland Scheidegger authored
      Some parameters were used inconsistently, for instance not using
      block_width/block_height/block_size for determining the number of pixels
      but rather relying on guesses from the number of fragment shaders etc.,
      so fix this up (no actual change in behavior since the block size stays
      fixed). (Most of the code would work with a different block_height,
      with three exceptions: the hacked r11g11b10 conversions and the twiddle
      code, which only work with block_height 2 not 1, and the blend vector
      type not being 128bit wide.)
      Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
    • gallivm: enhance special sse2 4x4f and 2x8f -> 1x16ub conversion · 44993c18
      Roland Scheidegger authored
      There's no good reason why it can't handle the 2x4f->1x8ub, 1x4f->1x4ub
      and 1x8f->1x8ub cases. There might be legitimate reasons why we don't have
      enough input vectors for a full destination vector, and using pack
      intrinsics should still be much better than using generic conversion
      (it looks like convert_alpha from the blend code might hit this, though
      I suspect it could be avoided).
      v2: add another test vector format to lp_test_conv so this gets tested.
      Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
    • gallivm: (trivial) fix lp_build_concat_n · ce82523d
      Roland Scheidegger authored
      The code was designed to handle no-op concat but failed (unless the
      caller was using same pointer for src and dst).
      Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
    • mesa: change MAX_PROGRAM_ADDRESS_REGS to 1, clamp to it in state tracker · f270baf0
      Brian Paul authored
      We've never properly supported more than one address register.  There
      isn't even a field in prog_src_register or prog_dst_register to indicate
      which address register to use if RelAddr!=0.
      In the state tracker, clamp MaxAddressRegs against MAX_PROGRAM_ADDRESS_REGS
      since many gallium drivers do support more.
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65226
      Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
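The clamp described in the commit above amounts to taking the minimum of the driver-reported limit and the core Mesa limit. The sketch below is illustrative only: `clamp_max_address_regs` is a hypothetical helper, while `MAX_PROGRAM_ADDRESS_REGS` and its value of 1 come from the commit title.

```c
#define MAX_PROGRAM_ADDRESS_REGS 1   /* value stated in the commit title */

/* Hypothetical helper: limit the address register count reported by a
 * gallium driver to what core Mesa can actually represent. */
static unsigned
clamp_max_address_regs(unsigned driver_max)
{
   return driver_max < MAX_PROGRAM_ADDRESS_REGS
             ? driver_max
             : MAX_PROGRAM_ADDRESS_REGS;
}
```

A driver reporting 4 address registers would end up exposing 1 under this sketch, and a driver reporting 0 would still expose 0.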
    • intel: Don't try to blorp or blit CopyTexSubImage(1D_ARRAY). · 2fd785d1
      Paul Berry authored
      Blorp and the hardware blitter can't be used to implement
      CopyTexSubImage when the image type is 1D_ARRAY, because of a
      coordinate system mismatch (the Y coordinate in the source image is
      supposed to be matched up to the Z coordinate in the destination
      The hardware blitter path (intel_copy_texsubimage) contained a perf
      debug warning for this case, but it failed to actually fall back.  The
      blorp path didn't even check.
      Fixes piglit test "copyteximage 1D_ARRAY".
      Reviewed-by: Eric Anholt <eric@anholt.net>
    • i965/gen6+: Fix multisample assertions in CopyTexSubImage hw blitter path. · 32d1f423
      Paul Berry authored
      Commit 045612c9 (intel: Add an assert for glCopyTexSubImage() being
      called on MSAA buffers) added an assertion to intel_copy_texsubimage()
      to make sure that multisampling was not in use, based on the
      assumption that glCopyTexSubImage() can't legally be used with multisampling.
      However, there is one case where glCopyTexSubImage() can legally be
      used with multisampling: when the source buffer is a multisampled
      window system buffer.  If the source and destination color formats
      don't match, the blorp path will fail, so intel_copy_texsubimage()
      will be called.  In this case, we need intel_copy_texsubimage() to
      return false so that we fall back to meta to do the copy.  (The
      multisampled source buffer won't cause a problem for the meta path,
      because it uses glReadPixels, which forces a multisample resolve).
      It's still safe to assert that the destination image is
      single-sampled, because it's not legal to call glCopyTexSubImage() on
      multisampled textures.
      Fixes some failures with piglit tests "copyteximage
      {1D,2D,CUBE,RECT,2D_ARRAY}" (with "samples=..." argument).
      Reviewed-by: Eric Anholt <eric@anholt.net>
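The control flow described in the commit above can be sketched as a small predicate. This is not the i965 code: `hw_blit_allowed` and its sample-count parameters are hypothetical, with a sample count of 0 or 1 standing for a single-sampled surface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate modelling the decision in the hardware blitter path.
 * Returning false means "fall back to the next path" (meta, per the commit). */
static bool
hw_blit_allowed(unsigned src_samples, unsigned dst_samples)
{
   /* Still safe to assert: the destination texture is never multisampled. */
   assert(dst_samples <= 1);

   /* A multisampled window system buffer is a legal source, but this path
    * cannot handle it, so report failure instead of asserting. */
   if (src_samples > 1)
      return false;

   return true;
}
```

The point of the sketch is that a multisampled source is handled as an ordinary fallback, while a multisampled destination remains a programming error.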
    • mesa: Prevent possible out-of-bounds read by save_SamplerParameterfv. · 7bafd88c
      Vinson Lee authored
      Fixes "Out-of-bounds access" defect reported by Coverity.
      Signed-off-by: Vinson Lee <vlee@freedesktop.org>
      Reviewed-by: Brian Paul <brianp@vmware.com>
    • i965: fix problem with constant out of bounds access (v3) · 0677ea06
      Dave Airlie authored
      Okay I now understand why Frank would want to run away, this is
      my attempt at fixing the CVE out of bounds access to constants
      outside the range. This attempt converts any illegal constants
      to constant 0, which is acceptable because the GL spec leaves such
      accesses as undefined behaviour.
      A future patch should add some debug for users to find this out,
      but this needs to be backported to stable branches.
      v2: drop the last hunk which was a separate fix (now in master).
      hopefully fix the indentations.
      v3: don't fail piglit, the whole 8/16 dispatch stuff was over
      my head, and I spent a while figuring it out, but this one is
      definitely safe, one piglit pass extra on my Ironlake.
      NOTE: This is a candidate for stable branches.
      Signed-off-by: Dave Airlie <airlied@redhat.com>
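The behaviour described in the commit above (an out-of-range constant reads as 0) can be shown with a scalar sketch. The i965 fix operates in the shader compiler backend, so this is only an illustration of the resulting semantics; `read_constant` and its parameters are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: read constants[index], returning 0 when the index
 * lies outside the uploaded range instead of reading out of bounds. */
static float
read_constant(const float *constants, size_t count, size_t index)
{
   if (index >= count)
      return 0.0f;   /* illegal access: substitute constant 0 */

   return constants[index];
}
```

Because `index` is unsigned here, a single comparison covers every illegal index.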
  4. 03 Jun, 2013 6 commits
  5. 30 May, 2013 1 commit
  6. 03 Jun, 2013 13 commits
  7. 01 Jun, 2013 4 commits
    • gallium: add support for layered rendering · 6b53e2b0
      Roland Scheidegger authored
      Since pipe_surface already has all the necessary fields no interface
      changes are necessary except adding a new shader semantic value
      (Note that what GL knows as "gl_Layer" variable d3d10 is naming
      v2: drop cap bit (just tied to geometry shader), add docs.
    • gallivm: fix out-of-bounds access with mirror_clamp_to_edge address mode · 458a9a0f
      Roland Scheidegger authored
      Surprising this bug survived so long, we were missing a clamp (in the
      linear filtering version).
      (Valgrind complained a lot about invalid reads with piglit texwrap,
      I've also seen spurious failures in this test which might have
      happened due to this. Valgrind probably didn't complain before the
      alignment reduction in llvmpipe to 4x4 since the test is using tiny
      textures so the reads were still always well within allocated area.)
      While here, also do an effective clamp (after half subtraction)
      of [0,length-0.5] instead of [0, length-1] which saves an instruction
      (the filtering weight could be different due to this, but only if
      both texels point to the same max texel so it doesn't matter).
      (Both changes are borrowed from PIPE_TEX_CLAMP_TO_EDGE case.)
      Note: This is a candidate for the stable branches.
      Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
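A scalar sketch of the coordinate computation described above, assuming normalized texture coordinates and a texture length of at least 1. The real gallivm code builds vectorized LLVM IR, so the function `mirror_clamp_to_edge_linear` and the `linear_coords` struct are hypothetical stand-ins that only show the clamping arithmetic.

```c
/* Hypothetical result type: the two texel indices and the lerp weight. */
struct linear_coords {
   int i0;
   int i1;
   float weight;
};

static struct linear_coords
mirror_clamp_to_edge_linear(float s, int length)
{
   struct linear_coords r;
   float len = (float)length;
   /* Mirror around zero, then scale to texel space. */
   float coord = (s < 0.0f ? -s : s) * len;

   /* Clamp to the texture length, subtract half a texel, clamp at zero:
    * the effective range after the subtraction is [0, length - 0.5]. */
   if (coord > len)
      coord = len;
   coord -= 0.5f;
   if (coord < 0.0f)
      coord = 0.0f;

   /* coord is non-negative here, so truncation is the same as floor. */
   r.i0 = (int)coord;
   r.weight = coord - (float)r.i0;

   /* The second texel must not go past the last valid index. */
   r.i1 = r.i0 + 1;
   if (r.i1 > length - 1)
      r.i1 = length - 1;

   return r;
}
```

At the upper edge both indices land on the last texel, so the nonzero weight there does not change the filtered result, which matches the remark in the commit message.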
    • llvmpipe: fix bogus assertions for buffer surfaces · f51fc7a7
      Roland Scheidegger authored
      One of the assertions made no sense for buffer rendertargets
      (due to the union), so drop it. (The same assertion is already present in
      the path for texture surfaces later.)
      v2: make assertion completely accurate (suggested by Jose).
      Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
    • i965: Fix haswell_upload_cut_index when there's no index buffer. · 4405ff40
      Kenneth Graunke authored
      brw->ib.type is reset to -1 at the start of each batch.  If there's no
      index buffer, it won't get updated to a sensible value, resulting in
      _mesa_primitive_restart_index's "Invalid index buffer type" assertion
      Fixes a regression since 7c87a3b5.
      NOTE: This is a candidate for the 9.1 branch (and should be squashed).
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65195
      Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>