• Jason Ekstrand's avatar
    intel/fs/copy-prop: Don't walk all the ACPs for each instruction · f8bda818
    Jason Ekstrand authored
    In order to set up KILL sets, the dataflow code was walking the entire
    array of ACPs for every instruction.  If you assume the number of ACPs
    increases roughly with the number of instructions, this is O(n^2).  As
    it turns out, regions_overlap() is not nearly as cheap as one would like
    and shows up as a significant chunk on perf traces.
    This commit changes things around and instead first builds an array of
    exec_lists which it uses like a hash table (keyed off ACP source or
    destination) similar to what's done in the rest of the copy-prop code.
    By first walking the list of ACPs and populating the table and then
    walking instructions and only looking at ACPs which probably have the
    same VGRF number, we can reduce the complexity to O(n).  This takes the
    execution time of the piglit vs-isnan-dvec test from about 56.4 seconds
    on an unoptimized debug build (what we run in CI) with NIR_VALIDATE=0 to
    about 38.7 seconds.
    Reviewed-by: default avatarKenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: default avatarMatt Turner <mattst88@gmail.com>