Marc-André Lureau authored
Instead of polling the fences regularly, have a thread that blocks on a
single fence using a separate shared context, then uses eventfd to wake
up the main thread when something happens.

Inside the guest, glmark2 typically runs twice as fast with the thread
sync, although in general the performance gain seems to be about +30%.
The benefit is mostly for CPU-bound tasks (when the main thread hits
100%).

A naive perf stat of the vtest renderer running the glmark2 "build" test
with a fixed number of frames (500) gives the following data (disregard
the timing-related figures, since the renderer is started and stopped
manually):

without thread:

       3032.282265      task-clock (msec)         #    0.420 CPUs utilized
             4,277      context-switches          #    0.001 M/sec
               102      cpu-migrations            #    0.034 K/sec
             9,020      page-faults               #    0.003 M/sec
     7,884,098,254      cycles                    #    2.600 GHz
     4,440,126,451      stalled-cycles-frontend   #   56.32% frontend cycles idle
   <not supported>      stalled-cycles-backend
    11,024,091,578      instructions              #    1.40  insns per cycle
                                                  #    0.40  stalled cycles per insn
     1,091,831,588      branches                  #  360.069 M/sec
         5,426,846      branch-misses             #    0.50% of all branches

with thread:

       3403.592921      task-clock (msec)         #    0.452 CPUs utilized
             7,145      context-switches          #    0.002 M/sec
               410      cpu-migrations            #    0.120 K/sec
             6,191      page-faults               #    0.002 M/sec
     7,475,038,064      cycles                    #    2.196 GHz
     4,487,043,071      stalled-cycles-frontend   #   60.03% frontend cycles idle
   <not supported>      stalled-cycles-backend
     9,925,205,494      instructions              #    1.33  insns per cycle
                                                  #    0.45  stalled cycles per insn
       834,375,503      branches                  #  245.146 M/sec
         4,919,995      branch-misses             #    0.59% of all branches

Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
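The wake-up mechanism described in the message can be illustrated with a minimal, Linux-only sketch. This is not the code from the commit: the names fence_waiter, sync_thread and wait_one_fence are hypothetical, and a plain usleep() stands in for the blocking fence wait that the real sync thread would perform on its shared GL context. Only the eventfd hand-off between the two threads is shown.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical state shared between the main thread and the sync thread. */
struct fence_waiter {
    int efd;            /* eventfd used to wake the main thread */
    unsigned delay_us;  /* assumption: simulated time until the fence signals */
};

/* Sync thread: block until the "fence" completes, then notify via eventfd. */
static void *sync_thread(void *arg)
{
    struct fence_waiter *w = arg;
    uint64_t one = 1;

    assert(w->efd >= 0);
    usleep(w->delay_us);  /* stand-in for a blocking wait on one fence */

    /* Writing adds 1 to the eventfd counter and makes the fd readable. */
    if (write(w->efd, &one, sizeof(one)) != (ssize_t)sizeof(one))
        return (void *)-1;
    return NULL;
}

/*
 * Main-thread side: instead of polling fences on a timer, sleep in poll()
 * until the sync thread signals. Returns the number of notifications read
 * from the eventfd, or -1 on error.
 */
static int64_t wait_one_fence(unsigned delay_us)
{
    struct fence_waiter w = { .efd = eventfd(0, EFD_CLOEXEC),
                              .delay_us = delay_us };
    struct pollfd pfd;
    pthread_t tid;
    uint64_t count = 0;
    int64_t ret = -1;

    if (w.efd < 0)
        return -1;
    if (pthread_create(&tid, NULL, sync_thread, &w) != 0) {
        close(w.efd);
        return -1;
    }

    pfd.fd = w.efd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* The main thread uses no CPU here until the eventfd becomes readable. */
    if (poll(&pfd, 1, -1) == 1 &&
        read(w.efd, &count, sizeof(count)) == (ssize_t)sizeof(count))
        ret = (int64_t)count;  /* reading also resets the counter to 0 */

    pthread_join(tid, NULL);
    close(w.efd);
    return ret;
}
```

Calling wait_one_fence(1000) from a main() blocks for roughly a millisecond and then returns the number of notifications read. In a real renderer the eventfd would presumably be added to the main loop's existing poll set rather than waited on in isolation.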
89aea798