In earlier efforts I compared the performance of the EXA acceleration architecture to the older XAA architecture, as well as to running the X server with no acceleration enabled at all.
Some of the results I found were startling and discouraging (with EXA performing several times slower than even NoAccel in some cases). As I drilled further, one obvious question arose: Was I seeing significant performance problems that would affect real-world cases? Or was it just that the synthetic microbenchmarks in cairo's performance test suite happened to exercise corner cases that wouldn't cause problems in practice?
So before going further with those results, I decided to step back and measure some real-world loads with and without EXA. Thanks to some help from Robert and Vladimir I was able to get Mozilla's Trender benchmark up and running. And thanks to Keith Packard of Intel, I'm now testing on an Intel 965 chip in addition to the old ATI r100 in my laptop that I was using before.
The Trender benchmark measures rendering time for many different real-world web pages, SVG files, and some synthetic loads. Mean times are reported for several different subsets of the tests as well as one mean time over all the tests.
All the details and charts are below, but I'll deliver the punchline here. For the Mozilla Trender benchmark, EXA is almost always a slowdown compared to NoAccel (for either i965 or r100). And for the i965, XAA is also always a slowdown, and a dramatic one for the SVG case (which is gearflowers.svg). Interestingly, the SVG case on the i965 is also the one case where EXA is able to match the NoAccel performance.
I haven't tracked down the cause of these slowdowns yet (stay tuned for that), and it's possible that Mozilla could be doing something different to help. But more and more it looks like there are some basic things missing in EXA. Again, hopefully this means there's some low-hanging fruit here that will be easy to optimize.
Here are the configuration details and results for the Intel 965:
- xserver: 0375009a (May 17 commit)
- xf86-video-ati: aea801cf (Apr. 13 commit)
- firefox: 3.0a6pre (June 17 nightly build)
Test | Tbox | TboxGFX | English | Foreign | SVG | ALL |
---|---|---|---|---|---|---|
NoAccel | 21.859 | 44.698 | 12.110 | 41.205 | 474.750 | 24.176 |
XAA | 28.458 | 221.035 | 18.144 | 43.614 | 1075.306 | 32.997 |
EXA | 100.777 | 133.532 | 83.543 | 101.258 | 473.111 | 87.740 |
And here is the same for the r100:
- xserver: 3c982bc1 (May 24 commit)
- xf86-video-intel: d1723445 (May 23 commit)
- firefox: 3.0a6pre (June 17 nightly build)
Test | Tbox | TboxGFX | English | Foreign | SVG | ALL |
---|---|---|---|---|---|---|
NoAccel | 68.891 | 46.772 | 49.668 | 71.574 | 1126.222 | 55.282 |
XAA | 55.757 | 43.344 | 38.190 | 60.322 | 1137.000 | 45.493 |
EXA | 141.928 | 99.445 | 125.808 | 143.801 | 1761.917 | 120.152 |
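
To make the size of these slowdowns easier to read off, here is a small sketch that divides each XAA and EXA time by the corresponding NoAccel time. The numbers are copied from the two tables above; the names `COLUMNS`, `RESULTS`, and `slowdown_vs_noaccel` are my own for illustration and are not part of Trender. It assumes lower times are better, so a ratio above 1.0 means slower than NoAccel.

```python
# Mean Trender times copied from the two tables above (lower is better).
COLUMNS = ["Tbox", "TboxGFX", "English", "Foreign", "SVG", "ALL"]

RESULTS = {
    "i965": {
        "NoAccel": [21.859, 44.698, 12.110, 41.205, 474.750, 24.176],
        "XAA": [28.458, 221.035, 18.144, 43.614, 1075.306, 32.997],
        "EXA": [100.777, 133.532, 83.543, 101.258, 473.111, 87.740],
    },
    "r100": {
        "NoAccel": [68.891, 46.772, 49.668, 71.574, 1126.222, 55.282],
        "XAA": [55.757, 43.344, 38.190, 60.322, 1137.000, 45.493],
        "EXA": [141.928, 99.445, 125.808, 143.801, 1761.917, 120.152],
    },
}


def slowdown_vs_noaccel(chip, mode):
    """Return {column: time(mode) / time(NoAccel)} for one chip.

    Values above 1.0 mean the mode is slower than NoAccel.
    """
    base = RESULTS[chip]["NoAccel"]
    times = RESULTS[chip][mode]
    return {col: t / b for col, t, b in zip(COLUMNS, times, base)}


if __name__ == "__main__":
    for chip in RESULTS:
        for mode in ("XAA", "EXA"):
            ratios = slowdown_vs_noaccel(chip, mode)
            row = "  ".join(f"{col}={r:.2f}x" for col, r in ratios.items())
            print(f"{chip} {mode}: {row}")
```

On the ALL column this works out to EXA being roughly 3.6 times slower than NoAccel on the i965 and roughly 2.2 times slower on the r100.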