Real-world tests: Mozilla Trender

In earlier efforts I've been comparing the performance of the EXA acceleration architecture to the older XAA architecture as well as to having no acceleration enabled in the X server at all.

Some of the results I found were startling and discouraging, with EXA performing several times slower than even NoAccel in some cases. As I drilled further, one obvious question arose: Was I seeing significant performance problems that would affect real-world cases? Or was it just that the synthetic micro-benchmarks in cairo's performance test suite happened to exercise corner cases that wouldn't cause problems in practice?

So before going further with those results, I decided to step back and measure some real-world loads with and without EXA. Thanks to some help from Robert and Vladimir I was able to get Mozilla's Trender benchmark up and running. And thanks to Keith Packard of Intel, I'm now testing on an Intel 965 chip in addition to the old ATI r100 in my laptop that I was using before.

The Trender benchmark measures rendering time for many different real-world web pages, SVG files, and some synthetic loads. Mean times are reported for several different subsets of the tests as well as one mean time over all the tests.

All the details and charts are below, but I'll deliver the punchline here. For the Mozilla Trender benchmark, EXA is almost always a slowdown compared to NoAccel (for either i965 or r100). And for the i965, XAA is also always a slowdown, and a dramatic slowdown for the SVG case (which is gearflowers.svg). Interestingly, the SVG case on the i965 is also the one case where EXA is able to match the NoAccel performance.

I haven't tracked down the cause of these slowdowns yet (stay tuned for that), and it's possible that Mozilla could be doing something different to help. But more and more it looks like there are some basic things missing in EXA. Again, hopefully this means there's some low-hanging fruit here that will be easy to optimize.

Here are the configuration details and results for the Intel 965:

i965.png

Test     Tbox     TboxGFX  English  Foreign  SVG       ALL
NoAccel  21.859   44.698   12.110   41.205   474.750   24.176
XAA      28.458   221.035  18.144   43.614   1075.306  32.997
EXA      100.777  133.532  83.543   101.258  473.111   87.740

And here is the same for the r100:

r100.png

Test     Tbox     TboxGFX  English  Foreign  SVG       ALL
NoAccel  68.891   46.772   49.668   71.574   1126.222  55.282
XAA      55.757   43.344   38.190   60.322   1137.000  45.493
EXA      141.928  99.445   125.808  143.801  1761.917  120.152
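
The raw times are easier to compare as ratios against NoAccel. Here is a minimal sketch in Python that does that arithmetic, with the numbers copied from the two tables above. The names (RESULTS, slowdown_vs_noaccel) are just illustrative and are not part of Trender or any other tool.

```python
# Mean times copied from the two tables above (lower is better).
COLUMNS = ["Tbox", "TboxGFX", "English", "Foreign", "SVG", "ALL"]

RESULTS = {
    "i965": {
        "NoAccel": [21.859, 44.698, 12.110, 41.205, 474.750, 24.176],
        "XAA": [28.458, 221.035, 18.144, 43.614, 1075.306, 32.997],
        "EXA": [100.777, 133.532, 83.543, 101.258, 473.111, 87.740],
    },
    "r100": {
        "NoAccel": [68.891, 46.772, 49.668, 71.574, 1126.222, 55.282],
        "XAA": [55.757, 43.344, 38.190, 60.322, 1137.000, 45.493],
        "EXA": [141.928, 99.445, 125.808, 143.801, 1761.917, 120.152],
    },
}


def slowdown_vs_noaccel(chip):
    """Return {arch: {column: time / NoAccel time}} for one chip.

    A ratio above 1.0 means the architecture is slower than NoAccel.
    """
    baseline = RESULTS[chip]["NoAccel"]
    return {
        arch: {col: t / b for col, t, b in zip(COLUMNS, times, baseline)}
        for arch, times in RESULTS[chip].items()
        if arch != "NoAccel"
    }


if __name__ == "__main__":
    for chip in RESULTS:
        print(chip)
        for arch, ratios in slowdown_vs_noaccel(chip).items():
            row = "  ".join(f"{col}={r:.2f}x" for col, r in ratios.items())
            print(f"  {arch:4s} {row}")
```

Any ratio above 1 is a slowdown relative to NoAccel. Over all tests, EXA comes out roughly 3.6 times slower than NoAccel on the i965 and roughly 2.2 times slower on the r100.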