What EXA gets right already

I've been writing various posts about EXA for a couple of months now. And for the most part, they've been fairly negative, (showing big slowdowns compared to running an X server without acceleration at all, for example).

As I've talked to people that have read the posts, it's clear that I've managed to spread some misconceptions. So let me clear things up now:

The reason my posts have focused on negative performance aspects is because I was looking for things that could be sped up, and as is only appropriate I looked for, found, and have been focused on the biggest performance problem with EXA I could find, (which turns out to be glyph rendering).

So, briefly here, I want to mention a couple of things that EXA is doing a fine job with. The first is the big reason why you don't want to run an X server with NoAccel: scrolling will hurt very badly. Take a look at these rates for a window-to-window copy of a rectangle of various sizes. These results are from "x11perf -copywinwinX" and multiplied by the number of pixels in each operation.

[All tests here are with very recent checkouts of xserver, mesa, and xf86-video-intel. Tests are run on an Intel Core 2 CPU @ 2.13GHz with an Intel 965 graphics card. Thanks, Intel for the donation of hardware for this testing!]

window-copy.png

Window-to-window copy performance (Millions of pixels/sec.)
Rectangle size 10x10 100x100 500x500
NoAccel 14.2 26.5 23.475
XAA 57.8 438 587.5
EXA 77.6 464 587.5

So here we can see that EXA is from 5 to 25 times faster for scrolling windows, depending on the size. And I can assure you that you definitely don't want windows to start scrolling 25 times slower (chug, chug, chug). Meanwhile, EXA is marginally faster than XAA on this test, but not significantly.

Second, let's look at another common operation, filling solid rectangles. This is an essential step in almost any rending, (for clearing to the background color), as well as for actually rendering some content. These results are from "x11perf -rectX", again multiplied by the number of pixels in each operation.

rectangle-fill.png

Solid rectangle fill rate (Millions of pixels/sec.)
Rectangle size 1x1 10x10 100x100 500x500
NoAccel 10.5 99.6 392 662.5
XAA 1.5 90.9 698 842.5
EXA 2.5 250 1150 847.5

Again, EXA outperforms NoAccel here (from 1.3x to 2.9x faster), for all but the tiniest of rectangles. Interestingly, EXA also outperforms XAA by up to 2.7x for the 10x10 rectangle. Also, it's quite interesting to note, (and it's hard to see on the bar chart), that NoAccel outperforms EXA (4.2x) and XAA (7x) for the case of a 1x1 rectangle. Presumably the overhead of setting up hardware rendering for a single-pixel object just plain isn't worth it, (which really shouldn't be that surprising).

So those are a couple of the operations where EXA and XAA are already performing quite well. Some of you will note that Keith Packard has often joked that an X server doesn't need more acceleration than these two operations to perform well. And if you look at the whole set of operations in the XAA interface, indeed you'll find many there that modern applications won't use at all.

But meanwhile, applications are now using the Render extension more and more extensively to draw things. And this is where EXA should afford some acceleration possibilities that XAA does not. And this is also where I've been identifying several problems. If "copy" and "solid fill" are the two most fundamental operations, maybe the next two are "compositing" and "compositing glyphs". I've been talking about problems in those operations for a while, and I plan to start talking about actual solutions soon.

Stay tuned.