Thursday, September 23, 2010

Mr Elusive strikes again!

Mr Elusive, or I should say J.M.P. van Waveren from Id Software, has released a new presentation about virtual texturing called "Using Virtual Texturing to Handle Massive Texture Data".

Most of it has been covered in previous presentations, but there are a couple of interesting tidbits which give some hints on how Id Software's virtual texture implementation works.

  • Their "sparse texture quad tree" is implemented as a MIP mapped texture, just like everybody else. I wondered about this because they never explicitly mentioned it before, and only mentioned a quad tree.
  • About feedback rendering, the presentation mentions that a "factor 10 smaller is OK" and "~ .5 msec on CPU for 80 x 60" (see the feedback sketch after this list), which surprises me, because a factor 10 smaller is definitely not okay in my Quake 4 test levels. But I suppose that if the artwork has been made to work with virtual texturing, it might actually work.
  • It also mentions that:
      • diffuse + specular + normal + alpha + power = 10 channels - notice the 10
      • 128k x 128k x 12 channels = 256 GB - now where did that 12 come from??
      • 53 GB DXT compressed (1 x DXT1 + 2 x DXT5)
      • use brute force scene visibility to throw away data - I was wondering about that; I could only think of brute force solutions, and I guess the same goes for Id!
      • down to 20 – 50 GB (uncompressed) - it says uncompressed, but it must still be DXT compressed, otherwise the numbers simply don't make sense
      • 4 – 10 GB DXT compressed - this must simply mean 'compressed' then.
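
Since they finally confirmed the MIP-mapped page table, here's a minimal sketch of why that ends up being the same thing as a sparse quad tree. This is purely my own CPU-side illustration with made-up names and sizes, not code from the presentation; in a shader the walk up the levels comes almost for free by sampling the page-table texture at a coarser MIP level.

    // A sketch (mine, not id's) of a page table stored per MIP level.
    // levels[0] is the finest level (1024 x 1024 entries for a 128k x 128k
    // virtual texture with 128 x 128 pages), the last level is 1 x 1.
    #include <cstdint>
    #include <vector>

    struct PageTableEntry
    {
        bool     resident;   // is a physical page mapped here?
        uint16_t physicalX;  // where the page lives in the physical texture
        uint16_t physicalY;
    };

    struct PageTable
    {
        std::vector<std::vector<PageTableEntry>> levels;
        std::vector<int> entriesPerSide;  // table dimension at each level

        // Walking up to the finest resident ancestor is exactly the quad
        // tree behaviour; a shader gets this by sampling the page-table
        // texture at a coarser MIP level.
        PageTableEntry Lookup(int pageX, int pageY, int level) const
        {
            for (int l = level; l < (int)levels.size(); ++l)
            {
                const int x = pageX >> (l - level);
                const int y = pageY >> (l - level);
                const PageTableEntry& e = levels[l][y * entriesPerSide[l] + x];
                if (e.resident)
                    return e;  // finest resident page covering this texel
            }
            return levels.back()[0];  // the 1 x 1 top level is always resident
        }
    };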
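
And for the feedback rendering numbers: the CPU part is essentially a scan over a tiny buffer to collect the set of pages the frame touched, something like the sketch below. The 32-bit packing and all the names are my own assumptions; the presentation doesn't go into that level of detail. At 80 x 60 there are only 4800 entries to look at, which is why ~0.5 ms sounds plausible.

    // A minimal sketch (my assumptions, not id's code) of analysing an
    // 80 x 60 feedback buffer that was rendered at a tenth of the frame
    // resolution and read back to the CPU. Each pixel is assumed to pack
    // a page ID and a MIP level into one 32-bit value.
    #include <cstddef>
    #include <cstdint>
    #include <unordered_set>
    #include <vector>

    struct PageRequest
    {
        uint32_t pageId;    // which virtual page was sampled
        uint32_t mipLevel;  // at which MIP level it was needed

        bool operator==(const PageRequest& o) const
        {
            return pageId == o.pageId && mipLevel == o.mipLevel;
        }
    };

    struct PageRequestHash
    {
        std::size_t operator()(const PageRequest& r) const
        {
            return (std::size_t(r.pageId) << 4) ^ r.mipLevel;
        }
    };

    // Collect the unique pages the last frame actually touched; this is
    // the part quoted at ~0.5 ms on one CPU core for 80 x 60 pixels.
    std::unordered_set<PageRequest, PageRequestHash>
    CollectPageRequests(const std::vector<uint32_t>& feedback)  // 80 * 60 entries
    {
        std::unordered_set<PageRequest, PageRequestHash> requests;
        for (uint32_t packed : feedback)
        {
            if (packed == 0)  // assumed sentinel: no virtual texture at this pixel
                continue;

            PageRequest r;
            r.mipLevel = packed & 0xF;  // low 4 bits: MIP level (assumed packing)
            r.pageId   = packed >> 4;   // remaining bits: page index (assumed packing)
            requests.insert(r);
        }
        return requests;  // hand these to the page streamer / cache
    }
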
So a page is 128 x 128 texels x 12 channels = 192 KB of raw data per page.
192 KB / ~9 KB ≈ 21.5x compression. (their worst case compression ratio)

(50 GB / 10 GB is only a 5x reduction, which is about what you'd expect from their extra compression stage on top of DXT rather than from compressing raw data, so that 50 GB must already be DXT compressed)
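
For what it's worth, here's the back-of-envelope version of that per-page math, with my own assumptions spelled out: 128 x 128 texel pages, one byte per channel, DXT1 at half a byte per texel, DXT5 at one byte per texel, and the ~5x extra compression implied by the 50 GB to 10 GB step.

    // Back-of-envelope per-page sizes; my numbers, not id's slides.
    #include <cstdio>

    int main()
    {
        const int texelsPerPage = 128 * 128;                         // 16384

        const int rawBytes = texelsPerPage * 12;                     // 12 channels -> 192 KB

        // 1 x DXT1 (0.5 byte/texel) + 2 x DXT5 (1 byte/texel each)
        const int dxtBytes = texelsPerPage / 2 + 2 * texelsPerPage;  // 40 KB

        // ~5x more compression on top of DXT (the 50 GB -> 10 GB step)
        const int finalBytes = dxtBytes / 5;                         // ~8 KB

        printf("raw   : %d KB\n", rawBytes / 1024);                  // 192 KB
        printf("DXT   : %d KB\n", dxtBytes / 1024);                  // 40 KB
        printf("final : ~%d KB  (~%dx total)\n",                     // ~8 KB, ~24x
               finalBytes / 1024, rawBytes / finalBytes);
        return 0;
    }

That comes out at roughly 24x overall, which is at least in the same ballpark as the worst case ratio above.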

It also mentions somewhere that "lossy compression is perfectly acceptable", so I'm guessing there are probably a lot of artifacts in their textures; otherwise they wouldn't be able to achieve such high compression ratios.

They also mention that decompressing costs "1 to 2 milliseconds per page on a single CPU core", which I assume is for their entire 12-channel page.

My results are about 3 ms per 4-channel texture, which for 12 channels would add up to 3 x 3 = 9 ms.
Of course, my compression code is rather unoptimized; I chose to focus on compression ratio and image quality, so I'm not at all surprised that they get far better results than my part-time experimental efforts.

Another thing is that they specifically mention read back buffers, so their implementation is not analytical after all.
Which makes me wonder about this screenshot ...


... which, incidentally, is the exact same screenshot as in the previous presentation. The MIP boundaries in it always lie exactly on the page boundaries, which is not at all how normal MIP boundaries would look.
So I'm thinking it's probably just an artificial visualization of the pages, not at all representative of their technology.

Afterwards it presents a high-level overview of their decompression pipeline, which I haven't had time to look at in detail yet.

Now, if you'll excuse me, I have to go back to my 2 simultaneous (and impossible) deadlines. Thank you.