Just a thought I wanted to share about rendering lighting into a virtual texture (dynamic surface caching).
Compared to deferred rendering, it should be slower in the worst case.
But the worst case will always be bounded by the size of the page cache texture multiplied by the number of lights per page.
The best case is much better because lighting could theoretically be cached per page, as long as the lights that touch that page don't move or change.
However, unless you're moving *a lot* of lights around on screen, dynamic surface caching might always be faster than deferred rendering when you have to render the scene multiple times, which is what you'd need to do when rendering stereoscopic 3D for a 3D TV.
This is because when you're done with rendering the lighting into the page cache texture, you only need to render the scene as if you're rendering the geometry with only one texture, period.
Of course I'm simplifying this a little bit, because in the real world you also have post processing.
But it's still an interesting property.
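A minimal sketch of the caching idea, assuming lighting cost is counted in (page, light) passes. The class name `LightingPageCache` and its methods are hypothetical and only illustrate the bookkeeping: a page is relit only when a light that touches it has changed, and the worst case is every resident page relit by every light that touches it.

```python
class LightingPageCache:
    """Tracks which resident pages need relighting (hypothetical structure)."""

    def __init__(self, lights_per_page):
        # page id -> set of ids of the lights that touch that page
        self.lights_per_page = lights_per_page
        # Nothing is lit yet, so every resident page starts out dirty.
        self.dirty = set(lights_per_page)

    def light_changed(self, light_id):
        # A moved or modified light invalidates only the pages it touches.
        for page, lights in self.lights_per_page.items():
            if light_id in lights:
                self.dirty.add(page)

    def relight(self):
        # Cost of this update, counted in (page, light) passes.
        cost = sum(len(self.lights_per_page[p]) for p in self.dirty)
        self.dirty.clear()
        return cost

    def worst_case(self):
        # Upper bound: every resident page relit by every light touching it.
        return sum(len(lights) for lights in self.lights_per_page.values())
```

Once `relight()` has run, drawing the scene for a second eye needs no further lighting work in this model, which is the property described above.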
Saturday, June 26, 2010
Friday, June 25, 2010
Borders? Borders? I don't need no stinkin' borders (on disk)
Anyone who's read any paper about virtual texturing knows that you need to put borders around your pages, otherwise you get ugly seams because of bilinear interpolation.
This is because your page might be anywhere in your page texture, neighbouring other pages which have completely different colors, and when you sample an interpolated pixel near the border of the page, it interpolates between pixels of unrelated pages.
Usually the borders are built at preprocessing time, where the first 4 horizontal or vertical pixel lines are duplicated from a neighbouring page to serve as a border in the former page, and vice versa.
Today I realized that the pages on disk don't need any borders at all, because you will always have all the information available to you to build those borders at upload time.
For example:
In this case, we've got a polygon crossing two pages, which are both visible, and both loaded from disk.
Here we can just copy the border pixels from both pages to the other one.
So what about this case?
Here we've got the same polygon crossing the same two pages, but one isn't loaded yet (or it's not visible, in which case we really don't care), and we only have a lower resolution version of that page.
So what are we going to do now?
Well, we simply up-sample the lower resolution page, and then just copy border pixels from that page as before.
And when you think about it, this is actually more correct, because if we used a preprocessed border it would have been a higher resolution border, and we'd actually get a (very subtle) seam here because the neighbouring page is different!
Bottom line:
- Borders do not need to be stored, less disk space required.
- The most correct resolution border is always in memory anyway.
- Preprocessing is simplified.
Downsides:
- More bookkeeping / work at runtime (but not much).
- Borders of adjacent pages need to be updated when you load in a higher resolution page.
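The upload-time border building described above could be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: pages are square lists of rows, only left/right borders are shown (top/bottom work the same way on rows), a missing neighbour is replaced by nearest-neighbour up-sampling of the matching quadrant of the next coarser page, and all function names are hypothetical.

```python
BORDER = 4  # border width in texels, as mentioned in the post


def upsample_nearest(page, factor):
    """Nearest-neighbour up-sample of a page stored as a list of rows."""
    height, width = len(page), len(page[0])
    return [[page[y // factor][x // factor] for x in range(width * factor)]
            for y in range(height * factor)]


def stand_in_from_coarser(coarse, qx, qy):
    """Full-size stand-in for a page that is not loaded, built from quadrant
    (qx, qy) of the page one mip level coarser (assumed to be resident)."""
    half = len(coarse) // 2
    quadrant = [row[qx * half:(qx + 1) * half]
                for row in coarse[qy * half:(qy + 1) * half]]
    return upsample_nearest(quadrant, 2)


def right_border_from(neighbour, border=BORDER):
    """Columns a page needs on its right edge: its right neighbour's first columns."""
    return [row[:border] for row in neighbour]


def left_border_from(neighbour, border=BORDER):
    """Columns a page needs on its left edge: its left neighbour's last columns."""
    return [row[-border:] for row in neighbour]


def with_horizontal_borders(page, left=None, right=None, border=BORDER):
    """Return the page widened with left/right borders.
    Where no neighbour data is given, the edge texel is repeated (clamped)."""
    rows = []
    for y, row in enumerate(page):
        l = left[y] if left else [row[0]] * border
        r = right[y] if right else [row[-1]] * border
        rows.append(l + row + r)
    return rows
```

When the higher resolution neighbour arrives later, the same functions can be run again to refresh the borders of the adjacent pages, which is the extra runtime bookkeeping listed under the downsides.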
Saturday, June 19, 2010
Virtual texture reflections, random thought
A thought popped up in my head this morning.
I'm wondering if the following scheme would work.
For all reflecting static objects the world would be rendered around it, into a paraboloid environment map.
Actually, only the part of the world facing the camera, so the paraboloid environment map would be aligned with the camera in the opposite direction.
These maps would be relatively small, maybe virtual texture page sized (128 x 128), and rendered from a specified point within the reflected object.
Instead of storing the colors in the environment map, we store the virtual texture texel coordinates, which are always unique.
Using this, we can then bounce between the environment maps a couple of times using the surface normal of the texel and the destination texel in the environment maps.
If a texel doesn't have an environment map, it would just use that texel.
After x bounces, it would just use some default color.
Obviously the texel coordinates would be an approximation, so some blur would have to be applied afterwards, and more blur would be required at grazing angles.
Would this work/look good enough? I don't know.
Would it be fast? Probably not, but I think it would beat ray tracing.
Obviously there would be a lot of cost in the form of visible surface determination (VSD), draw calls and fill-rate.
Update:
Whoops, because of reflections you'd be able to see the back of a reflected object, which would require a second environment map to be rendered in the direction of the reflection.
Damn you reflection angles! Damn you to heck.
However, it might be possible to make some sort of simplified reflection graph and figure out which environment maps you'd need to render.
Which one to use and when would be more complicated however.
And it certainly won't help with performance.
Update 2:
Maybe objects could be pre-split into several environment maps, and we could then perform back-face / frustum culling etc. on these maps.
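For reference, the lookup step of such a scheme could look roughly like this. It is a sketch under assumptions: the map covers the hemisphere around its local +z axis, directions are unit length and already expressed in the map's own frame, the standard paraboloid projection is used, and the function names and the 128 texel default are only illustrative.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def reflect(incident, normal):
    """Mirror `incident` about `normal` (both unit length): r = i - 2 (n . i) n."""
    d = sum(i * n for i, n in zip(incident, normal))
    return tuple(i - 2.0 * d * n for i, n in zip(incident, normal))


def paraboloid_texel(direction, size=128):
    """Map a unit direction in the front hemisphere (z >= 0) to a texel.
    Returns None for directions on the back hemisphere, which this single
    map cannot represent."""
    x, y, z = direction
    if z < 0.0:
        return None
    u = 0.5 + 0.5 * x / (1.0 + z)
    v = 0.5 + 0.5 * y / (1.0 + z)
    return (min(int(u * size), size - 1), min(int(v * size), size - 1))
```

The `None` case is exactly the problem raised in the first update: a reflection vector that points into the back hemisphere needs a second map.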
Thursday, June 17, 2010
Texture compression II
I've made some more progress on the compression code.
I have >6000 128x128 pages, created from Quake 4 textures that I'm using for testing purposes.
Right now the average page size is roughly 2 KB.
There's one page of 10 KB and two more that are almost 9 KB; other than that, none gets higher than 6 KB.
I could get more compression out of this, but not without degrading quality beyond where I would like to go.
Average decompression time is 8 ms, but I haven't made any attempt at optimizing.
I can compress all the pages combined to roughly 1/31 of their original size.
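As a quick sanity check on these numbers, the arithmetic can be written out. The 4 bytes per texel is an assumption (8-bit RGBA, since 'diffuse' includes the alpha channel here), and the function name is hypothetical.

```python
PAGE_SIDE = 128
BYTES_PER_TEXEL = 4  # assumption: 8-bit RGBA, diffuse plus alpha


def average_compressed_bytes(ratio, side=PAGE_SIDE, bytes_per_texel=BYTES_PER_TEXEL):
    """Average compressed page size implied by an overall compression ratio."""
    raw_bytes = side * side * bytes_per_texel  # 65536 bytes, i.e. 64 KB per page
    return raw_bytes / ratio
```

With a ratio of 31 this gives a little over 2100 bytes per page, which agrees with the average of roughly 2 KB.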
I'm wondering how id Software do their compression though.
According to their latest presentation (id Tech 5 Challenges: From Texture Virtualization to Massive Parallelization) they compress each page to about 2-6 KB, but that's diffuse, normal + specular.
But I'm only compressing diffuse here.
I know they're using DCT, they even have some papers published about older versions of their compression routines, but so far I haven't been able to get it down to these sizes without seriously degrading quality.
Perhaps, because of all the real-time lighting etc., it's perfectly fine to really compress textures all the way down to the point where the artefacts are really obvious, simply because they won't be as visible in a scene as they would be in a 2D picture.
Perhaps I'm nitpicking too much when it comes to quality.
I'm also using SSIM for texture quality assessments now.
It's not perfect, but better than PSNR.
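To make the two metrics concrete, here is a small sketch of both, operating on flat lists of 8-bit channel values. Note the simplification: real SSIM is computed over local windows and averaged, whereas `global_ssim` below applies the formula once to the whole image, so it is only meant to show the structure of the metric, not to reproduce the tool's numbers.

```python
import math


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)


def global_ssim(a, b, peak=255.0):
    """SSIM formula evaluated once over the whole image (simplified)."""
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2  # usual stabilising constants
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))
```

PSNR only looks at per-texel error, while SSIM also compares means, variances and covariance, which is why it tends to track perceived quality a bit better.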
Monday, June 14, 2010
Texture compression
So the last couple of days I've been playing around with texture compression again, just to see how far I could take it using wavelet compression.
I'm pretty pleased with the results. It's not very fast, but then again I haven't made any attempt at optimizing this yet.
I don't think I can get it as fast as DCT, but I can definitely get far more compression for the same quality.
Just to be able to measure my results more accurately I've created a tool which crawls through my page textures, compresses them, measures the difference between the compressed and the original texture, and determines the compressed size.
I should note that the third displayed bitmap in the tool is the exaggerated difference between the two textures.
The cool thing about wavelet compression is that it implicitly has its mip-maps encoded within it.
With virtual texturing you should always permanently cache your lowest resolution mip-maps, which means that if you use wavelet compression, you already have part of the compressed page in memory.
The idea is to use the mip maps that are already in memory to decompress the higher resolution mip maps, which helps decrease the amount of data that needs to be loaded from disk.
(Something I'm not yet doing right now)
Of course this will only be useful if the decompression can be made fast enough.
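The "mip-map for free" property is easiest to see with a Haar-style transform. The post doesn't say which wavelet is used, so Haar is an assumption here, chosen because its low-pass band is exactly a 2x2 box-filtered mip. The sketch below does one level on a square image stored as a list of rows; the function names are hypothetical.

```python
def haar_level(img):
    """One level of a 2D Haar-style transform (even side length required).
    Returns (ll, details): `ll` is the half-resolution image (2x2 averages),
    `details` holds the three coefficients per 2x2 block needed to undo it."""
    half = len(img) // 2
    ll = [[0.0] * half for _ in range(half)]
    details = [[None] * half for _ in range(half)]
    for y in range(half):
        for x in range(half):
            a, b = img[2 * y][2 * x], img[2 * y][2 * x + 1]
            c, d = img[2 * y + 1][2 * x], img[2 * y + 1][2 * x + 1]
            ll[y][x] = (a + b + c + d) / 4.0
            details[y][x] = ((a - b + c - d) / 4.0,   # horizontal detail
                             (a + b - c - d) / 4.0,   # vertical detail
                             (a - b - c + d) / 4.0)   # diagonal detail
    return ll, details


def inverse_haar_level(ll, details):
    """Rebuild the full-resolution image from the lower mip plus details."""
    half = len(ll)
    img = [[0.0] * (2 * half) for _ in range(2 * half)]
    for y in range(half):
        for x in range(half):
            s = ll[y][x]
            h, v, dg = details[y][x]
            img[2 * y][2 * x] = s + h + v + dg
            img[2 * y][2 * x + 1] = s - h + v - dg
            img[2 * y + 1][2 * x] = s + h - v - dg
            img[2 * y + 1][2 * x + 1] = s - h - v + dg
    return img
```

In this picture, `ll` is the mip that is already resident, so only `details` would have to come from disk to reach the next resolution level.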
PS.
Texture page size is 128 x 128, 'diffuse' includes alpha channel here.
PPS.
Noticed that the textures that compress the worst are simple greyish textures with alpha.
It seems that these textures are using pre-multiplied alpha and as such have each color multiplied with their alpha.
This causes lots of variability in the color channels, even in the areas which you can't really see, which hurts compression.
The best solution would be to pre-multiply the alpha at load time, and not pre-multiply it on disk. For now I'm setting all the alpha==0 pixels to the average color, and this seems to work fairly well.
PPPS.
After interpolating the average color with the actual pixel color using the alpha as the interpolation factor, the compressed size of the worst cases halved!
Obviously this shouldn't be done in production code, but it does show that non pre-multiplied alpha textures compress better than pre-multiplied alpha textures, which means that pre-multiplying the alpha should be done after decompression.
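The experiment above amounts to a per-pixel lerp towards the average color. A minimal sketch, assuming pixels are `(r, g, b, a)` tuples with alpha in the range 0 to 1 and that "average color" means the plain mean over all pixels of the page (the post doesn't specify this); the function names are hypothetical.

```python
def average_color(pixels):
    """Mean RGB over all pixels, ignoring alpha."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def flatten_hidden_color(pixels):
    """Blend each pixel towards the average color: lerp(average, color, alpha).
    Fully transparent pixels become the average, opaque pixels are untouched,
    and the alpha channel itself is left as it is."""
    avg = average_color(pixels)
    out = []
    for r, g, b, a in pixels:
        rgb = tuple(m + (c - m) * a for c, m in zip((r, g, b), avg))
        out.append((*rgb, a))
    return out
```

This removes color variation where alpha hides it anyway, which is what the compressor benefits from, but it also changes semi-transparent colors, so it is only useful as an experiment.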
Also, can't help but notice that difference images kind of look like edge detection filters, which might mean sharpening the image with a filter might increase the overall quality of the images (maybe).
I could definitely use some better down/up scaling code for the Co/Cg channels.
(I'm using YCoCg internally as a color representation)
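For readers unfamiliar with YCoCg, here is a sketch of the reversible integer variant, usually called YCoCg-R. Whether the compressor described here uses this exact variant is not stated, so treat it as an illustration of the color space rather than of the actual code.

```python
def rgb_to_ycocg_r(r, g, b):
    """Forward YCoCg-R transform on integer RGB; exactly invertible."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, co, cg


def ycocg_r_to_rgb(y, co, cg):
    """Inverse YCoCg-R transform; undoes the forward steps in reverse order."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b
```

Grey values end up with zero Co and Cg, so most of the detail lives in Y, which is why the chroma channels can be stored at a lower resolution and scaled back up.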
Wednesday, June 2, 2010
Some interesting links...
I've got 2 interesting links to share, both are useful for virtual texturing.
The first one, "Wavelet image compression - Iñigo Quilez", is a nice article about wavelet compression, which makes me think all the mip maps of a texture page could be compressed together.
This would mean that lower resolution mips would need to be built up dynamically from several pages, which is not necessarily a bad thing because a lower resolution mip's texture area might not be fully utilized anyway.
However, you don't want to require too many compressed pages to be in memory at the same time when you look at a very low resolution page, so there needs to be some sort of trade-off.
The very lowest mips should be pre-cached in memory anyway, so that you always have something to display when you can't load in your pages fast enough.
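The fallback this relies on can be sketched as a walk up the mip chain. Assumptions: pages are identified by a hypothetical `(x, y, mip)` key with mip 0 as the finest level, each coarser page covers a 2x2 block of finer pages, and `resident` is simply the set of keys currently in memory.

```python
def find_resident(resident, x, y, mip, coarsest_mip):
    """Return the finest resident page covering page (x, y) at `mip`,
    walking towards coarser mips; None if nothing is resident at all."""
    while mip <= coarsest_mip:
        if (x, y, mip) in resident:
            return (x, y, mip)
        # The parent page at the next coarser level covers a 2x2 block.
        x, y, mip = x >> 1, y >> 1, mip + 1
    return None
```

If the coarsest levels are always kept in `resident`, this lookup never returns `None`, so there is always something to display.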
The other link "Bump Mapping Unparametrized Surfaces on the GPU - Morten S. Mikkelsen" is very interesting.
Bryan McNett actually sums it up:
The guy who wrote this paper sits five feet away from me. The paper doesn't say so explicitly, but this finally makes normal maps obsolete for games. Implement the paper, and you can replace 2-channel normal maps with 1-channel height maps. You can also throw away all per-vertex tangent/binormal data. Cool stuff, if you ask me!
The primary flaw of the technique is that, since the derivative of the height map is taken with linear-filtering texture sampling hardware, when magnified it looks like "nearest neighbor" filtering. Fixing this requires adding bicubic-filtering to the hardware.
This can actually be used as a texture compression of sorts, by rendering the generated normal map into the normal page cache texture. The linear filtering won't be an issue there.
Of course this requires a per-pixel position and a per-pixel normal, something I would need anyway to render lighting into the page cache.
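To illustrate the height-map-to-normal step, here is a simplified texture-space version using finite differences. This is not the method from the paper, which perturbs the surface normal using screen-space derivatives and so needs no tangent frame; the sketch only shows how a 1-channel height page could be turned into normals when a page is rendered into the cache. The function name and the `scale` parameter are hypothetical.

```python
import math


def normal_from_height(height, x, y, scale=1.0):
    """Tangent-space normal at texel (x, y) of a height map (list of rows),
    using central differences with clamping at the page edges."""
    h, w = len(height), len(height[0])
    x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
    y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
    # Divide by the actual texel distance so edge texels stay correct.
    dhdx = scale * (height[y][x1] - height[y][x0]) / max(x1 - x0, 1)
    dhdy = scale * (height[y1][x] - height[y0][x]) / max(y1 - y0, 1)
    n = (-dhdx, -dhdy, 1.0)
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)
```

Because the result is written into the page cache at the cache's own resolution, magnification artifacts from taking the derivative at sample time would not apply in the same way.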


