JavaKazRace - Playable Java racing game demo
PSEmu Pro GPU plug-in
DOSX Utils
SHLight 2004

compression

Doing the MegaTexture thing

Davide's picture

I've finally set up a nice decompression thread that takes care of (almost) everything without any hiccups on the rendering side.

Right now I'm only decompressing textures in full size.
When a texture is loaded, the average color (or 1x1 mip-map) is loaded right away (the 1x1 mip is stored in the file header, so it needs no decompression). Then a task for decompressing the full-size texture is created and the decompression starts right away.
Right now it's either 1x1 or full-size, next I'll have to pick actual intermediate resolutions depending on the image being rendered.
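A minimal sketch of the "1x1 now, full size later" flow described above. It is not the actual engine code: `TexHeader`, `StreamedTexture`, `beginLoad`, `pollUpgrade` and the fake `decodeFullSize` are hypothetical names, and `std::async` stands in for the dedicated decompression thread.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <future>
#include <vector>

// Hypothetical file header: the 1x1 mip (average color) is stored uncompressed.
struct TexHeader {
    std::uint32_t width, height;
    std::array<std::uint8_t, 4> averageRGBA;
};

struct StreamedTexture {
    std::uint32_t width = 1, height = 1;              // resolution usable right now
    std::uint32_t fullWidth = 1, fullHeight = 1;      // resolution being decoded
    std::vector<std::uint8_t> rgba;                   // pixels usable right now
    std::future<std::vector<std::uint8_t>> pending;   // full-size decode in flight
};

// Stand-in for the real decoder (assumption): just floods the image with the average color.
std::vector<std::uint8_t> decodeFullSize(TexHeader h) {
    std::vector<std::uint8_t> out(std::size_t(h.width) * h.height * 4);
    for (std::size_t i = 0; i < out.size(); ++i) out[i] = h.averageRGBA[i % 4];
    return out;
}

// The texture is drawable as soon as this returns; the full-size decode runs elsewhere.
StreamedTexture beginLoad(const TexHeader& h) {
    StreamedTexture t;
    t.fullWidth = h.width;
    t.fullHeight = h.height;
    t.rgba.assign(h.averageRGBA.begin(), h.averageRGBA.end());
    t.pending = std::async(std::launch::async, decodeFullSize, h);
    return t;
}

// Called from the render side once per frame; it never waits on the decoder.
// Returns true on the frame where the full-size pixels are swapped in.
bool pollUpgrade(StreamedTexture& t) {
    if (!t.pending.valid()) return false;
    if (t.pending.wait_for(std::chrono::seconds(0)) != std::future_status::ready) return false;
    t.rgba = t.pending.get();
    t.width = t.fullWidth;
    t.height = t.fullHeight;
    return true;
}
```

The render loop would call `pollUpgrade` on each streamed texture and keep drawing with whatever resolution is resident.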

One nice thing that I found out is that DX10 is thread-safe by default (no need to set special flags of the kind that make one fear degraded performance). So, the actual texture creation and decompression all happen in a thread separate from the main loop.
As I mentioned before, a texture decompressor will unpack data in slices using OpenMP. This allows me to use whatever cores I have to decompress one texture at a time as fast as possible (more cache coherent than trying to decompress different textures at once).
Currently, during load I can see all 4 cores being used to the maximum (more or less).. it's a nice feeling 8)

To minimize initialization times, I've also implemented a simple Direct3D texture object cache.
Every time a texture is released by the engine, it doesn't get released in Direct3D immediately. Instead it's added to a free list (unlike the Wikipedia article, I'm actually not using linked lists) where it can be picked up again if there is another request for a texture with the same characteristics.. otherwise it will be released from D3D after going unused for a few frames.
Cache aside, I'm mostly counting on the multi-threading to cover up for all the potential stalls incurred by resource management in D3D.
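A rough sketch of such a texture object cache, under stated assumptions: `TexDesc`, `TextureCache` and the integer handles are hypothetical stand-ins for the real Direct3D objects, and the free list is a plain `std::vector` rather than a linked list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of the "characteristics" that must match for reuse.
struct TexDesc {
    std::uint32_t width, height, format;
    bool operator==(const TexDesc& o) const {
        return width == o.width && height == o.height && format == o.format;
    }
};

class TextureCache {
public:
    explicit TextureCache(std::uint32_t maxIdleFrames) : maxIdle_(maxIdleFrames) {}

    // Reuse an idle texture with the same description, or "create" a new one.
    int acquire(const TexDesc& d) {
        for (std::size_t i = 0; i < free_.size(); ++i) {
            if (free_[i].desc == d) {
                int handle = free_[i].handle;
                free_[i] = free_.back();   // swap-and-pop: order does not matter
                free_.pop_back();
                return handle;
            }
        }
        return nextHandle_++;              // stands in for the actual D3D creation call
    }

    // The engine is done with the texture: park it instead of destroying it.
    void release(int handle, const TexDesc& d, std::uint32_t frame) {
        free_.push_back(Entry{d, handle, frame});
    }

    // Destroy textures that sat idle for more than maxIdle_ frames; returns how many.
    std::size_t endFrame(std::uint32_t frame) {
        std::size_t destroyed = 0;
        for (std::size_t i = 0; i < free_.size();) {
            if (frame - free_[i].releasedFrame > maxIdle_) {
                free_[i] = free_.back();   // the real D3D Release() would happen here
                free_.pop_back();
                ++destroyed;
            } else {
                ++i;
            }
        }
        return destroyed;
    }

private:
    struct Entry { TexDesc desc; int handle; std::uint32_t releasedFrame; };
    std::vector<Entry> free_;
    std::uint32_t maxIdle_;
    int nextHandle_ = 0;
};
```

A linear search is enough for a sketch; with many idle textures a lookup keyed on the description would scale better.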

The next big step will be to continuously "resize" textures depending on the needs of the frame being rendered.
This should happen lazily to avoid too much work by the decompression system.
The reason why one would want to resize textures is to economize on memory, but that doesn't have to happen at every frame for every texture. It makes more sense to keep a texture at 1024x1024 even if the next frame only needs 512x512, rather than putting the decompressor to work. The 1024x1024 may be needed again soon, and the rasterizer is going to pick the right mips regardless of the maximum resolution.
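One possible way to express that laziness, as a sketch only: grow as soon as a frame needs more resolution, but shrink only after the texture has been oversized for a number of consecutive frames. The `LazyResizer` name and the `shrinkDelay` threshold are assumptions, not something taken from the engine.

```cpp
#include <cassert>

// Tracks the resident size (edge length in texels) of one texture.
class LazyResizer {
public:
    explicit LazyResizer(int shrinkDelay) : shrinkDelay_(shrinkDelay) {}

    // 'needed' is the size the current frame asks for.
    // Returns the size that should be resident after this frame.
    int update(int needed) {
        if (needed > resident_) {
            resident_ = needed;            // grow right away: quality is missing now
            framesOversized_ = 0;
        } else if (needed < resident_) {
            // Shrink only after being oversized for shrinkDelay_ frames in a row.
            if (++framesOversized_ >= shrinkDelay_) {
                resident_ = needed;
                framesOversized_ = 0;
            }
        } else {
            framesOversized_ = 0;          // exact fit: reset the countdown
        }
        return resident_;
    }

private:
    int resident_ = 1;                     // start from the 1x1 placeholder
    int framesOversized_ = 0;
    int shrinkDelay_;
};
```

With `shrinkDelay` set to 3, a texture that is needed at 1024 for one frame and at 512 afterwards stays at 1024 for two more frames before dropping to 512.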

In general I like the idea of progressive quality. I like the idea of drawing geometry with whatever texture resolution I have. This way I can theoretically keep a consistent frame rate and the quality of the textures will change depending on how fast the decompression can happen (which depends on the CPU/GPGPU power).

Next next I'll have to worry about geometry.. that's trickier especially when using a lot of complex materials.. but hopefully some coworkers can help there ;)

wooooo
zzzzzzzzzzz

Productivity, generality and OpenMP


Flash news.. I'm very busy at work 8)

I could work less, but I want to produce something good. I like the idea of taking a more general approach to problems and making something bigger out of it.

One of my current goals is to develop in a scalable manner. In order to do that, things need to be rethought in a more generic form.
For example one can have a triangular mesh, or could develop a system to do a remeshing to turn the original geometry into a semi-regular data structure that can easily be compressed and streamed progressively.

I think that scalability is really a key to the much needed productivity improvement in game development.

At work we talk every day about how to go about some solution, and there the key question is always: "can we use a scalable and generic solution ?".
This is usually about development pipeline.. not about actual code. The idea of code reusability is less straightforward. I actually aim more at providing simple implementations, to modularize code so that it can easily be grabbed without too many dependencies.. rather than trying to fit all in a supposed grand scheme of hierarchy of objects and whatnot.

In the end the harder problems are really those about how to organize data and how to transform those data across the development pipeline.

On the side, I also used OpenMP for the first time. After a few odd results, I managed to parallelize a loop that uncompresses images in that progressive-JPEG-like format that I've been working on.
Like for JPEG, the image is processed by 8x8 sub-blocks. Using OpenMP pragmas I set the parallel section to happen on rows of blocks.
Parallelizing every row of blocks makes sense, but I could probably try to do multiple rows at once to see if I can reduce the overhead of context switching and potential cache thrashing. Parallelizing every block instead turned out to be overkill.
As a rule of thumb, if I think that I could wrap some code into a function with practically no overhead, then perhaps I can make a parallel section out of it. In fact, I think that OpenMP eventually grabs that section and makes a function out of it anyway...
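A small sketch of parallelizing over rows of 8x8 blocks with an OpenMP pragma. `decodeBlock` is a hypothetical stand-in for the real per-block decoder, and the chunk size of 4 rows is just an example of handing out several rows at once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int kBlock = 8;   // block edge, as in JPEG

// Stand-in for the real block decoder (assumption): fills the 8x8 block
// with a value derived from its block coordinates.
void decodeBlock(std::vector<std::uint8_t>& img, int width, int bx, int by) {
    for (int y = 0; y < kBlock; ++y)
        for (int x = 0; x < kBlock; ++x)
            img[std::size_t(by * kBlock + y) * width + (bx * kBlock + x)] =
                static_cast<std::uint8_t>(bx + by);
}

// Width and height are assumed to be multiples of 8.
void decodeImage(std::vector<std::uint8_t>& img, int width, int height) {
    const int blocksX = width / kBlock;
    const int blocksY = height / kBlock;

    // One unit of parallel work is a whole row of blocks.
    // schedule(static, 4) hands each thread chunks of 4 consecutive rows.
    #pragma omp parallel for schedule(static, 4)
    for (int by = 0; by < blocksY; ++by)
        for (int bx = 0; bx < blocksX; ++bx)
            decodeBlock(img, width, bx, by);
}
```

Each row of blocks writes to its own region of the image, so no locking is needed. If the compiler is not run with OpenMP enabled, the pragma is ignored and the loop simply runs serially.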

Aside from some early decoding artifacts due to my inability to share some variables from outside the "parallel for" (see the example at the bottom here), using OpenMP was really easy. Definitely much simpler than manually creating and reusing threads, also less involved than using Threading Building Blocks because one doesn't need to create functor objects and also because OpenMP is readily available with modern compilers with minimal effort.
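The exact variable-sharing problem isn't spelled out here, so the following only illustrates one common pitfall of that kind: a scratch variable declared outside a "parallel for" is shared by all threads by default, while one declared inside the loop body is private to each iteration. The `rowSums` function is a made-up example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the sum of each row of a width x height grid.
std::vector<int> rowSums(const std::vector<int>& data, int width, int height) {
    std::vector<int> sums(height);

    // int acc;   // declared here, 'acc' would be shared by all threads: a data race

    #pragma omp parallel for
    for (int y = 0; y < height; ++y) {
        int acc = 0;                       // declared inside: private to this iteration
        for (int x = 0; x < width; ++x)
            acc += data[std::size_t(y) * width + x];
        sums[y] = acc;                     // each iteration writes a distinct element
    }
    return sums;
}
```

When the variable has to live outside the loop, OpenMP's `private` and `firstprivate` clauses serve the same purpose; note that a `private` copy starts uninitialized.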

cool
zzzzzzzzzz

About Id's Megatextures and modularity


I found a paper on Intel's site. It's from someone at Id Software.
I think it's pretty close to what's behind the cool MegaTexture name.

..basically a very optimized progressive-JPEG-like (edit: actually streams but it's not progressive in the "progressive JPEG" sense) streaming and decompression in real-time. Something very close to what I'm doing recently (as a task in a project, not as a full blown research).

For my implementation I decided to put in some extra effort and make it easy to use outside the main application. This means that I'm going to hide all my "nice" support classes and types and only expose the bare minimum for anyone to use the proposed functionality.

I set the goal of making a DLL out of this progressive CODEC, with more DLLs to follow in the future.
The reason for a DLL rather than a LIB is that static linking can be pretty tricky. For example I have a global new and delete overload because I normally need 16- or 64-byte aligned memory.
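To show why such an overload is awkward to ship in a static library, here is a sketch of a program-wide replacement of `operator new` and `operator delete` that returns 16-byte aligned memory. The helper names and the over-allocation scheme are assumptions; the real library may do this differently (for example with a platform-specific aligned allocator).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

constexpr std::size_t kAlign = 16;   // must be a power of two

// Over-allocate, round the address up to kAlign, and stash the pointer that
// malloc returned just below the aligned block so it can be freed later.
void* alignedAlloc(std::size_t size) {
    void* raw = std::malloc(size + kAlign + sizeof(void*));
    if (!raw) throw std::bad_alloc();
    std::uintptr_t p = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    p = (p + kAlign - 1) & ~(static_cast<std::uintptr_t>(kAlign) - 1);
    reinterpret_cast<void**>(p)[-1] = raw;
    return reinterpret_cast<void*>(p);
}

void alignedFree(void* ptr) noexcept {
    if (ptr) std::free(reinterpret_cast<void**>(ptr)[-1]);
}

// Replacing these affects every allocation in the final executable,
// including code that merely links against the library.
void* operator new(std::size_t size) { return alignedAlloc(size); }
void operator delete(void* ptr) noexcept { alignedFree(ptr); }
void operator delete(void* ptr, std::size_t) noexcept { alignedFree(ptr); }
```

Because replacement allocation functions are global to the program, linking this statically forces the behavior onto every user of the library, which is the kind of side effect that keeping the code behind a DLL boundary is meant to avoid.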
It's nice to be able to use a lot of my common library, but it wouldn't be nice to force headers and symbols onto other potential users.

Thinking in a DLL way also makes it easy to write at least one clean class that is easy to understand and document. The class exported in the DLL header is a bare-bones interface with a pointer to an actual hidden implementation class (aka PIMPL).
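A compact sketch of that layout, shown as one file for brevity. `ProgressiveDecoder`, `feed`, `bytesReceived` and the `CODEC_API` macro are hypothetical names, not the actual interface of the CODEC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// ---- what would go in the public DLL header ----
#if defined(_WIN32)
#  define CODEC_API __declspec(dllexport)   // a client build would use dllimport
#else
#  define CODEC_API
#endif

class CODEC_API ProgressiveDecoder {
public:
    ProgressiveDecoder();
    ~ProgressiveDecoder();
    ProgressiveDecoder(const ProgressiveDecoder&) = delete;
    ProgressiveDecoder& operator=(const ProgressiveDecoder&) = delete;

    // Only plain types cross the DLL boundary.
    void feed(const std::uint8_t* data, std::size_t size);
    std::size_t bytesReceived() const;

private:
    class Impl;      // defined only inside the DLL
    Impl* impl_;     // the one data member visible to clients
};

// ---- what would stay inside the DLL's .cpp ----
class ProgressiveDecoder::Impl {
public:
    std::vector<std::uint8_t> stream;   // free to use any internal support types
};

ProgressiveDecoder::ProgressiveDecoder() : impl_(new Impl) {}
ProgressiveDecoder::~ProgressiveDecoder() { delete impl_; }

void ProgressiveDecoder::feed(const std::uint8_t* data, std::size_t size) {
    impl_->stream.insert(impl_->stream.end(), data, data + size);
}

std::size_t ProgressiveDecoder::bytesReceived() const { return impl_->stream.size(); }
```

The header half needs nothing beyond `<cstddef>` and `<cstdint>`; the `<vector>` include and every internal type belong to the implementation half, so they can change without touching client code.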

...basically I'm talking about modularity ! ..back from the dead.. saving the day where the OOP abuse complicates APIs.

cool

Merry New Year (2008) and Happy Christmas


I'm alive and well !
Only busy with the non-virtual side of the World. I had a friend from Italy staying in my apt for a few days, then he left and two more friends came... and the celebrations, the parties, etc etc.
I've also got to write some good code before leaving the office. My first vertex animation compression. I used a 4-tap Daubechies wavelet. Haar seemed pretty worthless.. it turns out that Haar doesn't really do a good job at shifting larger values towards the top of the array that is being transformed.
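For reference, a sketch of one level of the forward 4-tap Daubechies (D4) transform using the standard coefficients. The periodic wrap-around at the end of the array and the `daub4Forward` name are choices made for this example, not necessarily what the animation compressor does.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One level of the forward D4 transform. The input length is assumed to be
// even and at least 4. Smooth (low-pass) coefficients land in the first half
// of the output, detail (high-pass) coefficients in the second half.
std::vector<double> daub4Forward(const std::vector<double>& in) {
    const std::size_t n = in.size();
    const double s3 = std::sqrt(3.0);
    const double d = 4.0 * std::sqrt(2.0);
    const double h0 = (1.0 + s3) / d, h1 = (3.0 + s3) / d;
    const double h2 = (3.0 - s3) / d, h3 = (1.0 - s3) / d;

    std::vector<double> out(n);
    const std::size_t half = n / 2;
    for (std::size_t i = 0; i < half; ++i) {
        const double a = in[2 * i];
        const double b = in[2 * i + 1];
        const double c = in[(2 * i + 2) % n];   // wraps around at the end
        const double e = in[(2 * i + 3) % n];
        out[i]        = h0 * a + h1 * b + h2 * c + h3 * e;   // smooth
        out[half + i] = h3 * a - h2 * b + h1 * c - h0 * e;   // detail
    }
    return out;
}
```

D4 has two vanishing moments, so away from the wrap-around both constant and linear runs produce zero detail coefficients, while Haar only cancels constants. That may be part of why it packs the energy of smooth animation curves into the front of the array better. Applying the function again to the first half gives the next level.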

Currently I'm compressing independently x, y and z values. It works nicely. I'm also using some sort of bitplane compression. But not in a hierarchical fashion like it happens with zero-trees.
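A minimal illustration of flat (non-hierarchical) bitplane coding, assuming the wavelet coefficients have already been quantized to unsigned magnitudes, with signs handled separately. The function names and the one-byte-per-bit storage are simplifications for the example; a real coder would pack and entropy-code the planes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Plane = std::vector<std::uint8_t>;   // one 0/1 entry per coefficient

// Split values into bitplanes, most significant plane first.
std::vector<Plane> toBitplanes(const std::vector<std::uint16_t>& values, int numBits) {
    std::vector<Plane> planes(numBits, Plane(values.size()));
    for (int p = 0; p < numBits; ++p)
        for (std::size_t i = 0; i < values.size(); ++i)
            planes[p][i] = (values[i] >> (numBits - 1 - p)) & 1u;
    return planes;
}

// Rebuild values from the first 'planesToUse' planes; missing low bits read as zero.
std::vector<std::uint16_t> fromBitplanes(const std::vector<Plane>& planes,
                                         int numBits, int planesToUse) {
    const std::size_t count = planes.empty() ? 0 : planes[0].size();
    std::vector<std::uint16_t> values(count, 0);
    for (int p = 0; p < planesToUse; ++p)
        for (std::size_t i = 0; i < count; ++i)
            values[i] = static_cast<std::uint16_t>(
                values[i] | (planes[p][i] << (numBits - 1 - p)));
    return values;
}
```

Sending the planes most significant first gives a progressive stream: the decoder can stop after any plane and still get a coarser version of every coefficient.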

Compression algorithms are different from 3D graphics, very interesting. It's a shame that most 3D programmers don't give much thought to it.
Actually I think it's in general a shame that there are "3D programmers". I sometimes call myself a 3D programmer for lack of a better explanation, but it's really bullshit. Generally it's only the young punks that think that programming 3D graphics is "the thing".
I'm in charge of a (small) team that researches graphics.. but that's really only just the surface. I believe in looking far and wide, rather than getting stuck on APIs, techniques, tricks and tips like those that appear in those recipe books that come out every month (I know I've said that a few times already 8).

I took many pics in the recent days but now comes the time to clean up for this year.. memories are a double edged sword and technology makes it very hard to cut with the past. People can very easily keep in touch, share media evoking old times.. even my robotic dog manages to put me in difficult situations as it sometimes calls the wrong person at the wrong time !!!

geez
Enjoy your holidays 8)

id Software alive and feeding the underdogs (Mac OS X)


I guess I'm a fan of id Software. Steve Jobs used to snub games, but now he invites John Carmack to the Worldwide Developers Conference 2007 ! It's nice to know that id Software keeps supporting the underdogs (the Mac in this case, but also OpenGL from a while ago).
