Programming Title
Michael

April 29, 2011
Michael Schoell

Multithreading is like a hydra: slay one head and out pop two more. I have spent the past week expanding my code so that my DirectX 9 device can work on Windows XP, because what is the use of writing DirectX 9 code if all you support is Vista and Windows 7? However, XP does not support some of the synchronization functions I used, namely those that relate to condition variables.

What I ended up with is a system that checks on initialization whether it is running on XP or a newer operating system. If it is XP, a replacement system is used; if it is Windows Vista or 7, the original system is used. The choice is made through function pointers, so no repeated checks are needed at run time.
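A minimal sketch of that one-time selection, in portable C++ so it can be run anywhere. Every name here (`OsFamily`, `initSync`, `waitForSignal`, and the two stand-in wait functions) is hypothetical and not taken from the engine. On real Windows, one way to make the choice would be to look up the condition variable functions at run time (for example `InitializeConditionVariable` in kernel32.dll via `GetProcAddress`) rather than passing in an enum; that detail is an assumption and is not shown.

```cpp
#include <cassert>
#include <string>

// Hypothetical names throughout; this is not the engine's actual code.
enum class OsFamily { WindowsXP, VistaOrLater };

// Two interchangeable implementations of the same operation.
// They only return a label here so the dispatch itself is easy to observe.
static const char* waitWithPolling()           { return "polling wait (XP path)"; }
static const char* waitWithConditionVariable() { return "condition-variable wait (Vista/7 path)"; }

// The function pointer the rest of the program calls through.
using WaitFn = const char* (*)();
static WaitFn g_wait = nullptr;

// Called once at startup. After this, callers never re-check the OS.
void initSync(OsFamily os) {
    g_wait = (os == OsFamily::WindowsXP) ? &waitWithPolling
                                         : &waitWithConditionVariable;
}

// Hot-path entry point: a single indirect call, no branching on OS version.
const char* waitForSignal() {
    assert(g_wait != nullptr && "initSync must be called first");
    return g_wait();
}
```

The cost of the OS check is paid once in `initSync`; every later call is just an indirect call through `g_wait`.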

It took a number of tests to confirm the new system, since it would not deadlock on my computer (in XP mode) but would on an older computer. More small issues may crop up as my engine becomes more advanced and the CPU and GPU load changes, but for now the hydra is slain.

However, I am not entirely happy with the way I accomplished the multithreading for XP. While in tests it seems to run about the same in XP and Windows 7 mode, the XP version spin-locks in numerous places, waiting for variables to change.
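To illustrate what spinning on a variable looks like, here is a small sketch using standard C++ atomics instead of Windows primitives. The helper names `spinUntilSet` and `runSpinDemo` are made up for this example, and the engine's actual XP path may differ.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: busy-wait until `flag` becomes true.
// yield() gives other threads a chance to run, but the loop still
// consumes CPU time while waiting, unlike a condition variable sleep.
void spinUntilSet(const std::atomic<bool>& flag) {
    while (!flag.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
}

// Demo: a second thread writes a value, then sets the flag.
// Returns the value the waiting thread observed.
int runSpinDemo() {
    std::atomic<bool> ready(false);
    int payload = 0;

    std::thread producer([&] {
        payload = 42;                                   // written before the flag
        ready.store(true, std::memory_order_release);   // publish
    });

    spinUntilSet(ready);
    int seen = payload;   // safe: the acquire load saw the release store
    producer.join();
    return seen;
}
```

The waiting thread stays runnable the whole time, which is why this pattern is cheap to write but wasteful when the wait is long.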

Other work has been done on deferred shading. Currently, I use it only for world lighting and have set up only the diffuse and normal textures. Only one world light is possible right now, though fixing that is among my next tasks. Point lights will be added along with flexible world lighting options. More on this once the work has actually been done.

April 11, 2011
Michael Schoell

Things have been slow for the past couple of weeks due to graduating from Full Sail University and moving back home. I began playing Dragon Age 2 and Anno 1404, which kept me well distracted for many hours.

However, I have greatly sped up my DirectX project. With 10,000 cubes being rendered without texturing, I got about 60 FPS. With texturing and bilinear filtering, 40 FPS. I was okay with this at first; after all, there are a lot of cubes being textured, so it stands to reason that it may take time. However, the FPS did not change when I turned the camera around (which leaves the fragment shader unused, since no pixels end up on screen).

So I examined the issue and found DirectX to be issuing around 13 SetSamplerState warnings per object. If that holds for 10,000 objects, that is 130,000 SetSamplerState warnings. To me, that meant there were probably 130,000 if-statements happening per frame behind the scenes. I needed to fix that.

My first task was to research the problem, to see if I was doing anything wrong. As it turns out, ID3DXEffect uses ID3DXEffectStateManager to stop redundant commands from making it to the video card, since sending one is, I was assured, more expensive than a check made on the CPU side. However, my system uses a hierarchy to sort objects by their states, making such if-statements mostly redundant. Where the default system checks on every BeginPass for a shader, my system only needs to check on each change in state (usually a change in texture). In the case of my 10,000 cubes, rendered with 3 different textures, that almost never happens.
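The effect of sorting by state can be shown with a small sketch. It swaps the hierarchy described above for a plain sort by texture, which is a simplification, and the `DrawItem` struct and both function names are invented for the example. It counts how many times a texture would need to be bound when each item is only compared against the currently bound texture.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical draw record; a real renderer would carry far more state.
struct DrawItem {
    int textureId;
    int meshId;
};

// Sorts a copy of the items by texture, then walks them once and counts
// how often the bound texture actually has to change.
std::size_t countTextureBinds(std::vector<DrawItem> items) {
    std::sort(items.begin(), items.end(),
              [](const DrawItem& a, const DrawItem& b) {
                  return a.textureId < b.textureId;
              });

    std::size_t binds = 0;
    int bound = -1;  // sentinel; assumes real texture ids are non-negative
    for (const DrawItem& item : items) {
        if (item.textureId != bound) {
            bound = item.textureId;  // a real renderer would set the texture on the device here
            ++binds;
        }
        // the draw call for item.meshId would go here
    }
    return binds;
}

// Builds `count` cubes that cycle through `textureCount` textures.
std::vector<DrawItem> makeCubes(int count, int textureCount) {
    std::vector<DrawItem> items;
    items.reserve(static_cast<std::size_t>(count));
    for (int i = 0; i < count; ++i) {
        items.push_back({i % textureCount, 0});
    }
    return items;
}
```

For 10,000 cubes cycling through 3 textures, `countTextureBinds(makeCubes(10000, 3))` yields 3 texture changes instead of one state check per object.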

So I threw out the ID3DXEffect structure and made direct calls to the device to change shader states. In no time I had my project back up and running, free of DirectX bulk, and to my amazement I had not only gotten back the 20 FPS I had lost from texturing, but an extra 40 FPS on top of that. I double-checked everything I did, even used a secondary program to monitor my FPS, and confirmed that nothing seemed broken. The ID3DXEffect structure seemed to have been slowing me down the entire time.

My work is not over. The ID3DXEffect structure simplified many things with shaders, which likely accounts for its slowdown. Further work will go into expanding my code and making it more flexible and robust. Plus, I still have to make it support Windows XP with its synchronization functions.

Figure: 10,000 Textured Cubes, Bilinear filtering (120,000 Triangles)
Figure: 10,000 Textured Spheres, Bilinear filtering (12,270,000 Triangles)

March 25, 2011
Michael Schoell

My final project is complete and can be downloaded here. I hope to get a full postmortem up for it soon. More information on The Passage can be found on its page, which will be updated soon.

March 14, 2011
Michael Schoell

It has been a slow week for my multi-threaded renderer due to my final project (The Passage) here at Full Sail University. A number of small issues have cropped up as I have expanded the core system to manage memory and render more cubes.

Most of the issues were synchronization errors with the worker thread that owns the DirectX device. The issues ranged from deadlocks to memory corruption, but were easy to solve with some careful examination of the code. After making sure I cleared the command buffer after every DirectX release call, things fell into place.

However, another deadlock occurred because of how long the main thread took to clean up memory. The worker thread would fall asleep while the main thread cleaned up, and never wake up to learn that it was supposed to shut down. Another simple fix: when telling the thread to shut down, make sure to also tell it to wake up.
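The fix can be sketched with standard C++ threading primitives in place of the Windows ones. The `Worker` class and `shutdownWakesWorker` helper below are hypothetical. The key point is that `requestShutdown` both sets the flag and notifies the condition variable, so a sleeping worker is guaranteed to see the request.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical worker that sleeps until it is told to shut down.
class Worker {
public:
    Worker() : shutdown_(false), exited_(false), thread_(&Worker::run, this) {}

    void requestShutdown() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            shutdown_ = true;      // set the flag under the lock...
        }
        wake_.notify_one();        // ...and wake the worker; without this it may sleep forever
    }

    void join() { thread_.join(); }

    // Only meaningful after join() has returned.
    bool exited() const { return exited_; }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        // The predicate form re-checks shutdown_ on every wakeup, so a
        // request made before the worker went to sleep is not lost.
        wake_.wait(lock, [this] { return shutdown_; });
        exited_ = true;
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    bool shutdown_;
    bool exited_;
    std::thread thread_;  // declared last so the other members exist before the thread starts
};

// Returns true if the worker left its loop after being asked to shut down.
bool shutdownWakesWorker() {
    Worker worker;
    worker.requestShutdown();
    worker.join();
    return worker.exited();
}
```

If `requestShutdown` only set the flag, `join()` could block forever whenever the worker was already asleep, which matches the deadlock described above.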

Multi-threading has been fun. It is on a different level than a single thread: much more complicated and sensitive to mistakes. I have few tools to tell me where each thread was when deadlocked, making the process daunting, though generally simple.

As a final note, I did a larger benchmark between single-threaded rendering and multi-threaded rendering. My single-threaded renderer would bog down at 2,500 cubes at 50 FPS, while my multi-threaded renderer runs at 60 FPS with 10,000 cubes (Figure 1). While hardly an official benchmark, it does show its potential, especially as I expand into DirectX 11.

Ten Thousand Strong
Figure 1. 10,000 Cubes Rendered.

February 24, 2011
Michael Schoell

As reported a few days ago, I have been working on multi-threading the DirectX 9 device without the use of the D3DCREATE_MULTITHREADED flag. My work has paid off in the form of rendering cubes.

To accomplish this task quickly, I reused a lot of my old code from DarkForge. There is no point in writing a lot of efficient code only to end up with 1 FPS because my multithreading is done poorly. So I cut corners to get the task done quickly and get an idea of the speed difference between single threading and multi-threading.

Before I could get anything tested, I had two issues to deal with. First, in early testing I found that my two threads were out of sync. For a program that syncs at the end of every frame, that was absurd. This proved to be a simple issue: one thread would do something at the end of the sync process that the other would not, causing it to fail to sync on the next frame.
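For reference, an end-of-frame sync between two threads can be sketched as a reusable barrier. This uses standard C++ rather than the Windows primitives the engine relies on, and `FrameBarrier`, `arriveAndWait`, and `maxFrameDrift` are hypothetical names. Both threads run exactly the same arrive-and-wait step, which avoids the kind of asymmetry described above.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical reusable barrier: every party blocks until all have arrived.
class FrameBarrier {
public:
    explicit FrameBarrier(int parties)
        : parties_(parties), waiting_(0), generation_(0) {}

    void arriveAndWait() {
        std::unique_lock<std::mutex> lock(mutex_);
        const unsigned long gen = generation_;
        if (++waiting_ == parties_) {
            waiting_ = 0;        // last arrival resets the barrier for the next frame
            ++generation_;
            cv_.notify_all();
        } else {
            // Waiting on the generation counter makes the barrier safe to reuse.
            cv_.wait(lock, [&] { return generation_ != gen; });
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int parties_;
    int waiting_;
    unsigned long generation_;
};

// Runs `frames` frames on two threads and returns the largest lead the
// render thread ever had over the main thread, measured right after a sync.
int maxFrameDrift(int frames) {
    FrameBarrier barrier(2);
    std::atomic<int> renderFrame(0);
    int maxDrift = 0;

    std::thread render([&] {
        for (int i = 0; i < frames; ++i) {
            ++renderFrame;            // "render" a frame
            barrier.arriveAndWait();  // sync at end of frame
        }
    });

    for (int mainFrame = 1; mainFrame <= frames; ++mainFrame) {
        barrier.arriveAndWait();      // same sync step as the render thread
        const int drift = renderFrame.load() - mainFrame;
        if (drift > maxDrift) maxDrift = drift;
    }

    render.join();
    return maxDrift;
}
```

Because the render thread cannot pass the next barrier until the main thread arrives, it can be at most one frame ahead when sampled, so the two threads never drift further apart.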

My other issue is more severe and is currently handled in a less than decent way. Right now there are not many sub-systems to handle data such as vertex buffers, vertex declarations, and shaders. Because of this, the same class that handles the device also manages my DirectX objects. Since DirectX memory must also be cleaned up on the same thread, I just clean it up when the thread is shutting down.

However, this is an issue, since my code is being designed to support another graphics API such as OpenGL or DirectX 11 in the near future. My sub-systems need to be able to shut down without the thread necessarily shutting down. While I do not expect this to be particularly difficult, it is something I must keep in mind for future code.

For testing, I put together a simple scene of 100 cubes laid out in front of the camera, set up in exactly the same way in the single-threaded project and the multi-threaded project. The results were more than I could have hoped for. With one thread I got 1,600 frames per second on my computer (AMD 9950 Quad, Nvidia GTX 465, 4 GB RAM), and with two threads I got 2,800 frames per second.

From here I move on to expanding the code base to support what DarkForge currently offers, and to testing the frames-per-second gains on more advanced scenes. Another issue that must be overcome is that the project currently does not work on XP computers because of some Windows functions I am using. At the time, I did not realize that I was using functions available only on Vista and later, and this will need fixing. The importance of DirectX 9 is lost when you cannot support Windows XP, after all.

Site Development and Design by <CS>

Graphic Design by Nathan Schoell