I have been doing some research into voxel rendering techniques lately. While the whole approach has fallen by the wayside a bit with the advent of 3D accelerators, I still find it to be a uniquely elegant approach to modeling and rendering objects in software. As part of my research, I thought it would be interesting to look at how some of these techniques were used in commercial games back in the 90s.
One of the games that used voxels extensively was Blade Runner, a game I have previously written about. Westwood Studios, the game’s developer, was doing a lot of work with voxels back then, Command & Conquer – Tiberian Sun being another prominent example.
As a first step, I wanted to see if I could find the voxel model data in the game’s resource files. Like many other Westwood games, Blade Runner uses an in-house format (MIX), which acts as a container for multiple resource files. Looking at the executable at run-time, I saw that the game always first attempts to load a file from disk, and only if it cannot find it there does it get the file from one of the MIX files. I’m not sure yet how it decides which MIX file to load the data from; that may just be hard-coded in the executable.
As mentioned above, the MIX format was used by most Westwood games of that period, and over the years it has been documented extensively by fans and modders of the games. In essence, each MIX file starts with a header that contains an entry for each file in the container. Each file entry consists of a file ID, the offset in the MIX, and the file size. The ID can be computed as a hash of the filename using the following algorithm:
    // File is up to 8 characters + 3 characters for the extension
    function computeHash(filename)
        filename = toUppercase(filename)
        filename = pad filename with 0x0 so length is a multiple of 4
        hash = 0
        for each group of 4 bytes
            hash = rotateLeft(hash) + group
        loop
        return hash
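For anyone who wants to try this out, here is a rough Python sketch of the same idea. The pseudocode leaves a few details open, so I am making assumptions here: the rotation is by a single bit, each group of 4 bytes is read as a little-endian 32-bit integer, and the sum wraps around at 32 bits. The names `rotate_left` and `compute_hash` are just mine.

```python
def rotate_left(value: int, bits: int = 1) -> int:
    """Rotate a 32-bit unsigned integer left by `bits` (assumed: 1 bit)."""
    value &= 0xFFFFFFFF
    return ((value << bits) | (value >> (32 - bits))) & 0xFFFFFFFF


def compute_hash(filename: str) -> int:
    """Hash a MIX entry name; bit width and byte order are assumptions."""
    data = filename.upper().encode("ascii")
    # Pad with zero bytes so the length is a multiple of 4.
    data += b"\x00" * (-len(data) % 4)
    result = 0
    for i in range(0, len(data), 4):
        # Assumption: each group is a little-endian 32-bit value.
        group = int.from_bytes(data[i:i + 4], "little")
        result = (rotate_left(result) + group) & 0xFFFFFFFF
    return result
```

Calling `compute_hash("GAMEINFO.DAT")` then gives the ID to look for in a MIX header, provided the assumptions above match what the game actually does.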
So far, so simple – however, that means that by just looking at the MIX file, there is no way to figure out what each container entry was called, or for that matter what type of file it represents. Looking through the resources contained in the game folder, I was nonetheless able to extract some of the original file names. This in turn let me figure out some of the data types and formats.
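To make the header description a bit more concrete, here is a small Python sketch that reads the entry table from a MIX file held in memory. I only described the three fields per entry above, so the rest is assumed: a 16-bit entry count followed by a 32-bit data size at the start of the file, three little-endian 32-bit fields per entry, and offsets that may well be relative to the end of the header. `MixEntry` and `read_mix_header` are names I made up for this example.

```python
import struct
from typing import NamedTuple


class MixEntry(NamedTuple):
    file_id: int  # hash of the original file name
    offset: int   # where the data starts (assumed relative to the header end)
    size: int     # length of the data in bytes


def read_mix_header(data: bytes) -> list[MixEntry]:
    """Parse the entry table of a MIX file (layout is an assumption)."""
    # Assumed: uint16 entry count, then uint32 total size of the data area.
    count, _data_size = struct.unpack_from("<HI", data, 0)
    pos = struct.calcsize("<HI")
    entries = []
    for _ in range(count):
        file_id, offset, size = struct.unpack_from("<III", data, pos)
        entries.append(MixEntry(file_id, offset, size))
        pos += struct.calcsize("<III")
    return entries
```

With a table like this, checking a candidate file name comes down to hashing it and seeing whether any entry carries that ID.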
Here is a preliminary list of file formats used by Blade Runner:
- VQA files – these are videos in a proprietary format, which has been documented extensively. However, when I tried to load some of the videos that come with Blade Runner, I didn’t have much luck. The game may be using a slightly different version of the file format than the documented cases.
- SHP files – these contain one or more sprites. The SHP format has been used and documented all the way back to the original Command & Conquer, but as with the VQA files, the format used by Blade Runner is slightly different: instead of the more complex compression used in earlier versions, in this iteration the file starts off with a count of images. Immediately after that, each image starts with a header that gives the width and height of the image, as well as the length of the data, so that one can find the start of the next image header in the file. The image data itself consists of uncompressed rows of 16-bit High-Color values.
- SET files – as far as I can tell, these files contain information on the sets / locations seen in the game. The file contains multiple series of records, the first of which seems to be the list of objects in that location.
- DAT files – these are in fact containers themselves, albeit more primitive than the MIX format. The files start with a header which contains the number of files, as well as the offset of each of them. I have not made much progress understanding the records themselves yet, unfortunately. It should be noted that the game installation directory contains a number of DAT files outside of the MIX files – presumably because of size restrictions of what could be contained in a MIX container.
- GAMEINFO.DAT – despite having a .DAT ending, this file is actually just a collection of resource names. For example, it contains the names of all sets, as well as audio files in the game.
- TRE files – this format is very similar to the DAT format, except that these files lack the initial magic number that identifies a DAT. As far as I can tell, they always contain just string resources. For example, the labels used by the UI are contained in one of the files.
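As an illustration of the Blade Runner SHP layout described in the list above, here is a minimal Python sketch of a reader. The field sizes are my assumption (32-bit little-endian integers for the image count, width, height and data length, in that order), as is the byte order of the 16-bit pixel values. `ShpImage`, `read_shp` and `pixel_at` are hypothetical names, not anything taken from the game.

```python
import struct
from typing import NamedTuple


class ShpImage(NamedTuple):
    width: int
    height: int
    pixels: bytes  # uncompressed rows of 16-bit values


def read_shp(data: bytes) -> list[ShpImage]:
    """Read all images from an SHP file (field widths are assumptions)."""
    (count,) = struct.unpack_from("<I", data, 0)
    pos = 4
    images = []
    for _ in range(count):
        width, height, length = struct.unpack_from("<III", data, pos)
        pos += 12
        images.append(ShpImage(width, height, data[pos:pos + length]))
        pos += length  # the length field leads to the next image header
    return images


def pixel_at(image: ShpImage, x: int, y: int) -> int:
    """Return the raw 16-bit value at (x, y), assuming little-endian pixels."""
    (value,) = struct.unpack_from("<H", image.pixels, 2 * (y * image.width + x))
    return value
```

Turning the raw 16-bit values into RGB still requires knowing the exact bit layout of the High-Color format, which this sketch deliberately leaves out.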
Additionally, there are some other formats contained in the MIX files which I haven’t yet looked into further. Among these are the following:
- AUD – audio files
- TLK – speech files
- FON – fonts
As I was exploring these formats, I created a command line tool which lets me browse the contents of the MIX files and load data from individual entries inside the container. The source of the project is available via my GitHub repository. It’s lacking quite a bit of polish at this time, so it will probably only prove useful for people willing to dig into the code and make changes as needed; but hey, maybe someone else will be able to contribute some more information on the remaining formats.
That being said, so far I have not found the voxel models that prompted me to start this whole endeavor. At this point, I suspect that the data is stored somewhere in the COREANIM.DAT container, since that file is read by the executable at start-up. Unfortunately, I don’t have any information on the actual file format yet. I’m hoping to post more as I find more time to dig into the data.