Infinity Engine File Format Hacking Project



Note: I am NOT the maintainer of Infinity Explorer. I cannot answer questions regarding the Infinity Explorer software. Infinity Explorer is written by and maintained by Dmitry Jemerov. To learn more about Infinity Explorer, infexp.sourceforge.net is a good place to start looking. I do NOT maintain any tools for download regarding the Infinity Engine games. TeamBG is probably a good place to look for tools. I will not respond to emails asking for information on the Infinity Explorer or editing tools. This web site is only about information on the file formats.



Site news

2002-06-19 Fixed error relating to WMAP files, and erroneous links in todo list. (Thanks to Lloyd Parkes).
2002-06-13 Fixed error relating to MOS files. (Thanks to Tristan Dopler).
2002-01-17 Started adding "raw" notes that have never been transcribed to HTML. I'm starting off with a reasonably complete listing of effect types as of BG1.
2000-10-11 Minor updates.
2000-10-10 Some updates from Dmitry Jemerov to the WMAP and ITM and SPL files. I've also corrected my heinous oversight in the CRE files section, which gave an incorrect item slot ordering. I will fill in the correct info tomorrow, for BG/BG2, and ASAP for Torment/IWD. For now, I have simply removed the incorrect information. Additionally, tomorrow, I will have all of the BG2 formats and format type codes listed here.
2000-10-09 Sorry for taking so long to update, gang; life has been busy. Anyway, given that the wait for this update has been so long, it will be a big one. Additionally, I hope to be making several updates over the next couple of weeks, as I uncover details in the areas I'm currently working on -- mostly details on how the BG2 CRE format crams so much more into the old BG1 format. See my notes in the CRE file docs for a hint. The contributors this time around are numerous and varied: Dmitry Jemerov (the usual torrential downfall of random formats and fixes, particularly to the AREA and WMAP formats), Brian Frappier (info on the GAME V2.0 familiars structure, which I am still working on), Aaron O'Neil (BAMC format), Michael Kay (various tidbits and clarifications in the AREA file format), and Paul Victorey (updates to WMAP file format). Included in this update are many BG2 updates, corrections, and info on the new BG2 compressed formats, the BIFC and BAMC and MOSC formats, as well as bits of C code for decompressing them.
2000-08-26 Thanks again to Eddy L O Jansson for his continuing work on improving the HTML and stylesheets for this project. By his suggestion, I've created a to do list for the project, which lists things people can be working on if they have the time and inclination to help out. Another recent improvement, by Brian Frappier's suggestion, is the highlighting of changed entries in a different color. Updated entries should appear in green, except for entirely new formats. (This doesn't appear to always work with Netscape... Netscape's support of CSS is actually rather lamentable. Ahh, well...) Finally, I will be out roaming the countryside for a while, since I'm moving back to the USA, and will be busy. I won't be entirely incommunicado, so if you email me with updates, I'll probably find time to merge them in.
2000-08-25 Thanks to Dmitry Jemerov for the EFF format, and to Joost Mans for continuing work on the item categories table.
2000-08-22 Thanks to Dmitry Jemerov for updates to several formats (CRE, ITM, and GAME), and to Joost Mans for updates to the item categories table.
2000-08-19 Thanks to Grazzt for filling in the new "weapons proficiency" fields in the CRE V9.0 (Icewind Dale creature) files.
2000-08-18 First off, there are some gaps filled in by Banelord, most of which I'm still sifting through, which should be added sometime soon. Still working on processing information from Icewind Dale, so expect updates regarding Icewind Dale in the next few days, I hope.
2000-08-14 Finally back from apartment hunting. A few updates from Joost Mans, who pointed out that I'd made an error in the palette entry formats. (They are RGBQUADs instead of PALETTEENTRYs). Also a minor update to the SPL format from Joost Mans. Also from Joost Mans, updates to the table of item categories -- specifically, Icewind Dale categories have been added (great sword, etc).
2000-07-23 I finished Icewind Dale a few days ago, and so I'm back to work on the project. Dmitry Jemerov has been hacking away like a madman, and has sent in several significant updates to the AREA file format, as well as a few other minor updates. He's also released Infinity Explorer 0.5 with support for Icewind Dale files. Go check it out!
2000-07-05 Major bugfixes to several sections sent in by Dmitry Jemerov. Also, Eddy L O Jansson has made a first pass over the documentation to make it validate strict and use CSS. Working on collating this information.
2000-07-02 Yumm! Icewind Dale! I haven't got the game just yet, but thanks to Brian Frappier, I've got a sampling of the small data files, so I'm working on the new formats already. Specifically, the format for the new compressed BIFF files (.cbf) is understood, and others are being examined. On the 5th of July, I should have a copy myself, though, admittedly, I'll probably spend some time playing it, which may preclude productive work on this project for a few days. :)
2000-06-26 Since this site covers Planescape: Torment file formats in addition to the Baldur's Gate file formats, it has been renamed to the "Infinity Engine File Format Hacking Project", from its old name, the "Baldur's Gate File Format Hacking Project".
2000-06-20 A majority of the structs are here. ITM and SPL info will be posted soon. There are several fields which are still unknown. Thanks to Thom Zakariassen for clarifications regarding the GAME structure.
2000-06-07 This page is now being maintained by me (Jed Wing). Not all of the information is posted yet, but within the next 3 or 4 days, I'll put up more and more information.
2000-06-03 This page is now extremely outdated. For the latest on BG-hacking check out Infinity Explorer.
2000-01-12 Small additions.
1999-09-10 First pass of table optimization. Contact info changed. New tools linked.
1999-09-02 Petr updated the page in the sections AREA, BAM and ITM.

Introduction

This document is focused on the technical aspects of the game. This is not the place to learn about cheating. If you want to use this information to create tools for cheating, go right ahead, but that is not the purpose of this project.

I (Jed Wing) have now assumed control of and responsibility for this page. I've been working on something similar since Baldur's Gate came out, and only recently became aware of this project. Since this page was no longer being maintained, I offered to take it over and supplement its findings with my own. I hope that any of you who have relevant information would be willing to help this document become more accurate, as well.

Much credit goes to Eddy L O Jansson, Robert Risberg, and Petr Zahradnik, who compiled much of this information.

Much credit also goes to Dmitry Jemerov, who has been an incredibly prolific source of information.

Finally, a note about what Eddy Jansson described as a "proprietary" attitude towards information people have uncovered on their own. I, too, believe this is a shame, for a number of reasons. Largely, it seems like a tremendous waste: duplicated effort, many people making the same mistakes, and a sort of latent hostility. The only motivation I can see for this is a sort of pre-pubescent braggadocio. Well, whatever. Information wants to be free.

Document Conventions

Each file format will have its own section. It will be represented in a tabular form, as follows:

Offset | Size (datatype) | Description
0x0000 | 4 (char array) | Signature ('FOO ')
0x0004 | 4 (char array) | Version ('V1  ')
0x0008 | 4 (dword) | Unknown (Grue count?)

Obviously, many of the fields are derived from guesswork. Beneath each table will be a discussion of the fields which I am not certain about. Fields specified in white are fairly certain to be correct. Fields which are red have not been verified. Fields which are blue are unknown.
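
To show how such a table is meant to be read, here is a minimal Python sketch that unpacks the made-up 'FOO ' header from the example above. The names FOO_HEADER and parse_foo_header are hypothetical, and treating the dword as unsigned is an assumption; this is an illustration of the table layout, not code from any real tool.

```python
import struct

# Layout of the made-up 'FOO ' header from the example table:
#   0x0000  4-byte char array  signature
#   0x0004  4-byte char array  version
#   0x0008  4-byte dword       unknown (little-endian, assumed unsigned)
FOO_HEADER = struct.Struct("<4s4sI")


def parse_foo_header(data):
    """Unpack the three example fields from the start of a byte string."""
    signature, version, unknown = FOO_HEADER.unpack_from(data, 0)
    return {
        "signature": signature.decode("ascii"),
        "version": version.decode("ascii"),
        "unknown": unknown,
    }


# A hand-built sample: the two char arrays followed by the dword 7.
sample = b"FOO V1  " + (7).to_bytes(4, "little")
header = parse_foo_header(sample)
```

Each row of a table becomes one field in the format string, and the offsets in the first column should match the running total of the sizes in the second.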

The data types which will be referenced in this paper are:

Data type | Description
char array | An array of ASCII characters, fixed in length
ASCIIZ string | An array of ASCII characters, terminated by a NUL character. Typically, char arrays in these files can be NUL-terminated -- i.e. part of the array may be filled with garbage, and this will cause no problem, as long as at least one byte earlier in the array is a NUL character (ASCII 00).
word | A little-endian "word" of 16-bits
dword | A little-endian "double-word" of 32-bits
point | A point within some reference frame, composed of two 16-bit words; the first is the x-coordinate and the second is the y-coordinate.
rect | A rectangle within some reference frame, composed of 4 16-bit words; the order of the coordinates is: left, top, right, bottom. Typically this is used to store bounding boxes of various objects.
strref | A reference into the 'TLK ' resource -- a 32-bit number which can be mapped to a string via a lookup into the TLK table.
resref | A reference to a specific resource -- an 8 character long string which is mapped to a resource via the KEY file and the override directory. Note that these are always 8 characters long, even though any characters after and including a NUL character in the name are ignored.
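
The data types above can be sketched as a handful of Python helpers. The function names are hypothetical, and reading every integer as unsigned is an assumption (the descriptions above only fix the width and byte order), so treat this as a rough sketch rather than a reference reader.

```python
import struct


def read_word(buf, off):
    """16-bit little-endian word (assumed unsigned)."""
    return struct.unpack_from("<H", buf, off)[0]


def read_dword(buf, off):
    """32-bit little-endian double-word (assumed unsigned)."""
    return struct.unpack_from("<I", buf, off)[0]


def read_strref(buf, off):
    """A strref is stored as a 32-bit number; the TLK lookup is done elsewhere."""
    return read_dword(buf, off)


def read_point(buf, off):
    """Two 16-bit words: x first, then y."""
    return struct.unpack_from("<HH", buf, off)


def read_rect(buf, off):
    """Four 16-bit words in the order left, top, right, bottom."""
    return struct.unpack_from("<4H", buf, off)


def read_resref(buf, off):
    """Always 8 bytes on disk; everything from the first NUL onward is ignored."""
    raw = buf[off:off + 8]
    name = raw.split(b"\x00", 1)[0]
    return name.decode("ascii")
```

Note that read_resref cuts at the first NUL before decoding, so garbage bytes after the terminator never reach the decoder.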

This document is probably rife with inaccuracies and inadequacies. If you believe parts of this to be in error, or you can help to fill in details of bits which are not yet known, please email me corrections.

When bits are numbered, they will be numbered with the least-significant bit as 0, and the most significant bit as 7, 15, or 31, for byte, word, or dword, respectively.
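
As a small illustration of this numbering, the following Python sketch tests a single bit of a value that has already been read as an integer. The helper name bit_is_set is hypothetical.

```python
def bit_is_set(value, bit):
    """Return True if the given bit is set, counting the least-significant bit as 0."""
    return bool((value >> bit) & 1)


# Bit 0 is the least-significant bit; bits 7, 15 and 31 are the
# most-significant bits of a byte, word and dword respectively.
lowest = bit_is_set(0x01, 0)
top_of_byte = bit_is_set(0x80, 7)
top_of_word = bit_is_set(0x8000, 15)
top_of_dword = bit_is_set(0x80000000, 31)
```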

General

Some internal use is made of the language Lua, developed by TeCGraf at the Pontifical Catholic University of Rio de Janeiro in Brazil. This only applies to Baldur's Gate, as Lua was carefully excised in Planescape: Torment. It is primarily used for the cheats/debugging console, into which complete Lua programs can, in fact, be typed. There is a reference manual online; it will take you a couple of hours if you read it all, much less if you just browse the highlights. As far as I've been able to determine, Lua does not actually play an important role in the internal functioning of the engine, and was likely added as a debugging aid. Note that the AI scripts have nothing to do with Lua.

File Formats

Without further ado, here are the file formats. File formats are identified in 3 different ways. First, most file formats have a numerical code, which we will call the resType. Second, all file formats have an extension, which is used to determine what format the data inside is expected to be. Third, most of the file formats are tagged -- i.e. the first 4 bytes of the file are a (character) code, determining the format, as well as another 4 byte code, determining the version of the format.

resType | extension | format tag | Notes | Description
N/A | .key | 'KEY ' | | Directory of resources, their locations, and their types.
N/A | .bif | 'BIFF' | | Archive containing resources, as indexed by the .KEY file.
N/A | .cbf | 'BIF ' | Icewind Dale only | Compressed archive containing resources, as indexed by the .KEY file.
N/A | .tlk | 'TLK ' | | Table in which strings (and occasionally, sounds) are looked up by strref.
N/A | .acm | ??? | | Music. Proprietary, and somewhat fiercely guarded by Interplay. Don't expect documentation of this format.
N/A | .mus | N/A | Text format | Music. I know nothing of this format.
0x0001 | .bmp | N/A | | Microsoft-endorsed standard for static graphics. This is, as often as not, used for storing palettes, rather than for storing bitmapped graphics. In this case, the file will be a 1x1 pixel image, with a full palette. 4, 8, and 24 bit BMPs are supported, but only uncompressed BMPs are supported.
0x0002 | .mve | ??? | | I do not know anything about this format. Also guarded by Interplay. Don't expect documentation for this format, either, as I am reluctant to displease Interplay and cause them to request these pages taken down.
0x0004 | .wav | 'WAVC' | | Sound files used throughout the game are stored in these formats. Note that 'WAVC' and 'RIFF' .wav files are used interchangeably throughout the engine; WAVC is an internal format -- more precisely, an ACM file with a header attached to simplify buffer estimation during file decompression.
0x0004 | .wav | N/A | | RIFF wave files. A published format. The Microsoft mmio* routines are not used for WAV reading. As a result, the file must be of the very straight-forward variety. Fortunately, most WAV files are.
0x0005 | .wfx | ??? | | Possibly some sort of bitmapped graphics? Will advise as more information becomes available.
0x0006 | .plt | 'PLT ' | | A bitmapped graphics format used for paper dolls. I understand this format now, and will be writing it up soon. I believe that it basically consists of interleaved bytes of 'color type' and 'intensity'.
0x03e8 | .bam | 'BAM ' | | Used for animations as well as for multi-frame static graphics, this is a format supporting multiple animation cycles, each containing multiple frames. The GUI uses these extensively, as all the controls (buttons, sliders, etc.) are represented by these files.
0x03e8 | .bam | 'BAMC' | Baldur's Gate 2 only | Simple zlib based compression format. Essentially, an entire BAM file is compressed using zlib and a small header prepended.
0x03e9 | .wed | 'WED ' | | Represents the graphics of a region. Some connectivity information appears here, though probably only for clipping purposes -- i.e. we have 2-d maps and tiles, but we need to simulate a 3-d environment. Thus, we store the walls in here so that we know which parts are raised, so that a person walking behind a wall is clipped. Anyway, this file type contains lists of regions (overlays), details for how animated tiles are to be animated, and which tiles change when doors are opened.
0x03ea | .chu | 'CHUI' | | A representation (a la Windows dialog templates) of GUI elements. Basically, it is a list of 'windows', which may optionally have a .mos as a background, and a list of 'controls' for each window. The controls include 'slider controls', 'text fields', 'buttons', 'scroll bars' and a few other assorted types.
0x03eb | .tis | N/A | | This is the tileset information used for painting the screens. A tileset is basically an array of tiles which are composed of a palette of 256 24-bit colors and a block of pixels (typically 64x64) which are to be painted using that palette.
0x03ec | .mos | 'MOS ' | | Yet another tiled file-format. This is used for backgrounds for GUI windows and for the overhead map of regions. It is likely that this is stored in tile format only because of the compression advantages, as opposed to the .tis files which are stored in tiled format because of the pragmatic advantage of being able to load tiles quickly on demand.
0x03ec | .mos | 'MOSC' | | Compressed format, exactly like BAMC format, except with a different signature field in the header.
0x03ed | .itm | 'ITM ' | | Objects which may appear in either the player character's inventory, or in various creatures' 'inventory' are stored in these files.
0x03ee | .spl | 'SPL ' | | Spells are stored in this format. This includes wizard spells (spwi*), priest spells (sppr*), and innate spells (spin*), as well as any spells which monsters have which are unavailable to the user.
0x03ef | .bcs | N/A | Text format | Compiled script files, as are output by the script compiler.
0x03f0 | .ids | N/A | Text format | A mapping from numbers to text, typically giving descriptive names or labels to engine internals. For instance the exported functions that can be accessed from scripts are given IDs in one of these files.
0x03f1 | .cre | 'CRE ' | | All the monsters in the game are stored in this format, which associates statistics, graphics, and AI scripts to baddies.
0x03f2 | .are | 'AREA' | | A description of an area, but more schematic than WED files. The AREA file contains descriptions of where containers, doors, actors, and items are in the area.
0x03f3 | .dlg | 'DLG ' | | All inter-character dialog is scripted using these files.
0x03f4 | .2da | '2DA ' | Text format. | Note: Do not count on the signature being at the beginning of the file. The reason for this is twofold. First, these are text files and may have spaces before the signature. Second, these may be encrypted with a simple XOR key. (Enough to stop a snooper, but not enough to keep a determined intruder out.) A "two-dimensional array" file format which has, in addition to a 2-dim array of strings, column and row headers. Typically used for storing the AD&D rulesets.
0x03f5 | .gam | 'GAME' | | Save game file format -- stores the current state of the party and of the internal variables.
N/A | .sav | 'SAV ' | | Save game file format -- stores the current state of the areas the party has visited.
0x03f6 | .sto | 'STOR' | | Store file format. Stores information on a store's stock, its prices, and what it's willing to buy.
0x03f7 | .wmp | 'WMAP' | | World map file format. This stores information on which areas are located where on the world-map, and which graphics to use for them. Work in progress!
0x03f8 | .chr | 'CHR ' | | Exported player characters are stored in this format, which actually contains a .cre file in its entirety.
0x03f8 | .eff | 'EFF ' | ToTSC and IWD and BG2 | The EFF V2.0 format replaces the old 30-byte effect structure found in CRE files and ITM files, and partially documented in the effects section. The EFF V2.0 format can be found either as a standalone file, or, if certain flags are set, in CRE files, and possibly in SPL and ITM files.
0x03f9 | .bs | N/A | Text format | Principally the same as the .bcs file, these are only used for character control scripts. It is likely that they are restricted to a subset of the functions callable from a .bcs.
0x03fa | .chr | 'CHR ' | | Character files. Used to be 0x3f8, but now that's EFF, so... Dunno...
0x03fb | .vvc | ??? | | Visual 'spell casting' effects are somehow described by these files. It is not known how.
0x03fc | .vef | ??? | Baldur's Gate 2 only | Visual effects (possibly OpenGL effects?)
0x03fd | .pro | ??? | | Description of 'projectile' types
0x03fe | .bio | ??? | Baldur's Gate 2 only | Stores the edited biography of characters?
0x044c | .bah | ??? | Baldur's Gate 2 only | Unknown
N/A | .baf | N/A | Text format | This is the file format used for scripts for the Infinity Engine, both for character control and, one would assume, for scripted events (although scripted events can call a broader range of functions than character AI scripts). These files compile to either .bs or .bcs files.
0x0802 | .ini | N/A | Text format; Torment and Icewind Dale | This is basically the windows .ini file format. It is focused on storing things like quest information and respawn information for areas.
0x0803 | .src | N/A | Torment only | This is a binary file format, though very simple. It is used to determine the text that appears over people's heads on the overland screen.
N/A | .toh | 'TLK ' | Icewind Dale only? | "Talk Override Header". This is used for overrides to specific entries in the TLK file. There is evidence that it may exist in earlier versions, but I've only seen it used in Icewind Dale, where it is used for the purpose of customizing character biographies. It is used in conjunction with the .tot file.
N/A | .tot | N/A | Icewind Dale only? | "Talk Override Text". This is used for overrides to specific entries in the TLK file. There is evidence that it may exist in earlier versions, but I've only seen it used in Icewind Dale, where it is used for the purpose of customizing character biographies. It is used in conjunction with the .toh file.
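
For the tagged formats, identification by the first 8 bytes can be sketched roughly as follows in Python. The TAGGED_FORMATS dictionary covers only a subset of the tags listed in the table, and the names TAGGED_FORMATS and identify, as well as the 'V1  ' version used in the usage line, are hypothetical. Untagged formats (.tis, .bmp, the text formats) and 2DA files, whose signature may not sit at offset 0, are deliberately not handled.

```python
# A subset of the 4-byte format tags from the table above.
# Note that 'TLK ' is listed for both .tlk and .toh, so the tag alone
# cannot tell those two apart; the extension is still needed.
TAGGED_FORMATS = {
    b"KEY ": "resource directory (.key)",
    b"BIFF": "resource archive (.bif)",
    b"BIF ": "compressed resource archive (.cbf)",
    b"TLK ": "string table (.tlk) or talk override header (.toh)",
    b"WAVC": "compressed sound (.wav)",
    b"BAM ": "animation (.bam)",
    b"BAMC": "zlib-compressed animation (.bam)",
    b"MOS ": "tiled background (.mos)",
    b"MOSC": "zlib-compressed tiled background (.mos)",
    b"ITM ": "item (.itm)",
    b"SPL ": "spell (.spl)",
    b"CRE ": "creature (.cre)",
    b"AREA": "area (.are)",
    b"GAME": "saved game state (.gam)",
    b"WMAP": "world map (.wmp)",
}


def identify(data):
    """Return (tag, version, description), or None if no known tag is found.

    Bytes 0-3 are taken as the format tag and bytes 4-7 as the version code.
    """
    if len(data) < 8:
        return None
    tag, version = data[:4], data[4:8]
    description = TAGGED_FORMATS.get(tag)
    if description is None:
        return None
    return tag.decode("ascii"), version.decode("ascii", errors="replace"), description


# Usage with a hand-built header (the version string here is made up):
result = identify(b"BAMCV1  " + b"\x00" * 4)
```

Returning None for anything unrecognized keeps the caller honest: a missing tag means "fall back to the extension or resType", not "this file is invalid".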

Other miscellaneous bits

Credits

Thanks go first, and foremost, to the teams at Bioware, Black Isle, and Interplay, for the fantastic games they've produced.

Eddy L O Jansson and Robert Risberg were responsible for starting this project and obviously for very major contributions. Eddy Jansson continues to make contributions, including work to make these pages validate strict.

Thanks also to Dmitry Jemerov who, in addition to publishing an excellent resource viewer (under the GNU General Public License, no less), has shared his information, and continued to analyze the game data, which has been very useful in filling in gaps and correcting errors.

Thanks to Brian Frappier for supplying some sample data files for examination before I managed to get IWD, and for his continued dialog on issues regarding the project and the Infinity Engine.

Petr Zahradnik has made major contributions.

Many warm thanks to Aaron O'Neil for publishing the source to his GateKeeper editor. It helped us flesh out parts of the CRE structure (especially the spell table), and his analysis of the GAME V1.1 structure was excellent. Aaron O'Neil was also responsible for originally pointing out the new compressed BAM file format.

Thanks to Thom Zakariassen for GAME struct related clarifications.

Thanks to Joost Mans for many recent clarifications and bug fixes.

Thanks to Grazzt for info on Icewind Dale weapons proficiencies.

Thanks to Banelord and Daelomin for helping to fill in some gaps.

Thanks to Michael Kay for clarifications to the AREA file format.

Thanks to Paul Victorey for a very lucid description of the WMAP format.

And for the sake of completeness, thanks to the folks at Datarescue for producing IDA Pro, which is the best disassembler on the market, and without which this information would be much less complete and much less accurate.