The PDB File Format

    At the same time, LLVM has a long history of being able to cross-compile from any platform to any platform, and we wish for the same to be true here. So it is necessary for us to understand the PDB file format at the byte level so that we can generate PDB files entirely on our own.

    This manual describes what we know about the PDB file format today: the layout of the file, the various streams contained within, the format of individual records within those streams, and more.

    Important

    Unless otherwise specified, all numeric values are encoded in little endian. If you see a type such as uint64_t going forward, always assume it is little endian!
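
    To make the byte order concrete, here is a minimal sketch of decoding little-endian fields from a raw buffer with Python's standard struct module. The helper names (read_u16, read_u32, read_u64) are illustrative only and do not correspond to any LLVM API.

```python
import struct

# "<" selects little-endian byte order with standard sizes and no padding.
# H = unsigned 16-bit, I = unsigned 32-bit, Q = unsigned 64-bit.

def read_u16(buf: bytes, offset: int = 0) -> int:
    """Decode a little-endian uint16_t at the given byte offset."""
    return struct.unpack_from("<H", buf, offset)[0]

def read_u32(buf: bytes, offset: int = 0) -> int:
    """Decode a little-endian uint32_t at the given byte offset."""
    return struct.unpack_from("<I", buf, offset)[0]

def read_u64(buf: bytes, offset: int = 0) -> int:
    """Decode a little-endian uint64_t at the given byte offset."""
    return struct.unpack_from("<Q", buf, offset)[0]

# The least significant byte comes first in the file.
raw = bytes([0x78, 0x56, 0x34, 0x12])
value = read_u32(raw)  # 0x12345678
```

    The same bytes read as big endian would give 0x78563412, which is why the byte order has to be fixed before interpreting any field.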

    For more information about the MSF container format, stream directory, and block layout, see The MSF File Format.

    Streams

    The PDB format contains a number of streams that describe the types, symbols, source files, and compilands (e.g. object files) of a program. Additional streams contain hash tables that debuggers and other tools use to look up records and types quickly by name, along with other information about how the program was compiled, such as the specific toolchain used. A summary of the streams contained in a PDB file is as follows:

    • Information about the PDB Info Stream and how it is used to match PDBs to EXEs.
    • Information about the TPI and IPI streams and the CodeView records contained within.
    • Information about the DBI stream and relevant substreams, including the Module Substreams, source file information, and CodeView symbol records contained within.
    • Information about the Module Information Stream, of which there is one for each compilation unit, and the format of symbols contained within.
    • Information about the Public Symbol Stream.
    • Information about the Global Symbol Stream.
    • Information about the serialized hash table format used internally to represent things such as the Named Stream Map and the Hash Adjusters in the TPI/IPI Stream.
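
    As a rough orientation for the list above: a handful of these streams sit at fixed indices in the MSF stream directory, while the others (module, public symbol, and global symbol streams, for example) are found through indices recorded in other streams. The sketch below writes down the fixed indices as commonly documented for PDB files. The enum and function names are hypothetical and are not part of any existing library.

```python
from enum import IntEnum

class FixedStream(IntEnum):
    """Streams assumed to live at fixed indices in the MSF stream directory."""
    OLD_MSF_DIRECTORY = 0  # previous copy of the MSF stream directory
    PDB_INFO = 1           # PDB Info Stream (used to match PDBs to EXEs)
    TPI = 2                # type records
    DBI = 3                # module, source file, and symbol information
    IPI = 4                # ID records

def describe_stream(index: int) -> str:
    """Return the fixed stream name for an index, or a note if it is not fixed."""
    try:
        return FixedStream(index).name
    except ValueError:
        # Any other stream is located via an index stored elsewhere in the PDB.
        return "non-fixed (index recorded elsewhere in the PDB)"
```

    For example, describe_stream(3) names the DBI stream, while an index such as 17 is reported as non-fixed because its meaning depends on the contents of the particular file.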

    CodeView is another format that comes into the picture. While MSF defines the structure of the overall file, and PDB defines the set of streams that appear within the MSF file and the format of those streams, CodeView defines the format of the symbol and type records that appear within specific streams. Refer to the pages on CodeView Symbol Records and CodeView Type Records for more information about the CodeView format.
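
    To illustrate how CodeView sits inside a stream, the following sketch walks a buffer of back-to-back CodeView records. It assumes the usual record prefix: a little-endian uint16_t length (counting the bytes that follow the length field) and then a uint16_t record kind. It also assumes that any stream header or signature has already been skipped. The function name iter_records and the record kinds in the sample are made up for illustration.

```python
import struct
from typing import Iterator, Tuple

def iter_records(buf: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (kind, payload) for each length-prefixed record in buf.

    Fewer than 4 trailing bytes cannot hold a record prefix and are ignored.
    """
    offset = 0
    while offset + 4 <= len(buf):
        # Prefix: uint16 length (excludes the length field itself), uint16 kind.
        record_len, kind = struct.unpack_from("<HH", buf, offset)
        if record_len < 2:
            raise ValueError("record length too small to hold the kind field")
        end = offset + 2 + record_len
        if end > len(buf):
            raise ValueError("record extends past the end of the buffer")
        yield kind, buf[offset + 4:end]
        offset = end

# Two hypothetical records: kind 0x1111 with a 4-byte payload,
# then kind 0x2222 with an empty payload.
sample = (
    struct.pack("<HH", 6, 0x1111) + b"\xAA\xBB\xCC\xDD"
    + struct.pack("<HH", 2, 0x2222)
)
records = list(iter_records(sample))
```

    The walker needs no knowledge of individual record kinds. It relies only on the length prefix, so a tool can skip records it does not understand.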