History and Process
What is DWARF?
- DWARF is a debugging format used to describe programs in C and other similar programming languages. It is most widely associated with the ELF object format but it has been used with other object file formats.
Why is it called DWARF? And why isn't it spelled "Dwarf"?
- It's a pun, since it was developed along with ELF, the Executable and Linking Format (nee Extensible Linking Format). Brian Russell, the original developer of DWARF, christened it the "Debugging With Attributed Record Formats".
Where did DWARF come from?
- DWARF was originally developed by Bell Labs for use with the System V debugger named sdb. This format was standardized as DWARF v. 1.0 by the PLSIG (Programming Languages Special Interest Group) of Unix International.
How can I submit a proposal to change or extend the DWARF specification?
- Please see Submitting A Proposal.
Is there a movement from procedural languages to non-procedural languages in today's application context?
- No. DWARF is designed to describe compiled procedural languages such as C, C++, Java, Fortran, and similar languages.
What advantages does DWARF have over STABS?
- DWARF is a block-structured and extensible description of a program's source and how it is translated into executable code. It's easy to add new descriptions or extend the descriptions in DWARF. STABS is much more restricted in its expressive abilities. It depends on predefined symbol and type definitions and is not easily modified or extended. Additionally, DWARF has facilities for describing a more complex execution environment, such as discontiguous scopes, stack structures, and stack unwinding, which STABS cannot describe.
Is there an archive for the previous mailing list hosted by SGI?
- Yes. You can find it here.
Is DWARF associated with XCOFF object format?
- It is reasonably likely someone has used DWARF with XCOFF but a specific implementation is not known. References to an implementation would be welcome.
What is the Software Licensing Agreement for DWARF?
You write that the DWARF standard (I mean the text) is under GNU FDL, but if you read the GNU FDL, you should have noticed that when something is released under GNU FDL it should be written in something which is widespread or at least readable as plain text.
DWARF Format Questions
How many DW_TAG_compile_unit entries per Compilation Unit Header?
- Each Compilation Unit Header should be followed by exactly one DW_TAG_compile_unit or one DW_TAG_partial_unit, and the children of the DW_TAG_compile_unit or DW_TAG_partial_unit contain Debugging Information Entries for the unit. A DW_TAG_compile_unit or DW_TAG_partial_unit has no sibling entries.
Why doesn't the line table 'basic block' register have a reset operation?
- It doesn't need one.
The table is based on creating row entries, conceptually a row entry for every pc value in the executable text. The per-row booleans in the line table (basic_block, prologue_end, and epilogue_begin) are reset to false by the creation of a new row in the table (see the individual opcodes that create table rows to see this). The is_stmt register keeps its value until DW_LNS_negate_stmt toggles it, and end_sequence is cleared when DW_LNE_end_sequence resets all registers for the next sequence. Each row in the line table is defined by a sequence of one or more line table opcodes and the opcodes precisely define the value of every column of every row.
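As a rough illustration of why no reset opcode is needed, here is a minimal Python sketch of the row-appending step. The `LineState` class and the `emit_row` helper are hypothetical names invented for this example, and the initial `is_stmt` value is an assumption (in a real line table it comes from the header). The sketch covers only basic_block, prologue_end, and epilogue_begin.

```python
from dataclasses import dataclass, replace


@dataclass
class LineState:
    """A few of the line-table state machine registers (illustrative subset)."""
    address: int = 0
    line: int = 1
    is_stmt: bool = True          # assumed default; really taken from the header
    basic_block: bool = False
    prologue_end: bool = False
    epilogue_begin: bool = False


def emit_row(state, rows):
    """Append a copy of the current registers, then clear the per-row flags."""
    rows.append(replace(state))   # replace() with no changes makes a copy
    state.basic_block = False
    state.prologue_end = False
    state.epilogue_begin = False


rows = []
state = LineState()
state.basic_block = True          # what a "set basic block" opcode would do
emit_row(state, rows)             # first row records basic_block=True
state.address += 4
state.line += 1
emit_row(state, rows)             # second row sees the flag already cleared

print([r.basic_block for r in rows])   # [True, False]
```

Because appending a row clears the flag, a producer only ever needs an opcode that sets it.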
How big is a DW_FORM_ref_addr?
- In DWARF3, DW_FORM_ref_addr is clearly defined as being an offset into the .debug_info section so the reference value is the size of an offset. In DWARF2 DW_FORM_ref_addr was (confusingly) defined as being the size of an address on the target machine. The DWARF2 definition never made any sense and was a mistake in the DWARF2 specification: the field DW_FORM_ref_addr defines is an offset, not an address.
Whether producing DWARF2 or DWARF3, please use the DWARF3 definition of DW_FORM_ref_addr.
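To make the difference concrete, the following Python sketch reads a DW_FORM_ref_addr value using the DWARF3 definition (the size of a section offset). The function name `read_ref_addr` and its parameters are made up for this example. It assumes 32-bit DWARF, where an offset is 4 bytes, and little-endian data.

```python
def read_ref_addr(data, pos, offset_size=4, byteorder="little"):
    """Read a DW_FORM_ref_addr value as a section offset (DWARF3 definition).

    offset_size is 4 for 32-bit DWARF and 8 for 64-bit DWARF; it does not
    depend on the size of a target address.
    Returns (value, position of the next byte).
    """
    end = pos + offset_size
    if end > len(data):
        raise ValueError("truncated DW_FORM_ref_addr")
    return int.from_bytes(data[pos:end], byteorder), end


# On a target with 8-byte addresses and 32-bit DWARF, the reference still
# occupies only 4 bytes; the bytes that follow belong to the next attribute.
data = bytes.fromhex("2a000000" "ffffffff")
value, next_pos = read_ref_addr(data, 0)
print(value, next_pos)   # 42 4
```

A reader that followed the DWARF2 wording on such a target would consume 8 bytes and misparse everything after the reference.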
What is a state machine which is used to decode the byte stream of line and file debug information?
- A state machine is a form of virtual special-purpose computer. The intent is to make the 'line table' be as compact (on disk) as possible while still allowing very detailed line positions to be recorded. The state machine 'executes' line table 'instructions' and constructs a 'line table' in a form readily usable by an application (such as a debugger).
What is the basic logic behind the extended, standard and special opcodes?
- The goal is maximum density.
The 'instructions', the opcodes, take as little space as possible yet faithfully represent much detail about the source lines (and how they relate to the object code). Most opcodes are special opcodes. These encode (in a single byte) both the opcode and a machine address and (effectively) a range of source lines. Standard opcodes take a bit more space and represent special information. Extended opcodes take even more space and encode a variable-length instruction.
- This design is effectively a fourth-generation line table.
All generations were designed by one person (with help, of course, and over several years). Earlier generations were originally used by MIPS COFF (generation 1) and Borland (generations 2 and 3?).
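The density of special opcodes comes from packing an address advance and a line advance into one byte. The Python sketch below shows that arithmetic. The function names are hypothetical, and the header values used in the example (opcode_base 13, line_base -5, line_range 14) are assumptions chosen for illustration; a real reader takes them from the line table header.

```python
def decode_special_opcode(opcode, opcode_base, line_base, line_range,
                          min_inst_length=1):
    """Split a special opcode into (address advance, line advance)."""
    if not opcode_base <= opcode <= 255:
        raise ValueError("not a special opcode")
    adjusted = opcode - opcode_base
    address_advance = (adjusted // line_range) * min_inst_length
    line_advance = line_base + (adjusted % line_range)
    return address_advance, line_advance


def encode_special_opcode(address_advance, line_advance, opcode_base,
                          line_base, line_range):
    """Return the single-byte opcode, or None if the pair does not fit.

    address_advance is counted in units of the minimum instruction length.
    """
    if not line_base <= line_advance < line_base + line_range:
        return None
    opcode = (line_advance - line_base) + line_range * address_advance + opcode_base
    return opcode if opcode <= 255 else None


# Assumed header values, for illustration only.
HEADER = dict(opcode_base=13, line_base=-5, line_range=14)

print(decode_special_opcode(75, **HEADER))      # (4, 1)
print(encode_special_opcode(4, 1, **HEADER))    # 75
```

When a pair does not fit in one byte, a producer falls back to the larger standard opcodes for the advance, which is the space trade-off described above.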
Code and Technology
Is it possible to download C source code to parse a DWARF-2 file?
Where can I find examples in C and other languages of using exception handling and other features? It would be great for programmers to be able to look at working examples.
How can I parse the sections of .debug_info and .debug_abbrev which belong in a file format?
Are there any tools available for editing, compacting, or selectively removing DWARF symbols from object files?
Why does my debugger quit as soon as I start debugging?
- Please contact your debugger supplier.
Is there any software that can read DWARF data and output the size and offset of struct fields (and class data members)?
There is lots of information in DWARF and no tool presently does precisely this. Yet there are tools that make it fairly straightforward to get this information. Because C++ class information is complicated by its nature, this is not a simple task. All the open-source code mentioned below has license terms. Be sure to understand and obey those terms if you use any of the code and applications mentioned.
Readelf is a GNU binutils application that can do many things, one of which is printing DWARF DIEs and attributes as text. A script or program could read this text and find and interpret the desired information. If (instead of just running readelf) you borrow code from readelf you must obey readelf's license terms, of course.
The GNU gdb debugger reads DWARF directly from object files. That code could be adapted. Or gdb could be used itself as a 'backend'. See the gdb MI interface documentation for examples of one way to use gdb as a 'backend'.
Dwarfdump is an application (packaged with libdwarf) that can print DWARF DIEs and attributes as text. A script or program could read this text and find and interpret the desired information.
Libdwarf is a C library API for reading DWARF information (packaged with dwarfdump).
Llvm-dwarfdump is part of the LLVM project. In addition to pretty-printing DWARF it can also be used to query the debug information, print debug info quality metrics, and verify the structural integrity of DWARF debug information.
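As a rough sketch of the "script that reads this text" approach, the Python below extracts struct member offsets from a DIE dump. The embedded sample is a simplified, hand-written approximation of the text that readelf prints for .debug_info; real output varies between tool versions, so treat the regular expressions as assumptions to adapt. The function name `struct_member_offsets` is made up for this example, and only members whose DW_AT_data_member_location prints as a plain integer are converted to numbers.

```python
import re

# Hand-written sample in the general shape of a readelf DIE dump (assumed).
SAMPLE = """\
 <1><2d>: Abbrev Number: 2 (DW_TAG_structure_type)
    <2e>   DW_AT_name        : (indirect string, offset: 0x10): point
    <32>   DW_AT_byte_size   : 8
 <2><38>: Abbrev Number: 3 (DW_TAG_member)
    <39>   DW_AT_name        : x
    <3d>   DW_AT_data_member_location: 0
 <2><3e>: Abbrev Number: 3 (DW_TAG_member)
    <3f>   DW_AT_name        : y
    <43>   DW_AT_data_member_location: 4
 <2><44>: Abbrev Number: 0
"""

DIE_RE = re.compile(r"^\s*<(\d+)><[0-9a-f]+>: Abbrev Number: \d+(?: \((DW_TAG_\w+)\))?")
ATTR_RE = re.compile(r"^\s*<[0-9a-f]+>\s+(DW_AT_\w+)\s*:\s*(.*)$")


def struct_member_offsets(text):
    """Return {struct name: [(member name, offset), ...]} from dump text."""
    # Pass 1: collect each DIE with its nesting depth, tag, and attributes.
    dies = []
    for line in text.splitlines():
        m = DIE_RE.match(line)
        if m:
            dies.append({"depth": int(m.group(1)), "tag": m.group(2), "attrs": {}})
            continue
        m = ATTR_RE.match(line)
        if m and dies:
            # Keep only the text after an "(indirect string, ...): " prefix.
            dies[-1]["attrs"][m.group(1)] = m.group(2).strip().split("): ")[-1]

    # Pass 2: attach each DW_TAG_member to the struct directly enclosing it.
    layouts = {}
    stack = []                       # (depth, member list) of open structs
    for die in dies:
        while stack and die["depth"] <= stack[-1][0]:
            stack.pop()
        name = die["attrs"].get("DW_AT_name", "<anonymous>")
        if die["tag"] == "DW_TAG_structure_type":
            members = layouts.setdefault(name, [])
            stack.append((die["depth"], members))
        elif die["tag"] == "DW_TAG_member" and stack and die["depth"] == stack[-1][0] + 1:
            raw = die["attrs"].get("DW_AT_data_member_location", "")
            try:
                offset = int(raw, 0)
            except ValueError:
                offset = raw         # e.g. a location expression; left as text
            stack[-1][1].append((name, offset))
    return layouts


print(struct_member_offsets(SAMPLE))   # {'point': [('x', 0), ('y', 4)]}
```

C++ classes with inheritance, bit-fields, or location expressions need more work than this, which is the complication mentioned above.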
Where can I find a reference of the DWARF debugging symbols produced by a GCC compiler on various platforms?
- For information about how GCC or any other compiler implements DWARF, please contact the developer or distributor for that compiler.
Does Visual Studio .NET support the DWARF debugging standard?
What's the best C* DWARF WRITER (API) around?
Is it possible to access local or global variables (i.e. getting their stored values) of a running program without stopping its execution using its debugging information?
- This question is really about operating systems and debuggers and compilers, not so much about DWARF.
A short answer is that it is possible to access global variables in a running program from some debuggers running on some operating systems against applications compiled by some compilers. Whether one can find object information on disk (such as the DWARF information) for a running application also depends on the operating system. In most situations it makes no sense to think about accessing local variables as it's hard to tell at any point when any given local variable is still live: by the time one has finally determined a variable is live it may have vanished or moved.