Best Practices

From Dwarf Wiki

DWARF Best Practices

Compilation Unit Names

For DW_TAG_compile_unit and DW_TAG_partial_unit DIEs, the DW_AT_name attribute should contain the path name of the primary source file from which the compilation unit was derived (see Section 3.1.1). If the compiler was invoked with a full path name, it is recommended to use the path name as given to the compiler, although it is acceptable to convert it to an equivalent path in which no component is a symbolic link. If the compiler was invoked with a relative path name, it is recommended to use the relative path name as given to the compiler; a consumer must be able to locate the source file by combining the compilation directory (see DW_AT_comp_dir) with the relative path name.
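
As an illustration of the consumer side of this rule, the following is a minimal sketch of how a consumer might combine the two attributes. The function name resolve_cu_source is hypothetical, and POSIX-style paths are assumed.

```python
import posixpath


def resolve_cu_source(comp_dir: str, name: str) -> str:
    """Return the path a consumer could use to locate the primary source file.

    comp_dir is the value of DW_AT_comp_dir and name is the value of
    DW_AT_name on the unit DIE (hypothetical helper, POSIX paths assumed).
    """
    if posixpath.isabs(name):
        # A full path name is used exactly as the producer recorded it.
        return name
    # A relative name is interpreted relative to the compilation directory.
    # The result is deliberately not normalized: collapsing ".." components
    # could change the meaning of the path when symbolic links are involved.
    return posixpath.join(comp_dir, name)
```

For example, a compilation directory of /home/user/build and a name of src/main.c resolve to /home/user/build/src/main.c, while an absolute name is returned unchanged.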

Names of Program Entities

For modules, subroutines, variables, parameters, constants, types, and labels, the DW_AT_name attribute should contain the name of the corresponding program object as it appears in the source code, without any qualifiers such as namespaces, containing classes, or modules (see Section 2.15). A consumer can easily reconstruct the fully-qualified name from the DIE hierarchy. In general, the value of DW_AT_name should be such that a fully-qualified name constructed from the DW_AT_name attributes of the object and its containing objects will uniquely represent that object in a form natural to the source language.
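
The reconstruction described above can be sketched as a walk up the parent chain. This is only an illustration: the Die class is a hypothetical in-memory model rather than the API of any real DWARF library, the set of scope tags is an assumption, and the "::" separator assumes C++.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Die:
    """Hypothetical in-memory model of a DIE (not a real library type)."""
    tag: str
    name: Optional[str] = None      # value of DW_AT_name, if present
    parent: Optional["Die"] = None  # enclosing DIE in the hierarchy


# Tags assumed here to contribute a qualifier to the name.
SCOPE_TAGS = {
    "DW_TAG_namespace",
    "DW_TAG_module",
    "DW_TAG_class_type",
    "DW_TAG_structure_type",
    "DW_TAG_union_type",
}


def qualified_name(die: Die, sep: str = "::") -> str:
    """Build a fully-qualified name from unqualified DW_AT_name values."""
    parts = [die.name or ""]
    scope = die.parent
    while scope is not None:
        # Unnamed scopes (e.g. anonymous namespaces) and non-scope DIEs such
        # as the compilation unit itself are skipped in this sketch.
        if scope.tag in SCOPE_TAGS and scope.name:
            parts.append(scope.name)
        scope = scope.parent
    return sep.join(reversed(parts))
```

With a DW_TAG_subprogram named area inside a DW_TAG_class_type named Shape inside a DW_TAG_namespace named geometry, this yields geometry::Shape::area.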

For template instantiations, the DW_AT_name attribute should contain both the source language name of the object and the template parameters that distinguish one instantiation from another. The resulting string should be in the natural form for the language, and should have a canonical representation (i.e., different producers should generate the same representation). For C++, the string should match that produced by the target platform's canonical demangler; spaces should only be inserted where syntactically required by the compiler.

The producer may also generate a DW_AT_linkage_name attribute for program objects, but the presence of this attribute should never be required to distinguish one program object from another. The DIE hierarchy is able to provide qualifiers for the name, and the DW_AT_name attribute itself provides template parameters. In the case of overloaded functions, the DW_TAG_formal_parameter DIEs belonging to the function DIE can provide the necessary information to distinguish one overload from another. In many cases, however, it is expensive for a consumer to parse the hierarchy, and the presence of the mangled name may be beneficial to performance. In other cases, the producer may choose to generate a limited subset of debug information, and the mangled name may substitute for the missing information.
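
To illustrate how formal parameters can separate overloads without a linkage name, here is a small sketch. The Param and Subprogram classes are hypothetical stand-ins for DW_TAG_formal_parameter and DW_TAG_subprogram DIEs, and type_name is assumed to have already been resolved by following each parameter's DW_AT_type reference.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Param:
    """Hypothetical model of a DW_TAG_formal_parameter DIE."""
    type_name: str            # assumed already resolved via DW_AT_type
    artificial: bool = False  # DW_AT_artificial, e.g. an implicit "this"


@dataclass
class Subprogram:
    """Hypothetical model of a DW_TAG_subprogram DIE."""
    name: str                 # unqualified DW_AT_name
    params: List[Param] = field(default_factory=list)


def signature(fn: Subprogram) -> str:
    """Render a display signature that tells overloads apart."""
    visible = [p.type_name for p in fn.params if not p.artificial]
    return f"{fn.name}({', '.join(visible)})"
```

Two overloads named draw, one taking an int and one taking a double, render as draw(int) and draw(double), so a consumer can tell them apart from the DIEs alone.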

Generating .debug_aranges data

Some compilers do not generate a .debug_aranges section for a CU that has no address ranges. However, this means that a reader has no way to tell whether the .debug_aranges data is missing because the CU has no ranges or because the writer did not emit it for some other reason.

If a compiler generates .debug_aranges sections, it should generate them for all compilation units, even when a compilation unit has no address ranges. This allows a consumer to distinguish a compilation unit that has no ranges from a compilation unit generated by a compiler that does not generate .debug_aranges sections.
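
A consumer might take advantage of this recommendation roughly as follows. This is a sketch under stated assumptions: the aranges table is assumed to have been parsed already into a dictionary keyed by the debug_info_offset field of each set header, and the function name ranges_for_cu is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

AddressRange = Tuple[int, int]  # (start address, length)


def ranges_for_cu(
    cu_offset: int,
    aranges: Dict[int, List[AddressRange]],
) -> Optional[List[AddressRange]]:
    """Look up a CU in an already-parsed .debug_aranges table.

    Returns the CU's ranges, which may be an empty list when the producer
    emitted a set with no entries, or None when the CU has no set at all.
    """
    # An empty list means "known to cover no addresses".
    # None means "no information": the consumer has to fall back to
    # scanning the CU's DIEs to find out which addresses it covers.
    return aranges.get(cu_offset)
```

The useful distinction is between the empty list and None: only a producer that emits a set for every compilation unit lets the consumer skip the fallback scan for units without code.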

Section names for non-ELF object files

Object file formats other than ELF have been used to contain DWARF sections, but conventions are needed to map the section names to these formats. Here are recommended mappings:

Mach-O

In the Mach-O object file format, debug info sections are located in the __DEBUG segment. By convention, section names start with a double underscore instead of the leading dot. Section names are truncated to 16 characters. Since Mach-O linkers do not link the debug information, there is no real use-case for the .dwo sections.

If I had to invent names for the .dwo sections I would replace the debug prefix with dwo. For example:

 1234567890123456             1234567890123456
 .debug_str_offsets.dwo ->    __dwo_str_offset

But, since no compiler implements DWO support for Mach-O (it is solving a problem that does not exist on the platform), I’m unsure whether we should standardize names at this point.
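
The Mach-O naming convention, together with the tentative .dwo naming floated above, can be sketched as follows. Both function names are hypothetical, and the .dwo mapping in particular is only the suggestion from this page, not something any compiler is known to implement.

```python
MACHO_SECTNAME_MAX = 16  # Mach-O section names hold at most 16 characters


def macho_section_name(elf_name: str) -> str:
    """Map an ELF-style DWARF section name to its Mach-O form."""
    if not elf_name.startswith("."):
        raise ValueError(f"expected a leading dot: {elf_name!r}")
    # Replace the leading dot with a double underscore, then truncate.
    return ("__" + elf_name[1:])[:MACHO_SECTNAME_MAX]


def macho_dwo_section_name(elf_name: str) -> str:
    """Hypothetical mapping for .dwo sections: 'debug' becomes 'dwo'."""
    prefix, suffix = ".debug_", ".dwo"
    if not (elf_name.startswith(prefix) and elf_name.endswith(suffix)):
        raise ValueError(f"not a .debug_*.dwo section name: {elf_name!r}")
    stem = elf_name[len(prefix):-len(suffix)]
    return ("__dwo_" + stem)[:MACHO_SECTNAME_MAX]
```

Under these assumptions, .debug_str_offsets maps to __debug_str_offs and .debug_str_offsets.dwo maps to __dwo_str_offset, matching the example above.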

COFF

For COFF, which is used on Windows, the situation is confusing. I’ve been told that the COFF linker truncates section names after 8 characters. Still, the file format can apparently hold more than that: compilers like clang use the full ELF section names, and this does not seem to be a problem. My best guess is that this works because all of the DWARF 2 section names are unique after 8 characters. Other compilers, like Microsoft’s Visual Studio, don’t emit DWARF but use their own CodeView debug file format. I’m unsure how to describe this situation in the text. Maybe we should pass.