Two-Level Line Number Tables
Last updated December 12, 2014
Inlined call information is currently represented in DWARF-4 in the DIE structure, as a DW_TAG_inlined_subroutine DIE representing the inlined function under another DIE representing the calling function. The DW_TAG_inlined_subroutine DIE contains a DW_AT_abstract_origin attribute that refers to a top-level DW_TAG_subprogram DIE that provides the name and declaration coordinates for the inlined function. For each instance of an inlined call, there is one DW_TAG_inlined_subroutine DIE, which contains DW_AT_ranges (or DW_AT_low_pc/high_pc) attributes that identify the instructions that correspond to the inlined function.
To generate a symbolic backtrace across inline calls using DWARF-4, it is currently necessary to consult the line number table to map an instruction to the file and line number of the inlined subprogram, then consult the DIE tree to identify the particular instruction as part of an inlined subroutine. Once the DW_TAG_inlined_subroutine DIE is located, the name of the inlined function and the source coordinates of the point of call can be determined.
Because inlining typically happens during optimization, and optimization tends to schedule code so that instructions for different source statements are heavily interleaved, the range tables for inlined calls can be quite inefficient (in space and lookup time). Requiring a symbolizer to search the DIE tree for inlined function instances also introduces a significant inefficiency, since there is no hint in the line number table that any given instruction does or does not correspond with an inlined function.
On HP-UX, inlined call information is represented directly in the line number tables, using a two-level scheme. The top level, called the logicals table, is a line table with one row for each instance of a logical statement in the program. In this table, each row provides a filename, line number, and recommended breakpoint location. For statements that are part of an inlined function, a “context” column provides a reference to the row representing the point of call, and a “function name” column provides a pointer to a string table entry giving the name of the inlined function. The second level, called the actuals table, is a line table with one row for each machine instruction in the program. In this table, each row provides the instruction address and a reference to the row in the logicals table that describes the statement associated with that instruction. Both tables are encoded by the standard DWARF line number table scheme, with the exception that some of the standard opcodes are defined slightly differently for the two tables.
The Basic Proposal
The basic idea is to split the line number table into two parts: a “logicals” table, and an “actuals” table. The logicals table would contain a row for each logical statement in the program, mapping each statement to a recommended breakpoint location. The actuals table would contain a row for each machine instruction, mapping each instruction to a row in the logicals table. The two tables would reside in the same .debug_line section, share the same header, and use the same encoding. The line number program header would be extended with a pointer to the actuals table, which would follow the logicals table. The actuals table would be optional, and if absent, the logicals table would degrade gracefully to the single-level line number table from DWARF-4.
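The relationship between the two tables can be illustrated with a minimal Python sketch. The class names, field names, and the lookup_line helper are hypothetical and chosen only for illustration; the sketch ignores sequences and end_sequence handling, and it uses a linear scan rather than an efficient search.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LogicalRow:
    address: int  # recommended breakpoint location
    file: int
    line: int


@dataclass
class ActualRow:
    address: int      # machine instruction address
    logical_row: int  # 1-based row number in the logicals table


def lookup_line(pc: int,
                logicals: List[LogicalRow],
                actuals: Optional[List[ActualRow]] = None
                ) -> Optional[Tuple[int, int]]:
    """Return (file, line) for pc, or None if pc precedes every row."""
    if actuals is None:
        # No actuals table: read the logicals table as a single-level
        # line table, as in DWARF-4.
        candidates = [r for r in logicals if r.address <= pc]
        if not candidates:
            return None
        row = max(candidates, key=lambda r: r.address)
    else:
        # Two-level lookup: instruction -> actuals row -> logicals row.
        hits = [a for a in actuals if a.address <= pc]
        if not hits:
            return None
        actual = max(hits, key=lambda a: a.address)
        row = logicals[actual.logical_row - 1]
    return (row.file, row.line)
```

With an actuals table present, the instruction address selects an actuals row, which in turn selects the logical statement. Without one, the same function falls back to reading the logicals table directly.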
We also add a subprograms list to the line number program header, following the lists of directories and file names. Like those lists, the subprograms list has a customizable format that allows the producer to provide additional information about each subprogram (e.g., declaration coordinates). Each row in the logicals table may reference an entry in the subprograms list.
The DW_TAG_inlined_subroutine DIEs are still necessary in order for the debugger to identify functions that have been inlined into others, and the range information provided by DW_AT_ranges or DW_AT_low_pc/high_pc may still be useful to debuggers. The DW_AT_call_file, DW_AT_call_line, and DW_AT_call_column attributes may be omitted.
The Logicals Table
The logicals table corresponds most closely to the DWARF-4 line number table, and the DW_AT_stmt_list attribute in the compilation unit DIE points to it. This table is used directly for mapping source locations (“logical statements”) onto recommended breakpoint locations. It contains the following columns:
The “basic_block” and “isa” registers are unused in the logicals table. Because this table represents single locations rather than address ranges, the “end_sequence” register is also unused in the logicals table.
Each row with “is_stmt” true corresponds to a recommended breakpoint location for a source statement. Rows with “is_stmt” false correspond to prologue_end and epilogue_begin points or to logical positions in the source where a non-zero discriminator is required. The table is ordered by address, and rows are implicitly numbered starting from 1.
The “context” column is new, and is used to represent inlined functions. When it is non-zero, the row describes a logical statement that is part of an inlined function, and the value in the “context” column refers to another row number in the logicals table. That row represents the logical statement where the inlined call was made (which may itself be part of another inlined function).
The “subprogram” column is new, and is used to provide information about the subprogram corresponding to the logical statement. When it is non-zero, it refers to an entry in a subprograms list in the line number program header. Entries in the subprograms list provide the name and optional related information, as described by the header. Typically, each entry in the list would provide the file and line number where the subprogram was defined (i.e., the DW_AT_decl_file and DW_AT_decl_line attributes from the subprogram DIE).
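As an illustration of how a symbolizer might use the “context” and “subprogram” columns, the following Python sketch walks the context chain from a logical row out to the outermost caller. The LogicalRow class, the inline_frames function, and the representation of the subprograms list as a list of names are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LogicalRow:
    file: str
    line: int
    context: int = 0     # 0, or the 1-based row number of the point of call
    subprogram: int = 0  # 0, or a 1-based index into the subprograms list


def inline_frames(logicals: List[LogicalRow],
                  subprograms: List[str],
                  row_number: int
                  ) -> List[Tuple[Optional[str], str, int]]:
    """Return (name, file, line) frames, innermost first."""
    frames = []
    seen = set()
    while row_number != 0:
        if row_number in seen:
            raise ValueError("cycle in context chain")
        seen.add(row_number)
        row = logicals[row_number - 1]
        # A zero subprogram index means the table gives no name for this
        # row; the caller would have to find the function by other means.
        name = subprograms[row.subprogram - 1] if row.subprogram else None
        frames.append((name, row.file, row.line))
        row_number = row.context
    return frames
```

For a row in a function inlined two levels deep, the result lists the innermost inlined function first, then its caller, and finally the row for the outermost (non-inlined) statement.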
When a statement in the source program is replicated in the generated code (e.g., via loop unrolling), each replication is represented with a separate row in the logicals table, so that the debugger can easily find all breakpoint locations for a given statement.
One new standard opcode is needed to set the “context” and “subprogram” registers: DW_LNS_inlined_call takes two unsigned LEB128 numbers as operands. It sets the “context” register to the value of the first operand, and the “subprogram” register to the value of the second operand.
A second new standard opcode, DW_LNS_pop_context, can be used to restore the state machine registers to the values from the logical row referred to by the current value of the “context” register. (This models a return from an inlined call.) This opcode takes no operands.
At the beginning of each sequence, the “context” and “subprogram” registers are both set to zero. The special opcodes and DW_LNS_copy do not alter these two new registers.
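The effect of the two new opcodes on the state machine can be sketched as follows. This is a simplified model, not an encoding: opcodes are written as symbolic Python tuples because no opcode numbers are assigned here, and only a few registers are modeled. It also assumes that DW_LNS_pop_context restores “file”, “line”, “context”, and “subprogram” while leaving “address” unchanged, which is one possible reading of the description above.

```python
def run_logicals(program):
    """Run a symbolic line number program and return the logicals rows."""
    regs = {"address": 0, "file": 1, "line": 1, "context": 0, "subprogram": 0}
    rows = []
    for op, *args in program:
        if op == "advance_pc":
            regs["address"] += args[0]
        elif op == "advance_line":
            regs["line"] += args[0]
        elif op == "set_file":
            regs["file"] = args[0]
        elif op == "inlined_call":
            # DW_LNS_inlined_call: first operand -> context,
            # second operand -> subprogram.
            regs["context"], regs["subprogram"] = args
        elif op == "pop_context":
            # DW_LNS_pop_context: return to the row of the point of call.
            if regs["context"] == 0:
                raise ValueError("pop_context outside an inlined call")
            caller = rows[regs["context"] - 1]
            for name in ("file", "line", "context", "subprogram"):
                regs[name] = caller[name]
        elif op == "copy":
            # DW_LNS_copy appends a row; context and subprogram persist.
            rows.append(dict(regs))
        else:
            raise ValueError("unknown opcode: " + op)
    return rows


program = [
    ("advance_line", 9), ("copy",),                  # row 1: call site
    ("inlined_call", 1, 1), ("set_file", 2),
    ("advance_line", -7), ("advance_pc", 4), ("copy",),  # row 2: inlined body
    ("pop_context",), ("advance_line", 1),
    ("advance_pc", 4), ("copy",),                    # row 3: back in caller
]
rows = run_logicals(program)
```

In this example, row 2 records a context of 1 and a subprogram of 1, and row 3 returns to the caller's file with both registers back at zero.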
The Actuals Table
The actuals table is used to map individual machine instructions to logical statements. On HP-UX, this table is placed in a separate section with its own line number table program header, but we could instead place it in .debug_line immediately following the logicals table, and add another item to the line number program header that provides the offset to the actuals table. The actuals table contains the following columns:
The “logical_row” column is populated from the “line” register. The “file”, “column”, “is_stmt”, “prologue_end”, “epilogue_begin”, and “context” registers are ignored for this table.
Each row in the actuals table corresponds to a machine instruction, and the table is ordered by address. Where multiple consecutive rows differ only in the address column, only the first row is represented in the table. A row where “basic_block” is true may not be omitted; however, if one row has “basic_block” true and subsequent rows have “basic_block” false, the subsequent rows may be omitted from the table.
When a single machine instruction corresponds to more than one source statement (e.g., due to optimizations such as common subexpression elimination), a separate row for the same address is added to the actuals table for each statement. These consecutive rows are then treated as a single row designating a set of logical statements that are associated with the instruction at that address. (The “logical_row” register and column therefore hold a set of values rather than a single value.) If the rows for subsequent machine instructions are omitted, those subsequent instructions are also associated with the same set of logical statements.
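A consumer might handle both rules (merging rows that share an address, and inheriting from the preceding row when rows are omitted) along the lines of the following Python sketch. The function names and the (address, logical_row) tuple representation are hypothetical, and the sketch ignores sequence boundaries.

```python
from bisect import bisect_right


def build_index(actuals):
    """Group actuals rows, given as (address, logical_row) pairs sorted
    by address, into parallel lists of addresses and logical row sets."""
    addresses, row_sets = [], []
    for address, logical_row in actuals:
        if addresses and addresses[-1] == address:
            # Same address as the previous row: extend its set.
            row_sets[-1].add(logical_row)
        else:
            addresses.append(address)
            row_sets.append({logical_row})
    return addresses, row_sets


def logical_rows_for(pc, index):
    """Return the set of logical rows associated with the instruction at pc."""
    addresses, row_sets = index
    # Omitted rows inherit from the closest preceding row in the table.
    i = bisect_right(addresses, pc) - 1
    return row_sets[i] if i >= 0 else set()


index = build_index([(0x100, 1), (0x104, 2), (0x104, 3), (0x110, 4)])
```

Here the instruction at 0x104 is associated with logical rows 2 and 3, and so is an instruction at 0x108 whose own row was omitted.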
All the existing opcodes that modify the “line” register can be used to set “logical_row”.
One new standard opcode, DW_LNS_set_address_from_logical, can be used to set the “address” register to the value of the “address” column from the logicals table row referred to by the current value of the “line” register. This opcode takes one operand, and works like DW_LNS_advance_line, with the additional side effect of setting the “address” register.
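One possible reading of DW_LNS_set_address_from_logical is sketched below in Python: the operand is first added to the “line” register, as DW_LNS_advance_line would do, and the address is then copied from the logicals row that the updated register names. The ordering of these two steps, the dictionary representation of the registers, and the bounds check are assumptions of this sketch.

```python
def set_address_from_logical(regs, logicals, delta):
    """Advance the line register by delta, then load the address register
    from the logicals row that the new line value refers to."""
    new_line = regs["line"] + delta
    if not 1 <= new_line <= len(logicals):
        raise ValueError("logical row out of range: %d" % new_line)
    regs["line"] = new_line
    regs["address"] = logicals[new_line - 1]["address"]
    return regs


logicals = [{"address": 0x400}, {"address": 0x410}, {"address": 0x40C}]
regs = {"line": 1, "address": 0}
set_address_from_logical(regs, logicals, 2)  # line becomes 3
```

This lets a producer position the actuals state machine at the breakpoint address of a logical row without encoding the address a second time.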
The two-level line table is also a convenient mechanism for supporting additional optimizations such as software pipelining, for supporting accelerated single stepping, and for supporting checkpoint-based debugging.
Support for Software Pipelining
With software pipelining, instructions in a loop are often scheduled one or more iterations ahead (or behind). It can be useful to a debugger (and the user) if we can tag such instructions. To support this, we add a new “iteration” register and column to the actuals table.
When loop prologue code contains instructions generated for a statement within the loop body, the “iteration” register may be set to 1 to indicate that the instruction logically belongs with the first iteration of the loop. Likewise, instructions within the loop body that logically belong to the next iteration would be tagged with an “iteration” of 1. (Higher values can be used when scheduling instructions more than one iteration ahead.) Instructions within the loop body (or epilogue) that logically belong to the previous (or last) iteration can be tagged with an “iteration” of -1. (Lower values can be used when scheduling instructions more than one iteration behind.)
One new standard or extended opcode is required: DW_LNS/DW_LNE_set_iteration, which takes an unsigned LEB128 number as an operand. It sets the “iteration” register to the value of its operand. [Because the “column” register is not used in the actuals table, we could alias DW_LNS_set_column and DW_LNS_set_iteration, giving an efficient means of setting the new register without allocating an extra standard opcode.]
At the beginning of each sequence, and whenever a row is added to the actuals table (via a special opcode or by DW_LNS_copy), the “iteration” register is reset to 0. (These are the same rules as for the “discriminator” register.)
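The reset rule can be made concrete with a small Python model of the actuals state machine. The symbolic opcode names and the run_actuals function are hypothetical, and only three registers are modeled.

```python
def run_actuals(program):
    """Run a symbolic actuals program, modeling the iteration register."""
    regs = {"address": 0, "logical_row": 1, "iteration": 0}
    rows = []
    for op, *args in program:
        if op == "advance_pc":
            regs["address"] += args[0]
        elif op == "set_logical_row":
            regs["logical_row"] = args[0]
        elif op == "set_iteration":
            regs["iteration"] = args[0]
        elif op == "copy":
            rows.append(dict(regs))
            # Like the discriminator register, iteration is reset to 0
            # each time a row is added to the table.
            regs["iteration"] = 0
        else:
            raise ValueError("unknown opcode: " + op)
    return rows


rows = run_actuals([
    ("set_iteration", 1), ("copy",),  # instruction hoisted from next iteration
    ("advance_pc", 4), ("copy",),     # iteration is back to 0 here
])
```

Because of the reset, a producer only has to emit the opcode for instructions that were actually moved across iterations.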
Enhanced Support for Single-Stepping
In order to single step by source statement, a debugger typically single steps by machine instructions until the “file” or “line” column changes. In optimized code, this technique can cause the debugger to stop early, and can often cause the debugger to “hop” between two source lines several times. We can improve this by providing enough information for the debugger to determine the set of possible next breakpoint locations, so that it can set temporary breakpoints at each location, then free run until hitting one. To support this, we add a new “next” register and column to the logicals table. This register can hold a set of row numbers (unlike other registers, which only hold a single value). The row numbers in the set refer to other rows in the logicals table, and each such row represents one of the possible breakpoint locations that can be reached after single stepping over the current statement.
Most statements will fall through unconditionally to the next logical statement (i.e., the next row in the logicals table where “is_stmt” is true), and the “next” register can contain a special value for this case. At the beginning of each sequence, and whenever a row is added to the logicals table, the “next” register is set to this special value.
Some statements may have one or more alternate locations that may be reached conditionally, in addition to the fall-through case (e.g., if-then-else and switch statements). For these cases, we provide a new standard opcode, DW_LNS_append_next, which takes an unsigned LEB128 number as its operand. It appends the operand’s value to the “next” register, without removing the previous contents of the set. [The “isa” register is not used in the logicals table, so we could alias DW_LNS_set_isa and DW_LNS_append_next, avoiding the allocation of an extra standard opcode.]
Some statements may never fall through (e.g., goto statements and ends of loops), so we also provide a new standard opcode, DW_LNS_clear_fall_through, which removes the special fall-through value from the “next” register. [The “basic_block” register is not used in the logicals table, so we could alias DW_LNS_basic_block and DW_LNS_clear_fall_through.]
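The behavior of the “next” register under these two opcodes can be sketched in Python as follows. The FALL_THROUGH sentinel, the symbolic opcode names, and the run_next_ops function are hypothetical; the proposal does not say how the special fall-through value is represented.

```python
FALL_THROUGH = "fall-through"  # hypothetical stand-in for the special value


def run_next_ops(program):
    """Return the contents of the next register recorded for each row."""
    next_set = {FALL_THROUGH}
    rows = []
    for op, *args in program:
        if op == "append_next":
            # DW_LNS_append_next: add a row number, keep existing members.
            next_set.add(args[0])
        elif op == "clear_fall_through":
            # DW_LNS_clear_fall_through: drop only the special value.
            next_set.discard(FALL_THROUGH)
        elif op == "copy":
            rows.append(frozenset(next_set))
            # The register returns to the special value after each row.
            next_set = {FALL_THROUGH}
        else:
            raise ValueError("unknown opcode: " + op)
    return rows


rows = run_next_ops([
    ("copy",),                                               # plain statement
    ("append_next", 4), ("copy",),                           # conditional branch
    ("clear_fall_through",), ("append_next", 2), ("copy",),  # end of loop
])
```

The three rows model an ordinary statement, a conditional that may either fall through or reach row 4, and a loop end that can only reach row 2.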
When a statement contains one or more non-inlined function calls, the next breakpoint location in the called function(s) is not necessarily in the same logicals table, and cannot be represented in the “next” register. Instead, we add a row to the logicals table for each call instruction, and add those row numbers to the “next” register, in addition to the locations that may be reached by stepping over the call. These new rows that can be reached via “step-into” will have “is_stmt” false, while rows that can be reached via “step-over” will have “is_stmt” true.
When a statement contains one or more inlined function calls, the compiler should add all locations that may be reached via “step-into” or “step-over”. To step over a statement that may have inlined calls, the debugger can simply ignore rows whose “context” value points back to the current row.
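A debugger could derive its temporary breakpoint set from the “next” column roughly as in the following Python sketch. The dictionary layout of a row, the use of 0 as the fall-through marker (possible in this model only because rows are numbered from 1), and the step_targets function are assumptions made for illustration.

```python
FALL_THROUGH = 0  # hypothetical marker; valid row numbers start at 1


def step_targets(logicals, current, step_into=False):
    """Return the logical row numbers where a step from row `current`
    may stop. Each row is a dict with is_stmt, context, and next keys."""
    targets = set()
    for n in logicals[current - 1]["next"]:
        if n == FALL_THROUGH:
            # Resolve to the next row in the table with is_stmt true.
            n = next((i for i in range(current + 1, len(logicals) + 1)
                      if logicals[i - 1]["is_stmt"]), None)
            if n is None:
                continue
        row = logicals[n - 1]
        if not step_into:
            # Step-over skips call rows (is_stmt false) and rows inlined
            # at the current statement (context points back to it).
            if not row["is_stmt"] or row["context"] == current:
                continue
        targets.add(n)
    return targets


logicals = [
    {"is_stmt": True,  "context": 0, "next": {FALL_THROUGH, 3}},  # inlined call
    {"is_stmt": True,  "context": 1, "next": {FALL_THROUGH}},     # inlined body
    {"is_stmt": True,  "context": 0, "next": {FALL_THROUGH, 4}},  # real call
    {"is_stmt": False, "context": 0, "next": {FALL_THROUGH}},     # call row
    {"is_stmt": True,  "context": 0, "next": {FALL_THROUGH}},
]
```

Stepping over row 1 yields only row 3, while stepping into it also yields row 2, the first statement of the inlined body. The debugger would then plant temporary breakpoints at the addresses of the returned rows and let the process run.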
Support for Checkpoint-Based Debugging
With the current line number tables, single-stepping or moving from one breakpoint to the next requires that each logical statement have its own address, so that the debugger can execute at least one machine instruction per statement. An alternative approach is simulated execution, where a sequence of statements may all share a common recommended breakpoint location (a “checkpoint”), and single-stepping through the sequence results in no activity in the inferior process. In order to support this kind of debugging scenario, the location lists for each variable modified by the statements in the sequence must be able to select a different DWARF expression based on the logical row number rather than the current PC address. In addition, each logical row must indicate whether a single step from that row can be performed via simulation (i.e., by simply changing the current logical row), or by actually executing instructions in the inferior process.