C++ Exception Handling
The ISO C++ Standard describes how exceptions that occur while executing C++ code are to be handled. Protected code is contained in a "try" block, and code to handle any exception that might occur within this block is contained in a following "catch" block. There may be multiple exception handlers for a try block, each of which handles a different kind of exception.
Exceptions are generated by a "throw" statement. This may appear within the try block, but is more commonly found in a subroutine called from within the try block, or one of its descendants. Handling an exception involves unwinding the stack: returning from each subroutine between the point where the exception occurs and the matching handler, and exiting each scope along the way (including executing any destructors), until the matching handler is found and entered.
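The sequence described above can be observed in a small program. The following is only an illustrative sketch; the names Guard, inner, outer, run, and trace are invented for this example. An exception is thrown two calls below the try block, the destructors in each intervening frame run as the stack is unwound, and the first handler whose type matches is entered.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Records the order of events so the unwinding sequence can be inspected.
static std::vector<std::string> trace;

// An object whose destructor runs whenever its scope is exited,
// including when the exit is caused by an exception.
struct Guard {
    std::string name;
    explicit Guard(std::string n) : name(std::move(n)) {}
    ~Guard() { trace.push_back("destroy " + name); }
};

void inner() {
    Guard g("inner");
    throw std::runtime_error("failure in inner");  // no handler in this frame
}

void outer() {
    Guard g("outer");
    inner();  // the exception propagates through this frame
}

// Runs the protected code and returns the recorded events as one string.
std::string run() {
    trace.clear();
    try {
        outer();
    } catch (const std::logic_error&) {
        trace.push_back("logic_error handler");  // type does not match, skipped
    } catch (const std::runtime_error& e) {
        trace.push_back(std::string("runtime_error handler: ") + e.what());
    }
    std::string joined;
    for (const auto& s : trace) joined += s + "; ";
    return joined;
}
```

Calling run() yields "destroy inner; destroy outer; runtime_error handler: failure in inner; ", which shows both scopes being exited before the matching handler is entered.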
There are several ways to implement C++ exception handling. Some methodologies add executable code to save the current program state on entry to a try block. Other methodologies generate tables which record the addresses of try blocks and exception handlers.
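The first approach can be imitated by hand with the C library's setjmp and longjmp. This is a deliberately simplified sketch, not the code any particular compiler generates; saved_state, thrown_value, raise_error, worker, and protected_call are hypothetical names. It uses only trivially destructible objects, because longjmp does not run destructors.

```cpp
#include <csetjmp>

// State saved on entry to the protected region (the "try block").
static std::jmp_buf saved_state;

// Stands in for the exception object in this simplified scheme.
static int thrown_value = 0;

// Plays the role of "throw": record the value and restore the saved state.
void raise_error(int value) {
    thrown_value = value;
    std::longjmp(saved_state, 1);  // control resumes at the setjmp call below
}

// A subroutine called from within the protected region.
void worker(int input) {
    if (input < 0) raise_error(input);
}

// Returns 0 if worker completes normally, or the "thrown" value otherwise.
int protected_call(int input) {
    if (setjmp(saved_state) == 0) {
        // Normal path: the state has just been saved.
        worker(input);
        return 0;
    } else {
        // Handler path: reached only through longjmp.
        return thrown_value;
    }
}
```

Here protected_call(5) returns 0 and protected_call(-7) returns -7. The call to setjmp executes on every entry to the protected region, even when nothing is thrown. A table-based method avoids that cost on the normal path by moving the bookkeeping into static data.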
Other languages, such as Java, handle exceptions in a fashion which is similar to C++, and may use the same implementation, sometimes with minor changes.
The Itanium C++ ABI for IA-64: Exception Handling (no authors identified) describes a method for implementing exception handling (EH) as required by the ISO C++ Standard. Revision 1.22 of this document can be found at http://mentorembedded.github.io/cxx-abi/abi-eh.html. This EH methodology is described for the Intel Itanium (IA-64) processor, but has been adopted for use by other processors, including X86, PowerPC, and others.
The Itanium C++ ABI (no authors identified) describes an ABI for C++ programs running on the Itanium (IA-64) Architecture. Revision 1.86 of this document can be found at http://refspecs.linuxfoundation.org/cxxabi-1.86.html. It is not known if this is the most recent or authoritative version of this document. Chapter 4 of this document appears to refer to the above document, although the link is broken, as well as to an HP document, also by a broken link, but which may be available at http://mentorembedded.github.io/cxx-abi/exceptions.pdf.
The Linux Standard Base (http://www.linuxfoundation.org/collaborate/workgroups/lsb), published by The Linux Foundation, references the Itanium C++ ABI, Revision 1.86, as a base document.
The System V Application Binary Interface: AMD64 Architecture Processor Supplement, edited by Michael Matz, Jan Hubicka, Andreas Jaeger, and Mark Mitchell, describes an implementation for the AMD64 architecture. Draft version 0.99.5 of this document, dated September 3, 2010, can be found at http://www.docstoc.com/docs/53437568/System-VApplication-Binary-Interface-AMD64-Architecture-Processor. This document may include extensions to the Itanium C++ ABI, but is incomplete. For example, Section 3.7 "Stack Unwind Algorithm" describes some aspects of data encoding but does not describe an algorithm to unwind the stack. This algorithm, as well as other details, is in the C++ ABI document.
Details about the GCC implementation of C++ Exception Handling can be found in blog posts by Ian Lance Taylor.
The Itanium C++ ABI method uses static data to identify the exception handlers and describe how to unwind the stack frames as required to implement C++ exception handling. This data is saved in the .eh_frame section in the ELF object or executable file.
The format of the data in the ELF .eh_frame section is based on DWARF's Call Frame Information format, with additional information in the CFI augmentation fields. The AMD64 SVR4 ABI Supplement mentions DWARF Version 3, released in 2006. The other documents do not mention any specific version of the DWARF standard. It is unclear to what extent the implementations of this methodology track the current DWARF Standard description of Call Frame Information.
The DWARF Call Frame Information (CFI) is described in Section 6.4 of the DWARF Version 4 Standard. It consists of a Common Information Entry (CIE) data block for each compilation, followed by one or more Frame Description Entries (FDEs), one for each function in the compilation. This information allows a debugger to walk up the call stack from a given location in the code, identifying the place where each function is called, and perhaps displaying function arguments and data as they were at each call site.
Part of the process of handling an exception is to walk up the stack. It was natural to adapt the similar functionality of the DWARF CFI to handle C++ exception handling. The C++ ABI extends the DWARF CFI by adding an augmentation string to the CIE. Differing descriptions of the augmentation string used by the C++ ABI can be found in the AMD64 SVR4 ABI Supplement and in the blog posts.
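To make the CIE and FDE structure concrete, the following sketch walks the records in a buffer holding .eh_frame contents and extracts each CIE's augmentation string. It rests on several assumptions: the layout is taken as described in the Linux Standard Base (a 4-byte length, then a 4-byte field that is zero for a CIE and nonzero for an FDE, with a zero length ending the list), values are in host byte order, and the 64-bit extended length form is not handled. FrameRecord, read_u32, and scan_eh_frame are names invented for this example, not part of any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// One entry found in the buffer (hypothetical type for this sketch).
struct FrameRecord {
    bool is_cie;               // true for a CIE, false for an FDE
    std::size_t offset;        // offset of the record's length field
    std::string augmentation;  // augmentation string (CIE only)
};

// Reads a 32-bit value in host byte order without alignment assumptions.
static std::uint32_t read_u32(const std::uint8_t* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Walks the records, stopping at the terminator or at anything malformed.
std::vector<FrameRecord> scan_eh_frame(const std::uint8_t* data, std::size_t size) {
    std::vector<FrameRecord> out;
    std::size_t pos = 0;
    while (size - pos >= 4) {
        const std::uint32_t length = read_u32(data + pos);
        if (length == 0) break;                            // terminator
        if (length == 0xffffffffu) break;                  // 64-bit form: not handled
        if (length < 4 || length > size - pos - 4) break;  // malformed record
        const std::uint8_t* body = data + pos + 4;
        const std::uint32_t id = read_u32(body);           // 0 marks a CIE
        FrameRecord rec{id == 0, pos, {}};
        if (rec.is_cie) {
            // CIE body: id (4 bytes), version (1 byte), NUL-terminated string.
            for (std::size_t i = 5; i < length && body[i] != 0; ++i)
                rec.augmentation.push_back(static_cast<char>(body[i]));
        }
        out.push_back(rec);
        pos += 4 + static_cast<std::size_t>(length);
    }
    return out;
}
```

A real unwinder would go on to decode the remaining CIE fields and the FDE address ranges. This sketch stops at classification, which is enough to show where the augmentation string sits.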
Relationship with DWARF
Although the C++ ABI data in the .eh_frame section uses the data format described by the DWARF Standard (with some extensions), neither this section nor the other sections used by exception handling, such as .eh_frame_hdr and .gcc_except_table, are defined by the DWARF Standard. The DWARF Standard does not describe the extensions to support exception handling nor the routines which must be called by a program to use this data. The DWARF Debugging Format Committee does not specify the contents of these sections or the functionality which must be provided by the language run time system to support exception handling.
The .eh_frame section is not used for debugging. Whether it is generated or not is independent of whether DWARF debug data is generated. All DWARF data is contained in sections with names starting with .debug, which may be removed from a program without affecting the program's normal execution. It is common practice to "strip" debugging sections from a program before putting it into production, to reduce the program size, to make reverse engineering more difficult, or both.
Removing the .eh_frame section (whether the DWARF .debug sections are left in place or not) has a high likelihood of adversely affecting a program's behaviour, especially when it encounters an unexpected condition. Generally, if a compiler does not support generation of DWARF debug data, it will not support generation of .eh_frame data for exception handling. The converse is not true: if a compiler generates DWARF debug data, it may or may not implement the methodology described above.
Unfortunately, it has been a common shorthand to refer to the C++ ABI exception handling methodology using .eh_frame as "DWARF exception handling," or similar phrases. Perhaps this is because it is easier to say this than the unwieldy "C++ exception handling using the DWARF Call Frame Information format with extensions", or the misleading "C++ ABI for IA-64" or "SVR4 ABI AMD64 Processor Supplement", especially when discussing a processor other than Itanium or AMD64. This leads to occasional confusion, where people may look at the DWARF Specification for a description of the C++ ABI exception handling method, or where vulnerabilities in the EH scheme are incorrectly characterized as DWARF vulnerabilities, as in the otherwise excellent paper mentioned below.
Exception Handling Exploits
A 2011 Dartmouth College Technical Report by James Oakley titled "Exploiting the Hard-Working DWARF: Trojan and Exploit Techniques Without Native Executable Code" (available at http://www.cs.dartmouth.edu/cms_file/SYS_techReport/559/TR2011-688.pdf) describes ways in which an attacker can modify a program's exception handling tables to execute arbitrary programs. In brief, a program location in the CFI is described using a DWARF Location Expression, which may be a DWARF expression written in DWARF byte code. A malicious attacker who can modify an executable can insert an arbitrary program into the exception handler tables and perform essentially any operation permitted by the run time byte code interpreter.
As the paper mentions, virus and trojan detectors may look through the executable code for the signatures characteristic of these attacks, while overlooking an exploit in the exception handler tables. Additionally, the run time byte code interpreter may permit operations which would not reasonably be performed during handling of an exception.
Unfortunately, the paper conflates the DWARF debug information, used to communicate between a producer (the compiler) and a consumer (the debugger), with the exception handling tables. There is no producer-consumer relationship in the exception handling data, any more than there is a producer-consumer model in a program which prints "Hello World!" The EH data is required for normal execution of a program, not for debugging the program. Where the paper criticizes DWARF for not limiting stack unwinding (p. 12), or mentions shortcomings or vulnerabilities in the use of DWARF, it misses its target. The target should be the C++ ABI implementation, not the DWARF Debugging Format.
The DWARF Committee would agree with the conclusions in this paper that virus and trojan checkers should inspect the exception handling tables for the kinds of exploits described in the paper. We would also suggest that the run time routines should take steps to prevent the execution of arbitrary location expressions which perform operations which would not normally occur during exception handling. Additional steps may be taken to prevent the undetected modification of the exception handler tables by an attacker.
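One way a run time routine might restrict expression evaluation is with an allow-list of opcodes and a bound on the work performed. The sketch below illustrates that idea only; it does not describe how any existing unwinder behaves, and eval_restricted and its max_steps parameter are hypothetical. The opcode values are those assigned by the DWARF Standard. Only constants and simple arithmetic are accepted; everything else, including memory dereference and branch operations, is rejected.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Opcode values as assigned by the DWARF Standard.
constexpr std::uint8_t DW_OP_const1u = 0x08;
constexpr std::uint8_t DW_OP_minus   = 0x1c;
constexpr std::uint8_t DW_OP_mul     = 0x1e;
constexpr std::uint8_t DW_OP_plus    = 0x22;
constexpr std::uint8_t DW_OP_lit0    = 0x30;
constexpr std::uint8_t DW_OP_lit31   = 0x4f;

// Evaluates an expression using only the allow-listed opcodes above.
// Returns false if the expression is malformed, uses any other opcode,
// or exceeds max_steps operations; otherwise stores the top of the
// stack in result and returns true.
bool eval_restricted(const std::vector<std::uint8_t>& code,
                     std::uint64_t& result,
                     std::size_t max_steps = 64) {
    std::vector<std::uint64_t> stack;
    std::size_t pc = 0;
    for (std::size_t steps = 0; pc < code.size(); ++steps) {
        if (steps >= max_steps) return false;      // work budget exhausted
        const std::uint8_t op = code[pc++];
        if (op >= DW_OP_lit0 && op <= DW_OP_lit31) {
            stack.push_back(static_cast<std::uint64_t>(op - DW_OP_lit0));
        } else if (op == DW_OP_const1u) {
            if (pc >= code.size()) return false;   // missing operand byte
            stack.push_back(code[pc++]);
        } else if (op == DW_OP_plus || op == DW_OP_minus || op == DW_OP_mul) {
            if (stack.size() < 2) return false;    // stack underflow
            const std::uint64_t top = stack.back(); stack.pop_back();
            const std::uint64_t second = stack.back(); stack.pop_back();
            if (op == DW_OP_plus)       stack.push_back(second + top);
            else if (op == DW_OP_minus) stack.push_back(second - top);
            else                        stack.push_back(second * top);
        } else {
            return false;  // not on the allow-list (e.g. DW_OP_deref, DW_OP_bra)
        }
    }
    if (stack.empty()) return false;
    result = stack.back();
    return true;
}
```

For example, the byte sequence 0x32 0x33 0x22 0x08 0x0a 0x1e (push 2, push 3, add, push 10, multiply) evaluates to 50, while any sequence containing 0x06 (DW_OP_deref) is refused. Which operations are reasonable during exception handling is a policy decision for the ABI implementation. The allow-list here is chosen only to keep the example short.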
Debugging Data Exploits
One question which might be asked is whether the DWARF debugging data, contained in .debug sections, is subject to the same exploits as described in the paper. The answer is yes, a malicious attacker could modify the location expressions in the debugging data in a similar fashion. There is, however, a significant caveat.
When running a program under a debugger, the user has extensive control over the operation of the test program. This is not a normal program execution environment, but one in which the program under test is expected to act in ways which were not intended by the programmer. A debugger can change the values of variables used by a program, alter the program flow, and, in many cases, call arbitrary system functions or execute arbitrary code. A virus inserted into a location expression in the debug data would have many of the same abilities.
Running a program under the control of a debugger is inherently a risky activity, which should be performed with at least a minimal awareness that the execution environment for the program is significantly different from and much more permissive than a normal execution environment. Running a program from an unknown or untrusted source under a debugger is inadvisable. Fortunately, the most common mode of using a debugger is to build a program from source in a known environment before running it under a debugger. This eliminates the potential for an attacker to modify the DWARF debug data and insert a virus. Of course, this doesn't prevent second order attacks, where an attacker inserts a virus into a compiler which then inserts the virus into the debugging data of a program it compiles, although this seems to be an unlikely scenario.
-- Michael Eager, June 23, 2013