C++0x: New string literals

From Dwarf Wiki
Jump to: navigation, search

For detail description of the feature, please refer to:

http://en.wikipedia.org/wiki/C%2B%2B0x#New_string_literals

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2442.htm


Overview

C++0x will support three Unicode encodings: UTF-8, UTF-16 and UTF-32.

char has a size big enough to hold a UTF-8 character and can be used to hold UTF-8 characters.

Two new types are also introduced:

  • char16_t has a size big enough to hold a UTF-16 character and can be used to hold UTF-16 characters.
  • char32_t has a size big enough to hold a UTF-32 character and can be used to hold UTF-32 characters.


Proposed change to DWARF

Purpose

  • New base type debug information entries are required to represent char16_t and char32_t.
  • New debug information is needed to describe the encoding used for the character type.
  • This proposal describes new base type encodings for the UTF8/16/32 encoding.


7.8 Base Type Encodings

DW_ATE_utf 0x10


5.1: Base Type Entries

For example, the C++ type char16_t is represented by a base type entry with a name attribute whose value is "char16_t", an encoding attribute whose value is DW_ATE_utf and a byte size attribute whose value is 2.


Appendix

D.9 UTF character type examples

char16_t chr_a = u'h';
char32_t chr_b = U'h';

1$:  DW_TAG_base_type
         DW_AT_name("char16_t")
         DW_AT_encoding(DW_ATE_utf)
         DW_AT_byte_size(2)
2$:  DW_TAG_base_type
         DW_AT_name("char32_t")
         DW_AT_encoding(DW_ATE_utf)
         DW_AT_byte_size(4)
3$:  DW_TAG_variable
         DW_AT_name("chr_a")
         DW_AT_type(reference to 1$)
4$:  DW_TAG_variable
         DW_AT_name("chr_b")
         DW_AT_type(reference to 2$)


Change History

April 20, 2009.

  • Re-format DWARF description in Example section to match current documentation.