Symbol Variants: Why Those Dollar Signs?
Since OS X v10.4, new symbols that contain dollar signs ($) have been added to the system library, libSystem.dylib. This release note explains why they are there and how a developer might want to take advantage of them.
Software Evolution vs. Backward Compatibility
Software is always changing. New demands require new features to be implemented. Bugs are discovered and fixed. Hardware and lower layers of the OS change, sometimes requiring upper layers to adapt.
For a commercial operating system like OS X, third-party software expects that the system software will always act the same way. This backward compatibility prolongs the users’ investment in software and fosters the notion of greater stability in the system.
The need to evolve the software often conflicts with the desire to provide backward compatibility, but innovative solutions can allow both. If you remember the (classic) Mac OS before OS X, a Classic environment was created specifically to run old applications on OS X. Similarly, the transition to Intel-based hardware spurred the creation of the Rosetta dynamic translation software to provide compatibility with PowerPC-based applications.
Early in the OS X v10.4 timeframe, two major software initiatives required an equally innovative solution to maintain backward compatibility. First, to be certified for UNIX™ conformance, hundreds of system routines needed to be modified. Some changes were considered just bug fixes, but many of the changes required significant change in behavior to the largely BSD-style routines. These changes would surely break existing applications.
Secondly, up until v10.4, there was no support for real (PowerPC) long double floating-point numbers (or more correctly, the long double type was the same size as the regular 64-bit double floating-point type). Support for a real, 128-bit long double type would mean incompatible API changes. Code compiled for one type of long double would crash or produce incorrect results when using routines for the other long double type.
One way to allow routines to behave in new ways for new code but maintain the legacy behavior for previously compiled code is to use symbol versioning. In symbol versioning, different code can have the same symbol name, but have different version numbers. Unfortunately, this would require lots of changes to the compiler, linker and binary file format; a major undertaking that would have to happen before the real work could even be started.
So as an alternative, a feature of the gcc compiler was used: the __asm command can be used to rename a symbol. A special suffix was added to the symbol, and by using different suffixes, we can generate families of symbol variants.
That is where the dollar sign comes in. It is used as a separator between the real symbol name and the variant name. Since a dollar sign can't occur in identifiers in standard C code, this avoids the possibility of symbol name collision.
This is what it looks like:
% nm /usr/lib/libSystem.dylib
00086e9f T _fputs
0003a14f T _fputs$UNIX2003
By OS X convention, all variable and routine names are automatically prefixed with an underscore. The _fputs symbol is the legacy variant, while the _fputs$UNIX2003 symbol is the new, UNIX™ conforming one. All previously built programs know only about the _fputs symbol, and will continue to use it and get legacy behavior, while new code can link to the new _fputs$UNIX2003 symbol and get UNIX™ conforming behavior.
A symbol may have more than one suffix. For instance:
000a6905 T _ftw
000a6603 T _ftw$INODE64$UNIX2003
000a6dc7 T _ftw$UNIX2003
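The renaming mechanism described above can be sketched in a few lines of C. This is only an illustration, not the system library's code: the demo_scale routine, the $DEMO2003 suffix, and the DEMO_* macros are all hypothetical. The sketch assumes a GCC-compatible compiler, which predefines __USER_LABEL_PREFIX__ (an underscore on OS X, empty on most ELF systems).

```c
/* Two-step stringification so __USER_LABEL_PREFIX__ is expanded
   before it is turned into a string literal. */
#define DEMO_STR2(x) #x
#define DEMO_STR(x) DEMO_STR2(x)

/* Assembler-level name: platform prefix + base name + optional suffix.
   All DEMO_* names are hypothetical, for illustration only. */
#define DEMO_SYMBOL(base, suffix) \
    __asm(DEMO_STR(__USER_LABEL_PREFIX__) base suffix)

/* One routine, two variants. The C identifiers differ, but the
   emitted symbols share the base name demo_scale. */
int demo_scale_legacy(int x) DEMO_SYMBOL("demo_scale", "");
int demo_scale_conforming(int x) DEMO_SYMBOL("demo_scale", "$DEMO2003");

/* Legacy behavior keeps the unadorned symbol name. */
int demo_scale_legacy(int x) { return x * 2; }

/* New behavior is emitted under the suffixed symbol name. */
int demo_scale_conforming(int x) { return x * 3; }
```

Running nm on the resulting object file should list both the unadorned demo_scale symbol and the demo_scale$DEMO2003 variant, in the same way the fputs and ftw families appear above.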
Prototypes in Header Files
It is in the header files that the real magic occurs. For instance, to generate the _fputs$UNIX2003 symbol, we would need something like:
int fputs(const char * __restrict, FILE * __restrict) __asm("_fputs$UNIX2003");
In <stdio.h>, the actual prototype looks like:
int fputs(const char * __restrict, FILE * __restrict) __DARWIN_ALIAS(fputs);
The __DARWIN_ALIAS macro resolves to the necessary __asm command, as appropriate for the legacy or UNIX™ conforming variant.
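As a rough sketch of how such a macro could choose between the two variants, the following builds the symbol string at preprocessing time. The DEMO_* names are hypothetical stand-ins, not the actual definitions in the system headers, and the sketch only produces the string that a prototype would hand to __asm.

```c
#include <string.h>

/* Hypothetical switch: define DEMO_WANT_LEGACY to ask for legacy behavior. */
#ifdef DEMO_WANT_LEGACY
#define DEMO_SUF_UNIX03 ""            /* legacy: unadorned symbol */
#else
#define DEMO_SUF_UNIX03 "$UNIX2003"   /* UNIX conforming variant */
#endif

/* Builds the string a prototype would pass to __asm:
   underscore prefix + routine name + selected suffix. */
#define DEMO_ALIAS_NAME(sym) "_" #sym DEMO_SUF_UNIX03

/* Shape of a prototype using it (shown as a comment only):
   int fputs(const char * __restrict, FILE * __restrict)
       __asm(DEMO_ALIAS_NAME(fputs));                       */

/* Returns the symbol name that calls to fputs would bind to. */
const char *demo_fputs_symbol(void) {
    return DEMO_ALIAS_NAME(fputs);
}

/* Nonzero when the selected name carries a variant suffix. */
int demo_uses_variant_suffix(void) {
    return strchr(demo_fputs_symbol(), '$') != NULL;
}
```

Because the suffix is chosen by the preprocessor, the same header can serve both old and new code, and the choice is fixed at compile time for each translation unit.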
Preprocessor Macros Controlling the Variants
The UNIX™ conformance variants use the $UNIX2003 suffix.
Because the 64-bit environment has no legacy to maintain, it was created to be UNIX™ conforming from the start, without the use of the $UNIX2003 suffix. So, for example, _fputs$UNIX2003 in 32-bit and _fputs in 64-bit have the same conforming behavior.
As of OS X v10.5, UNIX™ conformance is on by default, and newly compiled code will link against the UNIX™ conformance variants, unless overridden with the following five macros.
The _POSIX_C_SOURCE and _XOPEN_SOURCE macros are often set to specify the various levels of standards support. On OS X, only SUSv3 is supported, so the actual value of these macros is not used (but they are reset to appropriate values when necessary).
When either or both of these macros are set, the UNIX™ conforming variants will be used. In addition, unless _DARWIN_C_SOURCE is also set (see below), these macros hide any variable, routine, structure, etc., in covered header files that is not specified in the standards. (These extra definitions are referred to as extensions to the standards.) Thus, only SUSv3 definitions will be visible in those header files.
The _DARWIN_C_SOURCE macro (defined to any value) causes the UNIX™ conforming variants to be used, but does not hide the extensions to the standards, as _POSIX_C_SOURCE and _XOPEN_SOURCE do. The _DARWIN_C_SOURCE macro can be used in conjunction with the _POSIX_C_SOURCE and _XOPEN_SOURCE macros, with the _DARWIN_C_SOURCE behavior overriding the other two, allowing the extensions to the standards to be visible.
In addition, the _DARWIN_C_SOURCE macro will enable a few other extensions to the standards. These extensions occur where the SUSv3 standard puts additional limitations on the functionality beyond that of legacy (and, typically, BSD) behavior. The extension variants use the $DARWIN_EXTSN suffix, and can also be enabled with separate macros. (See the macro descriptions below.)
The _NONSTD_SOURCE macro can be used to turn off the default UNIX™ conformance and allow code to be built with legacy behavior. However, defining it together with any of the above macros produces a compiler error.
When none of the previous four macros are set, the variants chosen are affected by the environment variable MACOSX_DEPLOYMENT_TARGET or the -mmacosx-version-min=... argument passed to the compiler. For example, you might pass -mmacosx-version-min=10.5 to the compiler or set MACOSX_DEPLOYMENT_TARGET=10.5 to target OS X v10.5.
If you target version 10.5 or later, the UNIX™ conforming variants are used automatically. If you target version 10.4 or earlier, the legacy variants are used. (There are other side effects as well, such as disabling newer linker features.)
In OS X v10.5 or later, if you set neither a deployment target nor any of the previous four macros, the UNIX™ conforming variants are used by default.
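The precedence among these five settings can be summarized as a small decision function. This is a sketch of the rules as described in this note for 32-bit code built on OS X v10.5, not code taken from the system headers; every name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the build settings described in the text. */
typedef struct {
    bool posix_c_source;   /* _POSIX_C_SOURCE is defined */
    bool xopen_source;     /* _XOPEN_SOURCE is defined */
    bool darwin_c_source;  /* _DARWIN_C_SOURCE is defined */
    bool nonstd_source;    /* _NONSTD_SOURCE is defined */
    int  target_minor;     /* x in a 10.x deployment target; 0 = not set */
} demo_build_config;

typedef enum { DEMO_LEGACY, DEMO_CONFORMING, DEMO_ERROR } demo_variant;

/* Which family of variants the headers would select. */
demo_variant demo_pick_variant(const demo_build_config *c) {
    assert(c != NULL);
    bool standards = c->posix_c_source || c->xopen_source || c->darwin_c_source;
    if (c->nonstd_source)            /* legacy, unless it conflicts */
        return standards ? DEMO_ERROR : DEMO_LEGACY;
    if (standards)                   /* any of the three forces conformance */
        return DEMO_CONFORMING;
    if (c->target_minor == 0)        /* nothing set: 10.5 default */
        return DEMO_CONFORMING;
    return c->target_minor >= 5 ? DEMO_CONFORMING : DEMO_LEGACY;
}

/* Extensions are hidden only when a strict standards macro is set
   without _DARWIN_C_SOURCE. Not meaningful for DEMO_ERROR configs. */
bool demo_extensions_visible(const demo_build_config *c) {
    assert(c != NULL);
    return c->darwin_c_source || !(c->posix_c_source || c->xopen_source);
}
```

For example, a configuration with only a 10.4 deployment target yields DEMO_LEGACY, while adding _XOPEN_SOURCE to that same configuration yields DEMO_CONFORMING with extensions hidden.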
In addition, if you target version 10.5 or later (or by default if you do not target a specific version), the OS X v10.5 variants (those with the $1050 suffix) are used. These variants have significant new behavior that might cause previously compiled programs to misbehave. For example, the (legacy) _select routine imposed a minimum timeout value of 10 milliseconds; the new _select$1050 routine has no such minimum.
MACOSX_DEPLOYMENT_TARGET (32-bit PowerPC only)
Setting MACOSX_DEPLOYMENT_TARGET to 10.4 or later (or passing -mmacosx-version-min=10.4 or later to the compiler) enables 128-bit long double support. Routines that pass long doubles either directly (like strtold) or indirectly (like printf and family) have a 128-bit long double variant, which uses the $LDBL128 suffix, while the legacy 64-bit long double variants are unadorned.
What happens when MACOSX_DEPLOYMENT_TARGET is set to 10.3 or earlier (or is not set at all) is more complicated. During the development of OS X v10.4, the goal was to make 128-bit long double support the default. However, when MACOSX_DEPLOYMENT_TARGET is not set, things would default to the behavior of three releases earlier, that of 10.1. While it was possible to back-port the standard C library routines that used long double support to 10.3 (though not the math routines), they couldn't be back-ported to 10.1 or 10.2.
As a compromise, when MACOSX_DEPLOYMENT_TARGET is not set, or set to 10.3 or earlier, the header files would use a different variant, with the $LDBLStub suffix. The compiler would instruct the loader to link against /usr/lib/libSystemStubs.a, where the $LDBLStub suffixed symbols are defined. At runtime, these assembly language stubs try to look up the symbol with the same base name and a $LDBL128 suffix and, if they find it, use it. Otherwise, they call the unadorned symbol name. This allows the code to adapt to whichever symbols are actually available in the current system library.
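The fallback performed by the stubs can be modeled in C. The real stubs are written in assembly language and consult the dynamic loader; the sketch below replaces the loader with a small in-memory symbol table so that the lookup order is easy to follow. All names (demo_symbol, demo_ldbl_stub, the two tables) are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef int (*demo_fn)(void);
typedef struct { const char *name; demo_fn fn; } demo_symbol;

/* Stand-ins for the two implementations; each reports its width. */
static int demo_strtold_64(void)  { return 64; }
static int demo_strtold_128(void) { return 128; }

/* A library that predates the 128-bit variants... */
const demo_symbol demo_lib_without_ldbl128[] = {
    { "_strtold", demo_strtold_64 },
    { NULL, NULL }
};
/* ...and one that provides both. */
const demo_symbol demo_lib_with_ldbl128[] = {
    { "_strtold", demo_strtold_64 },
    { "_strtold$LDBL128", demo_strtold_128 },
    { NULL, NULL }
};

/* Linear search of a NULL-terminated symbol table. */
static demo_fn demo_lookup(const demo_symbol *lib, const char *name) {
    for (; lib->name != NULL; ++lib)
        if (strcmp(lib->name, name) == 0)
            return lib->fn;
    return NULL;
}

/* Prefer base$LDBL128; fall back to the unadorned base name. */
demo_fn demo_ldbl_stub(const demo_symbol *lib, const char *base) {
    char wanted[128];
    int n = snprintf(wanted, sizeof wanted, "%s$LDBL128", base);
    demo_fn fn = NULL;
    if (n > 0 && (size_t)n < sizeof wanted)
        fn = demo_lookup(lib, wanted);
    return fn != NULL ? fn : demo_lookup(lib, base);
}
```

The same binary therefore gets the 128-bit routine when run against a library that has it and the legacy routine otherwise, which is the adaptation the paragraph above describes.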
On OS X v10.5, not setting MACOSX_DEPLOYMENT_TARGET and not using -mmacosx-version-min will result in 10.5 behavior by default, so the $LDBL128 variants are used instead of the $LDBLStub variants. However, as before, if MACOSX_DEPLOYMENT_TARGET is set to 10.3 or earlier, the $LDBLStub variants are used.
The _DARWIN_UNLIMITED_SELECT macro selects the extension variants of select() and pselect(), which use the $DARWIN_EXTSN suffix. The extended versions do not fail if the first argument is greater than FD_SETSIZE. This was the original BSD behavior.
The _DARWIN_BETTER_REALPATH macro selects the extension variant of realpath(), which uses the $DARWIN_EXTSN suffix. The extended version uses a fast shortcut to determine the current working directory. Unlike the behavior the standards dictate, this shortcut does not fail if the parent directories are not readable.
The _DARWIN_USE_64_BIT_INODE macro selects the 64-bit inode variants (those with a $INODE64 suffix) of routines like readdir(), which return an ino_t value. The current default is to return legacy 32-bit ino_t values, but 64-bit ino_t values will become the default in the future. In the 64-bit variants, structures containing ino_t fields, like struct stat, are larger than their 32-bit ino_t versions, may order the fields differently (to improve packing efficiency), and may have entirely new fields.
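A minimal way to opt in and observe the effect is sketched below. It assumes the macro is defined before any system header is included, since the headers consult it while declaring struct stat and the related routines. The demo_* helper names are hypothetical, and the sizes reported depend on the platform and system version.

```c
/* Must come before any system header to have an effect. */
#ifndef _DARWIN_USE_64_BIT_INODE
#define _DARWIN_USE_64_BIT_INODE 1
#endif

#include <limits.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Width of ino_t as seen by this translation unit. */
int demo_ino_t_bits(void) {
    return (int)(sizeof(ino_t) * CHAR_BIT);
}

/* Returns the inode number of path, or 0 if stat() fails. */
unsigned long long demo_inode_of(const char *path) {
    struct stat sb;
    return stat(path, &sb) == 0 ? (unsigned long long)sb.st_ino : 0ULL;
}

/* Prints the sizes that change between the two variants. */
void demo_report_inode_layout(void) {
    printf("ino_t: %d bits, struct stat: %zu bytes\n",
           demo_ino_t_bits(), sizeof(struct stat));
}
```

Because the structure layout differs between the variants, every object file that shares a struct stat or struct dirent should be compiled with the same setting.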
Summary of Variants
Below is a table that summarizes the variant suffixes and the corresponding controlling and feature test macros.
Controlling Preprocessor Macro
Feature Test Macro
OS X v10.5 and later behavior change
Extended behavior beyond standards
128-bit long double support (32-bit PowerPC only)
(unused; may be removed)
When setting a breakpoint at a routine that has multiple variants, gdb will set a breakpoint at each variant. The delete command can be used to remove any unwanted breakpoint at any of the variants. For example:
(gdb) b select
Breakpoint 1 at 0x85edd
Breakpoint 2 at 0x54592
Breakpoint 3 at 0x4ff50
Breakpoint 4 at 0x37b44
Breakpoint 5 at 0x37b09
warning: Multiple breakpoints were set.
Use the "delete" command to delete unwanted breakpoints.
(gdb) info breakpoints
Num Type           Disp Enb Address    What
1   breakpoint     keep y   0x00085edd <select+6>
2   breakpoint     keep y   0x00054592 <select$UNIX2003+6>
3   breakpoint     keep y   0x0004ff50 <select$DARWIN_EXTSN>
4   breakpoint     keep y   0x00037b44 <select$DARWIN_EXTSN$NOCANCEL>
5   breakpoint     keep y   0x00037b09 <select$NOCANCEL$UNIX2003+6>