Assembler Directives
This chapter describes assembler directives (also known as pseudo operations, or pseudo-ops), which allow control over the actions of the assembler.
Directives for Designating the Current Section
The assembler supports designation of arbitrary sections with the .section and .zerofill directives (descriptions appear below). Only those sections specified by a directive in the assembly file appear in the resulting object file (including implicit .text directives; see Built-in Directives). Sections appear in the object file in the order their directives first appear in the assembly file. When object files are linked by the link editor, the output objects have their sections in the order the sections first appear in the object files that are linked. See the ld(1) OS X man page for more details.
Associated with each section in each segment is an implicit location counter, which begins at zero and is incremented by 1 for each byte assembled into the section. There is no way to explicitly reference a particular location counter, but the directives described here can be used to “activate” the location counter for a section, making it the current location counter. As a result, the assembler begins assembling into the section associated with that location counter.
.section
SYNOPSIS
.section segname , sectname [[[ , type ] , attribute ] , sizeof_stub ]
The .section directive causes the assembler to begin assembling into the section given by segname and sectname. A section created with this directive contains initialized data or instructions and is referred to as a content section. type and attribute may be specified as described under Section Types and Attributes. If type is symbol_stubs, then the sizeof_stub field must be given as the size in bytes of the symbol stubs contained in the section.
.zerofill
SYNOPSIS
.zerofill segname , sectname [ , symbolname , size [ , align_expression ]]
The .zerofill directive causes symbolname to be created as uninitialized data in the section given by segname and sectname, with a size in bytes given by size. align_expression, if given, is a power-of-2 exponent between 0 and 15 that specifies the alignment forced on symbolname: the symbol is placed on the next boundary that is a multiple of 2^align_expression bytes. See .align for details.
Section Types and Attributes
A content section has a type, which informs the link editor about special processing needed for the items in that section. The most common form of special processing is for sections containing literals (strings, constants, and so on) where only one copy of the literal is needed in the output file and the same literal can be used by all references in the input files.
A section’s attributes record supplemental information about the section that the link editor may use in processing that section. For example, the pure_instructions attribute indicates that a section contains only valid machine instructions.
A section’s type and attribute are recorded in a Mach-O file as the flags field in the section header, using constants defined in the header file mach-o/loader.h. The following sections describe the various types and attributes by the names used to identify them in a .section directive. The name of the related constant is also given in parentheses following the identifier.
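As a rough illustration of how a type and its attributes share the single flags field, the following Python sketch combines and separates them. The constant values are reproduced by hand from mach-o/loader.h and are worth checking against your own copy of the header; the helper names make_flags and split_flags are hypothetical and not part of any Apple tool.

```python
# Masks and a few constants as defined in mach-o/loader.h
# (values reproduced by hand; verify against your own header).
SECTION_TYPE = 0x000000FF        # low 8 bits hold the section type
SECTION_ATTRIBUTES = 0xFFFFFF00  # high 24 bits hold the attributes

S_REGULAR = 0x0
S_CSTRING_LITERALS = 0x2
S_SYMBOL_STUBS = 0x8

S_ATTR_PURE_INSTRUCTIONS = 0x80000000
S_ATTR_SELF_MODIFYING_CODE = 0x04000000


def make_flags(section_type, *attributes):
    """Hypothetical helper: OR one type and any attributes into a flags word."""
    flags = section_type & SECTION_TYPE
    for attribute in attributes:
        flags |= attribute & SECTION_ATTRIBUTES
    return flags


def split_flags(flags):
    """Hypothetical helper: return (type, attributes) from a flags word."""
    return flags & SECTION_TYPE, flags & SECTION_ATTRIBUTES


# Flags for a section declared with type symbol_stubs and
# attribute pure_instructions.
stub_flags = make_flags(S_SYMBOL_STUBS, S_ATTR_PURE_INSTRUCTIONS)
print(hex(stub_flags))
```

Because the type occupies the low byte and the attributes occupy the remaining bits, a section has exactly one type but may carry several attributes at once.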
Type Identifiers
The following sections describe section type identifiers.
regular (S_REGULAR)
A regular
section may contain any kind of data and gets no special processing from the link editor. This is the default section type. Examples of regular
sections include program instructions or initialized data.
cstring_literals (S_CSTRING_LITERALS)
A cstring_literals
section contains null-terminated literal C language character strings. The link editor places only one copy of each literal into the output file’s section and relocates references to different copies of the same literal to the one copy in the output file. There can be no relocation entries for a section of this type, and all references to literals in this section must be inside the address range for the specific literal being referenced. The last byte in a section of this type must be a null byte, and the strings can’t contain null bytes in their bodies. An example of a cstring_literals
section is one for the literal strings that appear in the body of an ANSI C function where the compiler chooses to make such strings read only.
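The merging described above can be pictured with a small Python sketch. It is only an illustration of the idea (one copy of each literal, with old offsets mapped to the surviving copy); the actual algorithm used by the link editor is not described here, and the function name merge_cstrings is hypothetical.

```python
def merge_cstrings(sections):
    """Merge the contents of several cstring_literals sections.

    sections is a list of bytes objects, one per input file.
    Returns (merged_bytes, offset_maps), where offset_maps[i] maps the
    offset of each literal in input i to its offset in the merged output.
    """
    merged = bytearray()
    where = {}         # literal body -> offset of its single output copy
    offset_maps = []
    for data in sections:
        # The last byte of a section of this type must be a null byte.
        if data and not data.endswith(b"\0"):
            raise ValueError("section must end with a null byte")
        mapping = {}
        old_offset = 0
        # split() leaves one empty trailing piece after the final null.
        for literal in data.split(b"\0")[:-1]:
            if literal not in where:
                where[literal] = len(merged)
                merged += literal + b"\0"
            mapping[old_offset] = where[literal]
            old_offset += len(literal) + 1
        offset_maps.append(mapping)
    return bytes(merged), offset_maps


merged, maps = merge_cstrings([b"hello\0world\0", b"world\0bye\0"])
print(merged)
print(maps)
```

In this sketch the second file's copy of "world" is dropped and its references are redirected to the copy contributed by the first file.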
4byte_literals (S_4BYTE_LITERALS)
A 4byte_literals
section contains 4-byte literal constants. The link editor places only one copy of each literal into the output file’s section and relocates references to different copies of the same literal to the one copy in the output file. There can be no relocation entries for a section of this type, and all references to literals in this section must be inside the address range for the specific literal being referenced. An example of a 4byte_literals
section is one in which single-precision floating-point constants are stored for a RISC machine (these would normally be stored as immediates in CISC machine code).
8byte_literals (S_8BYTE_LITERALS)
An 8byte_literals section contains 8-byte literal constants. The link editor places only one copy of each literal into the output file’s section and relocates references to different copies of the same literal to the one copy in the output file. There can be no relocation entries for a section of this type, and all references to literals in this section must be inside the address range for the specific literal being referenced. An example of an 8byte_literals section is one in which double-precision floating-point constants are stored for a RISC machine (these would normally be stored as immediates in CISC machine code).
literal_pointers (S_LITERAL_POINTERS)
A literal_pointers
section contains 4-byte pointers to literals in a literal section. The link editor places only one copy of a pointer into the output file’s section for each pointer to a literal with the same contents. The link editor also relocates references to each literal pointer to the one copy in the output file. There must be exactly one relocation entry for each literal pointer in this section, and all references to literals in this section must be inside the address range for the specific literal being referenced. The relocation entries can be external relocation entries referring to undefined symbols if those symbols identify literals in another object file. An example of a literal_pointers
section is one containing selector references generated by the Objective-C compiler.
symbol_stubs (S_SYMBOL_STUBS)
A symbol_stubs section contains symbol stubs, which are sequences of machine instructions (all the same size) used for lazily binding undefined function calls at runtime. If a call to an undefined function is made, the compiler outputs a call to a symbol stub instead, and tags the stub with an indirect symbol that indicates what symbol the stub is for. On transfer to a symbol stub, a program executes instructions that eventually reach the code for the indirect symbol associated with that stub. Here’s a sample of assembly code based on a function func() containing only a call to the undefined function foo():
    .text
    .align 2
    .globl _func
_func:
    b L_foo$stub
    .symbol_stub
L_foo$stub:                               ;
    .indirect_symbol _foo                 ;
    lis r11,ha16(L_foo$lazy_ptr)          ;
    lwz r12,lo16(L_foo$lazy_ptr)(r11)     ; the symbol stub
    mtctr r12                             ;
    addi r11,r11,lo16(L_foo$lazy_ptr)     ;
    bctr                                  ;
    .lazy_symbol_pointer
L_foo$lazy_ptr:                           ;
    .indirect_symbol _foo                 ; the symbol pointer
    .long dyld_stub_binding_helper        ; to be replaced by _foo's address
The symbol-stub sections in the IA-32 architecture—instead of using a stub and a lazy pointer—use one branch instruction that specifies the target. This is the corresponding IA-32 assembly code:
    .text
    .align 2
    .globl _func
_func:
    pushl %ebp
    movl %esp, %ebp
    subl $8, %esp
    call L_foo$stub
    leave
    ret
    .symbol_stub
L_foo$stub:
    .indirect_symbol _foo
    hlt ; hlt ; hlt ; hlt ; hlt
In the assembly code, _func branches to L_foo$stub, which is responsible for finding the definition of the function foo(). On PPC (and PPC64), L_foo$stub jumps to the contents of L_foo$lazy_ptr. This value is initially the address of the dyld_stub_binding_helper code, which, when executed, overwrites the contents of L_foo$lazy_ptr with the address of the real function, _foo, and jumps to _foo.
On IA-32, the branch instruction points to the dynamic linker. The first time the stub is called, the dynamic linker modifies the instruction so that it jumps to the real function in subsequent calls.
The indirect symbol entries for _foo provide information to the static and dynamic linkers for binding the symbol stub. Each symbol stub and lazy pointer entry must have exactly one such indirect symbol, associated with the first address in the stub or pointer entry. See .indirect_symbol for more information.
The static link editor places only one copy of each stub into the output file’s section for a particular indirect symbol, and relocates all references to the stubs with the same indirect symbol to the stub in the output file. Further, the static link editor eliminates a stub if it determines that the target is in the same linkage unit and doesn’t need redirecting at runtime. No global symbols can be defined in symbol_stubs sections.
On PPC, the stub can refer only to itself, one lazy symbol pointer (referring to the same indirect symbol as the stub), and the dyld_stub_binding_helper() function.
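The PowerPC stub above loads the lazy pointer with a lis/lwz pair that splits a 32-bit address into ha16 and lo16 halves. The Python sketch below shows why the pair reconstructs the full address, assuming the usual meaning of these operators (lo16 is the low 16 bits; ha16 is the high 16 bits adjusted for the sign extension that the load applies to its displacement). The sample addresses are hypothetical.

```python
def lo16(value):
    """Low 16 bits of a 32-bit value."""
    return value & 0xFFFF


def ha16(value):
    """High 16 bits, adjusted so that adding the sign-extended
    low half gives back the original value."""
    return ((value + 0x8000) >> 16) & 0xFFFF


def sign_extend16(half):
    """Interpret a 16-bit quantity as a signed displacement."""
    return half - 0x10000 if half & 0x8000 else half


def effective_address(value):
    """Model the address computed by the pair
           lis r11,ha16(X)
           lwz r12,lo16(X)(r11)
    for a 32-bit address X."""
    base = ha16(value) << 16                   # lis: load shifted immediate
    return (base + sign_extend16(lo16(value))) & 0xFFFFFFFF


for address in (0x00001234, 0x00018000, 0x1234FFFC, 0xFFFF8000):
    assert effective_address(address) == address
print(hex(ha16(0x00018000)), hex(lo16(0x00018000)))
```

When bit 15 of the address is set, the displacement is negative, so ha16 is one higher than a plain high half would be; this adjustment is the reason the stub uses ha16 with lis.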
lazy_symbol_pointers (S_LAZY_SYMBOL_POINTERS)
A lazy_symbol_pointers
section contains 4-byte symbol pointers that eventually contain the value of the indirect symbol associated with the pointer. These pointers are used by symbol stubs to lazily bind undefined function calls at runtime. A lazy symbol pointer initially contains an address in the symbol stub of instructions that cause the symbol pointer to be bound to the function definition (in the example in symbol_stubs (S_SYMBOL_STUBS), the lazy pointer L_foo$lazy_ptr
initially contains the address for dyld_stub_binding_helper
but gets overwritten with the address for _foo
). The dynamic link editor binds the indirect symbol associated with the lazy symbol pointer by overwriting it with the value of the symbol.
The static link editor places a copy of a lazy pointer in the output file only if the corresponding symbol stub is in the output file. Only the corresponding symbol stub can make a reference to a lazy symbol pointer, and no global symbols can be defined in this type of section. There must be one indirect symbol associated with each lazy symbol pointer. An example of a lazy_symbol_pointers
section is one in which the compiler has generated calls to undefined functions, each of which can be bound lazily at the time of the first call to the function.
non_lazy_symbol_pointers (S_NON_LAZY_SYMBOL_POINTERS)
A non_lazy_symbol_pointers
section contains 4-byte symbol pointers that contain the value of the indirect symbol associated with a pointer that may be set at any time before any code makes a reference to it. These pointers are used by the code to reference undefined symbols. Initially these pointers have no interesting value but get overwritten by the dynamic link editor with the value of the symbol for the associated indirect symbol before any code can make a reference to it.
The static link editor places only one copy of each non-lazy pointer for its indirect symbol into the output file and relocates all references to the pointer with the same indirect symbol to the pointer in the output file. The static link editor further can fill in the pointer with the value of the symbol if a definition of the indirect symbol for that pointer is present in the output file. No global symbols can be defined in this type of section. There must be one indirect symbol associated with each non-lazy symbol pointer. An example of a non_lazy_symbol_pointers
section is one in which the compiler has generated code to indirectly reference undefined symbols to be bound at runtime—this preserves the sharing of the machine instructions by allowing the dynamic link editor to update references without writing on the instructions.
Here's an example of assembly code referencing a member of an undefined structure. The corresponding C code would be:
struct s {
    int member1, member2;
};
extern struct s bar;
int func()
{
    return(bar.member2);
}
The PowerPC assembly code might look like this:
    .text
    .align 2
    .globl _func
_func:
    lis r3,ha16(L_bar$non_lazy_ptr)
    lwz r2,lo16(L_bar$non_lazy_ptr)(r3)
    lwz r3,4(r2)
    blr
    .non_lazy_symbol_pointer
L_bar$non_lazy_ptr:
    .indirect_symbol _bar
    .long 0
mod_init_funcs (S_MOD_INIT_FUNC_POINTERS)
A mod_init_funcs
section contains 4-byte pointers to functions that are to be called just after the module containing the pointer is bound into the program by the dynamic link editor. The static link editor does no special processing for this section type except for disallowing section ordering. This is done to maintain the order the functions are called (which is the order their pointers appear in the original module). There must be exactly one relocation entry for each pointer in this section. An example of a mod_init_funcs
section is one in which the compiler has generated code to call C++ constructors for modules that get dynamically bound at runtime.
mod_term_funcs (S_MOD_TERM_FUNC_POINTERS)
A mod_term_funcs
section contains 4-byte pointers to functions that are to be called just before the module containing the pointer is unloaded by the dynamic link editor or the program is terminated. The static link editor does no special processing for this section type except for disallowing section ordering. This is done to maintain the order the functions are called (which is the order their pointers appear in the original module). There must be exactly one relocation entry for each pointer in this section. An example of a mod_term_funcs
section is one in which the compiler has generated code to call C++ destructors for modules that get dynamically bound at runtime.
coalesced (S_COALESCED)
A coalesced
section can contain any instructions or data and is used when more than one definition of a symbol could be defined in multiple object files being linked together. The static link editor keeps the data associated with the coalesced symbol from the first object file it links and silently discards the data from other object files. An example of a coalesced
section is one in which the compiler has generated code for implicit instantiations of C++ templates.
Attribute Identifiers
The following sections describe attribute identifiers.
none (0)
No attributes for this section. This is the default section attribute.
S_ATTR_SOME_INSTRUCTIONS
This attribute is set by the assembler whenever it assembles a machine instruction in a section. There is no directive associated with it, since you cannot set it yourself. The dynamic link editor uses it, together with S_ATTR_EXT_RELOC and S_ATTR_LOC_RELOC (which are set by the static link editor), to determine that it must flush the cache and perform other processor-related functions when it relocates instructions by writing on them.
no_dead_strip (S_ATTR_NO_DEAD_STRIP)
The no_dead_strip
section attribute specifies that a particular section must not be dead-stripped. See Directives for Dead-Code Stripping for more information.
no_toc (S_ATTR_NO_TOC)
The no_toc
section attribute means that the global symbols in this section are not to be used in the table of contents of a static library as produced by the program ranlib(1)
. This is normally used with a coalesced
section when it is expected that each object file has a definition of the symbols that it uses.
live_support (S_ATTR_LIVE_SUPPORT)
The live_support
section attribute specifies that a section’s blocks must not be dead-stripped if they reference code that is live, but the reference is undetectable. See Directives for Dead-Code Stripping for more information.
pure_instructions (S_ATTR_PURE_INSTRUCTIONS)
The pure_instructions attribute means that this section contains nothing but machine instructions. This attribute would be used for the (__TEXT,__text) section of OS X compilers and sections that have a section type of symbol_stubs.
strip_static_syms (S_ATTR_STRIP_STATIC_SYMS)
The strip_static_syms section attribute means that the static symbols in this section can be stripped from linked images that are used with the dynamic linker when debugging symbols are also stripped. This is normally used with a coalesced section that has private extern symbols, so that once linking has turned the private extern symbols into static symbols, they can be stripped to save space in the linked image.
self_modifying_code (S_ATTR_SELF_MODIFYING_CODE)
The self_modifying_code
section attribute identifies a section with code that can be modified by the dynamic linker. For example, IA-32 symbol stubs are implemented as branch instructions that initially point to the dynamic linker but are modified by the dynamic linker to point to the real symbol.
Built-in Directives
The directives described here are simply built-in equivalents for .section
directives with specific arguments.
Designating Sections in the __TEXT Segment
The directives listed below cause the assembler to begin assembling into the indicated section of the __TEXT segment. Note that the underscore before __TEXT, __text, and the rest of the segment names is actually two underscore characters.
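Directive | Section |
---|---|
.text | (__TEXT,__text) |
.const | (__TEXT,__const) |
.static_const | (__TEXT,__static_const) |
.cstring | (__TEXT,__cstring) |
.literal4 | (__TEXT,__literal4) |
.literal8 | (__TEXT,__literal8) |
.literal16 | (__TEXT,__literal16) |
.constructor | (__TEXT,__constructor) |
.destructor | (__TEXT,__destructor) |
.fvmlib_init0 | (__TEXT,__fvmlib_init0) |
.fvmlib_init1 | (__TEXT,__fvmlib_init1) |
.symbol_stub | (__TEXT,__symbol_stub1) |
.picsymbol_stub | (__TEXT,__picsymbolstub1) |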
The following sections describe the sections in the __TEXT
segment and the types of information that should be assembled into each of them.
.text
This is equivalent to .section __TEXT,__text,regular,pure_instructions when the default -dynamic flag is in effect and equivalent to .section __TEXT,__text,regular when the -static flag is specified.
The compiler places only machine instructions in the (__TEXT,__text) section (no read-only data, jump tables, or anything else). As a result, the entire (__TEXT,__text) section is pure instructions, and tools that operate on object files, as well as the runtime, can take advantage of this to locate the instructions of the program without being confused by data that could have been mixed in. To make this work, all runtime support code linked into the program must also obey this rule (all OS X library code follows this rule).
.const
This is equivalent to .section __TEXT,__const
The compiler places all data declared const
and all jump tables it generates for switch statements in this section.
.static_const
This is equivalent to .section __TEXT,__static_const
This is not currently used by the compiler. It was added to the assembler so that the compiler could separate global and static const data into separate sections if it wished to.
.cstring
This is equivalent to .section __TEXT,__cstring, cstring_literals
This section is marked with the section type cstring_literals, which the link editor recognizes. The link editor merges the like literal C strings in all the input object files to one unique C string in the output file. Therefore this section must contain only C strings (a C string is a sequence of bytes that ends in a null byte, \0, and does not contain any other null bytes except its terminator). The compiler places literal C strings found in the code that are not initializers and do not contain any embedded nulls in this section.
.literal4
This is equivalent to .section __TEXT,__literal4,4byte_literals
This section is marked with the section type 4byte_literals
, which the link editor recognizes. The link editor can then merge the like 4 byte literals in all the input object files to one unique 4 byte literal in the output file. Therefore, this section must contain only 4 byte literals. This is typically intended for single precision floating-point constants and the compiler uses this section for that purpose. On some architectures it is more efficient to place these constants in line as immediates as part of the instruction.
.literal8
This is equivalent to .section __TEXT,__literal8,8byte_literals
This section is marked with the section type 8byte_literals
, which the link editor recognizes. The link editor then can merge the like 8 byte literals in all the input object files to one unique 8 byte literal in the output file. Therefore, this section must contain only 8 byte literals. This is typically intended for double precision floating-point constants and the compiler uses this section for that purpose. On some architectures it is more efficient to place these constants in line as immediates as part of the instruction.
.literal16
This is equivalent to .section __TEXT,__literal16,16byte_literals
This section is marked with the section type 16byte_literals
, which the link editor recognizes. The link editor can then merge the like 16 byte literals in all the input object files to one unique 16 byte literal in the output file. Therefore, this section must contain only 16 byte literals. This is typically intended for vector constants and the compiler uses this section for that purpose.
.constructor
This is equivalent to .section __TEXT,__constructor
.destructor
This is equivalent to .section __TEXT,__destructor
The .constructor
and .destructor
sections are used by the C++ runtime system, and are reserved exclusively for the C++ compiler.
.fvmlib_init0
This is equivalent to .section __TEXT,__fvmlib_init0
.fvmlib_init1
This is equivalent to .section __TEXT,__fvmlib_init1
The .fvmlib_init0
and .fvmlib_init1
sections are used by the obsolete fixed virtual memory shared library initialization. The compiler doesn't place anything in these sections, as they are reserved exclusively for the obsolete shared library mechanism.
.symbol_stub
This section is of type symbol_stubs
and has the attribute pure_instructions
. The compiler places symbol stubs in this section for undefined functions that are called in the module. This is the standard symbol stub section for nonposition-independent code.
Symbol stubs are implemented differently on PPC (and PPC64) and on IA-32. The following sections describe each implementation.
PowerPC .symbol_stub
On PowerPC (PPC and PPC64), .symbol_stub is equivalent to .section __TEXT,__symbol_stub1, symbol_stubs, pure_instructions, 20.
The standard symbol stub on PPC and PPC64 is 20 bytes and has an alignment of 4 bytes (.align 2). For example, a stub for the symbol _foo would be (using a lazy symbol pointer L_foo$lazy_ptr):
    .symbol_stub
L_foo$stub:
    .indirect_symbol _foo
    lis r11,ha16(L_foo$lazy_ptr)
    lwz r12,lo16(L_foo$lazy_ptr)(r11)
    mtctr r12
    addi r11,r11,lo16(L_foo$lazy_ptr)
    bctr
    .lazy_symbol_pointer
L_foo$lazy_ptr:
    .indirect_symbol _foo
    .long dyld_stub_binding_helper
IA-32 symbol stubs
On IA-32, symbol stubs directly use .section __IMPORT,__jump_table, symbol_stubs, self_modifying_code + pure_instructions, 5. The built-in directive .symbol_stub is no longer used.
On IA-32 this section has an additional attribute, self_modifying_code, which specifies that the code in this section can be modified at runtime. At runtime, the dynamic linker uses this feature in IA-32 stubs to change the branch instruction in the stub so that it jumps to the real symbol instead of its initial target, the dynamic linker itself. This is an example of a symbol stub for the _foo symbol:
    .section __IMPORT,__jump_table,symbol_stubs,self_modifying_code+pure_instructions,5
L_foo$stub:
    .indirect_symbol _foo
    hlt ; hlt ; hlt ; hlt ; hlt
.picsymbol_stub
On PowerPC, this directive translates to .section __TEXT, __picsymbolstub1, symbol_stubs, pure_instructions, NBYTES.
This section is of type symbol_stubs and has the attribute pure_instructions. The compiler places symbol stubs in this section for undefined functions that are called in the module. This is the standard symbol stub section for position-independent code. The value of NBYTES is dependent on the target architecture.
The standard position-independent symbol stub for the PowerPC is 36 bytes and has an alignment of 4 bytes (.align 2). For example, a stub for the symbol _foo would be (using a lazy symbol pointer L_foo$lazy_ptr):
    .picsymbol_stub
L_foo$stub:
    .indirect_symbol _foo
    mflr r0
    bcl 20,31,L0$_foo
L0$_foo:
    mflr r11
    addis r11,r11,ha16(L_foo$lazy_ptr - L0$_foo)
    mtlr r0
    lwz r12,lo16(L_foo$lazy_ptr - L0$_foo)(r11)
    mtctr r12
    addi r11,r11,lo16(L_foo$lazy_ptr - L0$_foo)
    bctr
Designating Sections in the __DATA Segment
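These directives cause the assembler to begin assembling into the indicated section of the __DATA segment:
Directive | Section |
---|---|
.data | (__DATA,__data) |
.static_data | (__DATA,__static_data) |
.const_data | (__DATA,__const) |
.lazy_symbol_pointer | (__DATA,__la_symbol_ptr) |
.non_lazy_symbol_pointer | (__DATA,__nl_symbol_ptr) |
.mod_init_func | (__DATA,__mod_init_func) |
.mod_term_func | (__DATA,__mod_term_func) |
.dyld | (__DATA,__dyld) |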
The following sections describe the sections in the __DATA
segment and the types of information that should be assembled into each of them.
.data
This is equivalent to .section __DATA, __data
The compiler places all non-const
initialized data (even initialized to zero) in this section.
.static_data
This is equivalent to .section __DATA, __static_data
This is not currently used by the compiler. It was added to the assembler so that the compiler could separate global and static data symbols into separate sections if it wished to.
.const_data
This is equivalent to .section __DATA, __const, regular
.
This section is of type regular
and has no attributes. This section is used when dynamic code is being compiled for const
data that must be initialized.
.lazy_symbol_pointer
This is equivalent to .section __DATA, __la_symbol_ptr,lazy_symbol_pointers
This section is of type lazy_symbol_pointers
and has no attributes. The compiler places a lazy symbol pointer in this section for each symbol stub it creates for undefined functions that are called in the module. (See .symbol_stub for examples.) This section has an alignment of 4 bytes (.align 2
).
.non_lazy_symbol_pointer
This is equivalent to .section __DATA, __nl_symbol_ptr,non_lazy_symbol_pointers
This section is of type non_lazy_symbol_pointers
and has no attributes. The compiler places a non-lazy symbol pointer in this section for each undefined symbol referenced by the module (except for function calls). This section has an alignment of 4 bytes (.align 2
).
.mod_init_func
This is equivalent to .section __DATA, __mod_init_func, mod_init_funcs
This section is of type mod_init_funcs
and has no attributes. The C++ compiler places a pointer to a function in this section for each function it creates to call the constructors (if the module has them).
.mod_term_func
This is equivalent to .section __DATA, __mod_term_func, mod_term_funcs
This section is of type mod_term_funcs
and has no attributes. The C++ compiler places a pointer to a function in this section for each function it creates to call the destructors (if the module has them).
.dyld
This is equivalent to .section __DATA, __dyld,regular
This section is of type regular
and has no attributes. This section is used by the dynamic link editor. The compiler doesn’t place anything in this section, as it is reserved exclusively for the dynamic link editor.
Designating Sections in the __OBJC Segment
These directives cause the assembler to begin assembling into the indicated section of the __OBJC
segment (or the __TEXT
segment):
Directive | Section |
---|---|
(The 19 directive and section entries of this table are missing from this copy of the document.)
All sections in the __OBJC
segment, including old sections that are no longer used and future sections that may be added, are exclusively reserved for the Objective-C compiler’s use.
Directives for Moving the Location Counter
This section describes directives that advance the location counter to a location higher in memory. They have the additional effect of setting the intervening memory to some value.
.align
SYNOPSIS
.align align_expression [ , 1byte_fill_expression [,max_bytes_to_fill]]
.p2align align_expression [ , 1byte_fill_expression [,max_bytes_to_fill]]
.p2alignw align_expression [ , 2byte_fill_expression [,max_bytes_to_fill]]
.p2alignl align_expression [ , 4byte_fill_expression [,max_bytes_to_fill]]
.align32 align_expression [ , 4byte_fill_expression [,max_bytes_to_fill]]
The align directives advance the location counter to the next align_expression boundary, if it isn't currently on such a boundary. align_expression is a power-of-2 exponent between 0 and 15 (for example, .align 3 requests 2^3 = 8-byte alignment). The fill expression, if specified, must be absolute. The space between the current value of the location counter and the desired value is filled with the fill expression (or with zeros, if fill_expression isn't specified). The space from the current value of the location counter to the next boundary aligned to the fill expression's width is filled with zeros first; the fill expression is then used until the desired alignment is reached. max_bytes_to_fill is the maximum number of bytes that the align directive is allowed to fill. If the alignment can't be reached in max_bytes_to_fill bytes or fewer, the directive has no effect. If there is no fill_expression and the section has the pure_instructions attribute, or contains some instructions, the nop opcode is used as the fill expression.
EXAMPLE
    .align 3
one: .double 0r1.0
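The rounding performed by the align directives can be worked through with a short Python sketch. It models only the arithmetic on the location counter (not the fill bytes themselves), and the function names align_up and padding_for are hypothetical.

```python
def align_up(location, align_expression):
    """Return the location counter after alignment to a
    2**align_expression-byte boundary."""
    if not 0 <= align_expression <= 15:
        raise ValueError("align_expression must be between 0 and 15")
    boundary = 1 << align_expression
    return (location + boundary - 1) & ~(boundary - 1)


def padding_for(location, align_expression, max_bytes_to_fill=None):
    """Return how many bytes the directive would fill. If the padding
    needed exceeds max_bytes_to_fill, the directive has no effect."""
    padding = align_up(location, align_expression) - location
    if max_bytes_to_fill is not None and padding > max_bytes_to_fill:
        return 0
    return padding


print(align_up(13, 3))        # location 13 moves to the next 8-byte boundary
print(padding_for(13, 3))     # bytes filled to get there
print(padding_for(9, 3, 4))   # 7 bytes would be needed, but at most 4 allowed
```

A location counter that is already on the requested boundary is left unchanged, which is why the directive is safe to repeat.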
.org
SYNOPSIS
.org expression [ , fill_expression ]
The .org directive sets the location counter to expression, which must be a currently known absolute expression. This directive can only move the location counter up in address. The fill expression, if specified, must be absolute. The space between the current value of the location counter and the desired value is filled with the low-order byte of the fill expression (or with zeros, if fill_expression isn’t specified).
EXAMPLE
    .org 0x100,0xff
Directives for Generating Data
The directives described in this section generate data. (Unless specified otherwise, the data goes into the current section.) In some respects, they are similar to the directives explained in Directives for Moving the Location Counter—they do have the effect of moving the location counter—but this isn’t their primary purpose.
.ascii and .asciz
SYNOPSIS
.ascii [ "string" ] [ , "string" ] ...
.asciz [ "string" ] [ , "string" ] ...
These directives translate character strings into their ASCII equivalents for use in the source program. Each directive takes zero or more comma-separated strings surrounded by quotation marks. Each string can contain any character or escape sequence that can appear in a character string; the newline character cannot appear, but it can be represented by the escape sequence \012 or \n:
The .ascii directive generates a sequence of ASCII characters.
The .asciz directive is similar to the .ascii directive, except that it automatically terminates the sequence of ASCII characters with the null character (\0), necessary when generating strings usable by C programs.
If no strings are specified, the directive is ignored.
EXAMPLE
    .ascii "Can't open the DSP.\0"
    .asciz "%s has changes.\tSave them?"
.byte, .short, .long, and .quad
SYNOPSIS
.byte [ expression ] [ , expression ] ...
.short [ expression ] [ , expression ] ...
.long [ expression ] [ , expression ] ...
.quad [ expression ] [ , expression ] ...
These directives reserve storage locations in the current section and initialize them with specified values. Each directive takes zero or more comma-separated absolute expressions and generates a sequence of bytes for each expression. The expressions are truncated to the size generated by the directive:
.byte generates 1 byte per expression.
.short generates 2 bytes per expression.
.long generates 4 bytes per expression.
.quad generates 8 bytes per expression.
EXAMPLE
.byte 74,0112,0x4A,0x4a,'J | the same byte |
.short 64206,0175316,0xface | the same short |
.long -1234,037777775456,0xfffffb2e | the same long |
.quad -1234,01777777777777777775456,0xfffffffffffffb2e | the same quad |
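The following is a minimal sketch of how these directives might be combined to lay out a small record of mixed-size fields. The label _header and all field values are hypothetical, and the ; comments assume the PowerPC comment character. The last line illustrates the truncation rule stated above; an assembler may also warn about the lost bits.

```asm
        .data
_header:
        .byte   1, 0            ; 2 x 1 byte
        .short  0x0200          ; 1 x 2 bytes
        .long   0x10000, -1     ; 2 x 4 bytes
        .quad   0               ; 1 x 8 bytes
        .byte   0x14A           ; truncated to 1 byte (0x4A)
```

Under these assumptions the record occupies 2 + 2 + 8 + 8 + 1 = 21 bytes.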
.comm
SYNOPSIS
.comm name, size |
The .comm
directive creates a common symbol named name of size bytes. If the symbol isn’t defined elsewhere, its type is “common.”
The link editor allocates storage for common symbols that aren’t otherwise defined. Enough space is left after the symbol to hold the maximum size (in bytes) seen for each symbol in the (__DATA,__common
) section.
The link editor aligns each such symbol (based on its size aligned to the next greater power of two) to the maximum alignment of the (__DATA,__common
) section. For information about how to change the maximum alignment, see the description of -sectalign
in the ld(1)
OS X man page.
EXAMPLE
.comm _global_uninitialized,4 |
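To illustrate the maximum-size rule, suppose two separately assembled modules declare the same common symbol with different sizes. The file names and the symbol _shared_buf are hypothetical, and the ; comments assume the PowerPC comment character.

```asm
; a.s (hypothetical first module)
        .comm   _shared_buf,16

; b.s (hypothetical second module)
        .comm   _shared_buf,64
```

If no linked object defines _shared_buf, the link editor would allocate 64 bytes for it, the largest size seen.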
.fill
SYNOPSIS
.fill repeat_expression , fill_size , fill_expression |
The .fill
directive advances the location counter by repeat_expression times fill_size bytes:
fill_size is in bytes, and must have the value 1, 2, or 4.
repeat_expression must be an absolute expression greater than zero.
fill_expression may be any absolute expression (it gets truncated to the fill size).
EXAMPLE
.fill 69,4,0xfeadface | put out 69 0xfeadface’s |
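A short sketch showing each permitted fill_size follows. The labels are hypothetical, and the ; comments assume the PowerPC comment character. Each of the first three directives advances the location counter by 16 bytes; the last one illustrates truncation of fill_expression to the fill size.

```asm
        .data
_pad_bytes:
        .fill   16,1,0xff       ; 16 x 1 byte, each 0xff
_pad_shorts:
        .fill   8,2,0xbeef      ; 8 x 2 bytes, each 0xbeef
_pad_longs:
        .fill   4,4,0xfeadface  ; 4 x 4 bytes, each 0xfeadface
_pad_trunc:
        .fill   4,1,0x1234      ; fill_expression truncated to 1 byte: 0x34
```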
.lcomm
SYNOPSIS
.lcomm name, size [ , align ] |
The .lcomm
directive creates a symbol named name of size bytes in the (__DATA,__bss
) section. It contains zeros at execution. The name isn’t declared as global, and hence is unknown outside the object module.
The optional align expression, if specified, is a power of two: the location counter is rounded up to a 2^align-byte boundary before its value is assigned to name.
EXAMPLE
.lcomm abyte,1 | or: .lcomm abyte,1,0 |
.lcomm padding,7 |
.lcomm adouble,8 | or: .lcomm adouble,8,3 |
These are the same as:
.zerofill __DATA,__bss,abyte,1 |
.zerofill __DATA,__bss,padding,7 |
.zerofill __DATA,__bss,adouble,8 |
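The optional alignment argument can be sketched as follows. The symbol names are hypothetical, the alignment value 4 is chosen only for illustration, and the ; comments assume the PowerPC comment character.

```asm
        .lcomm  _flag,1         ; 1 byte, no alignment requested
        .lcomm  _vector,16,4    ; 16 bytes; the location counter is first
                                ; rounded up to a 2^4 = 16-byte boundary
```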
.single and .double
SYNOPSIS
.single [ number ] [ , number ] ... |
.double [ number ] [ , number ] ... |
These directives reserve storage locations in the current section and initialize them with specified values. Each directive takes zero or more comma-separated decimal floating-point numbers:
.single takes IEEE single-precision floating point numbers. It reserves 4 bytes for each number and initializes them to the value of the corresponding number.
.double takes IEEE double-precision floating point numbers. It reserves 8 bytes for each number and initializes them to the value of the corresponding number.
EXAMPLE
.single 3.33333333333333310000e-01 |
.double 0.00000000000000000000e+00 |
.single +Infinity |
.double -Infinity |
.single NaN |
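Because each directive accepts a comma-separated list, a table of constants can be written on one line. The sketch below uses hypothetical labels and values, and its ; comments assume the PowerPC comment character.

```asm
        .data
_coeffs:
        .single 1.0, 0.5, 0.25          ; 3 x 4 bytes = 12 bytes
_limits:
        .double -1.0e+10, 1.0e+10       ; 2 x 8 bytes = 16 bytes
```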
.space
SYNOPSIS
.space num_bytes [ , fill_expression ] |
The .space
directive advances the location counter by num_bytes, where num_bytes is an absolute expression greater than zero. The fill expression, if specified, must be absolute. The space between the current value of the location counter and the desired value is filled with the low-order byte of the fill expression (or with zeros, if fill_expression isn’t specified).
EXAMPLE
ten_ones: |
.space 10,1 |
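A further sketch, contrasting the default zero fill with an explicit fill expression. The labels are hypothetical, and the ; comments assume the PowerPC comment character.

```asm
        .data
_buffer:
        .space  256             ; 256 bytes of zeros (no fill_expression)
_guard:
        .space  4,0x1ff         ; 4 bytes of 0xff: only the low-order byte
                                ; of the fill expression is used
```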
Directives for Dealing With Symbols
This section describes directives that have an effect on symbols and the symbol table.
.globl
SYNOPSIS
.globl symbol_name |
The .globl
directive makes symbol_name external. If symbol_name is otherwise defined (by .set
or by appearance as a label), it acts within the assembly exactly as if the .globl
statement was not given; however, the link editor may be used to combine this object module with other modules referring to this symbol.
EXAMPLE
.globl abs |
.set abs,1 |
.globl var |
var: .long 2 |
.indirect_symbol
SYNOPSIS
.indirect_symbol symbol_name |
The .indirect_symbol
directive creates an indirect symbol with symbol_name and associates the current location with the indirect symbol. An indirect symbol must be defined immediately before each item in a symbol_stubs, lazy_symbol_pointers, or non_lazy_symbol_pointers section. The static and dynamic linkers use symbol_name to identify the symbol associated with the item following the directive.
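As an illustration, a non-lazy symbol pointer for an external symbol might be laid out as follows. This is a sketch modeled on typical compiler output, not a definitive layout: the section name __DATA,__nl_symbol_ptr, the label, and the symbol _foo are assumptions, .long assumes 4-byte pointers, and the ; comments assume the PowerPC comment character.

```asm
        .section __DATA,__nl_symbol_ptr,non_lazy_symbol_pointers
L_foo$non_lazy_ptr:
        .indirect_symbol _foo   ; ties the item that follows to _foo
        .long   0               ; the pointer itself, filled in later by the linkers
```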
.reference
SYNOPSIS
.reference symbol_name |
The .reference
directive causes symbol_name to be an undefined symbol present in the output file’s symbol table. This is useful in referencing a symbol without generating any bytes to do it (used, for example, by the Objective-C runtime system to reference superclass objects).
EXAMPLE
.reference .objc_class_name_Object |
.weak_reference
SYNOPSIS
.weak_reference symbol_name |
The .weak_reference
directive causes symbol_name to be a weak undefined symbol present in the output file’s symbol table. This is used by the compiler when referencing a symbol with the weak_import
attribute.
EXAMPLE
.weak_reference .objc_class_name_Object |
.lazy_reference
SYNOPSIS
.lazy_reference symbol_name |
The .lazy_reference
directive causes symbol_name to be a lazy undefined symbol present in the output file’s symbol table. This is useful when referencing a symbol without generating any bytes to do it (used, for example, by the Objective-C runtime system with the dynamic linker to reference superclass objects but allow the runtime to bind them on first use).
EXAMPLE
.lazy_reference .objc_class_name_Object |
.weak_definition
SYNOPSIS
.weak_definition symbol_name |
The .weak_definition
directive causes symbol_name to be a weak definition. symbol_name can be defined only in a coalesced
section. This is used by the C++ compiler to support template instantiation. The compiler uses a coalesced
section with the .weak_definition
directive for implicitly instantiated templates, and it uses a regular section (.text, .data, and so on) for an explicit template instantiation.
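A weak definition in a coalesced section might look like the sketch below. The section name __TEXT,__textcoal_nt and the pure_instructions attribute follow what compilers commonly emit and are assumptions here, as is the symbol _shared_helper; the ; comment assumes the PowerPC comment character.

```asm
        .section __TEXT,__textcoal_nt,coalesced,pure_instructions
        .globl  _shared_helper
        .weak_definition _shared_helper
_shared_helper:
        nop                     ; body of the (hypothetical) routine
```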
.private_extern
SYNOPSIS
.private_extern symbol_name |
The .private_extern
directive makes symbol_name a private external symbol. When the link editor combines this module with other modules (and the -keep_private_externs command-line option is not specified), it turns the symbol from global to static. If both the .private_extern and .globl assembler directives are used on the same symbol, the effect is as if only the .private_extern directive were used.
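A minimal sketch of a private external definition follows. The symbol _module_helper is hypothetical, and the ; comment assumes the PowerPC comment character.

```asm
        .text
        .private_extern _module_helper
        .globl  _module_helper  ; has no additional effect alongside .private_extern
_module_helper:
        nop
```

Other modules combined in the same link can refer to _module_helper, but after linking it is no longer global unless -keep_private_externs is specified.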
.stabs, .stabn, and .stabd
SYNOPSIS
.stabs n_name , n_type , n_other , n_desc , n_value |
.stabn n_type , n_other , n_desc , n_value |
.stabd n_type , n_other , n_desc |
These directives are used to place symbols in the symbol table for the symbolic debugger (a “stab” is a symbol table entry).
.stabs specifies all the fields in a symbol table entry. n_name is the name of a symbol; if the symbol name is null, the .stabn directive may be used instead.
.stabn is similar to .stabs, except that it uses a NULL ("") name.
.stabd is similar to .stabn, except that it uses the value of the location counter (.) as the n_value field.
In each case, the n_type field is assumed to contain a 4.3BSD-like value for the N_TYPE bits (defined in mach-o/stab.h
). For .stabs
and .stabn
, the n_sect
field of the Mach-O file’s nlist
is set to the section number of the symbol for the specified n_value parameter. For .stabd
, the n_sect
field is set to the current section number for the location counter. The nlist
structure is defined in mach-o/nlist.h
.
EXAMPLE
.stabs "hello.c",100,0,0,Ltext |
.stabn 192,0,0,LBB2 |
.stabd 68,0,15 |
.desc
SYNOPSIS
.desc symbol_name , absolute_expression |
The .desc
directive sets the n_desc
field of the specified symbol to absolute_expression.
EXAMPLE
.desc _environ, 0x10 ; set the REFERENCED_DYNAMICALLY bit |
.set
SYNOPSIS
.set symbol_name , absolute_expression |
The .set
directive creates the symbol symbol_name and sets its value to absolute_expression. This is the same as using symbol_name=
absolute_expression.
EXAMPLE
.set one,1 |
two = 2 |
.lsym
SYNOPSIS
.lsym symbol_name , expression |
The .lsym directive creates a unique and otherwise unreferenceable symbol table entry for the symbol_name, expression pair. The symbol created is a static symbol with a type of absolute (N_ABS). Some Fortran 77 compilers use this mechanism to communicate with the debugger.
Directives for Dead-Code Stripping
Dead-code stripping is the process by which the static link editor removes unused code and data blocks from executable files. This process helps reduce the overall size of executables, which in turn improves performance by reducing the memory footprint of the executable. It also allows programs to link successfully in the situation where unused code refers to an undefined symbol, something that would normally result in a link error. For more information on dead-code stripping, see Linking in Xcode 2.1 User Guide.
The following sections describe the dead-code stripping directives.
.subsections_via_symbols
SYNOPSIS
.subsections_via_symbols |
The .subsections_via_symbols
directive tells the static link editor that the sections of the object file can be divided into individual blocks. These blocks are then stripped if they are not used by other code. This directive applies to all section declarations in the assembly file and should be placed outside any section declarations, as shown here:
.subsections_via_symbols |
; Section declarations... |
When using this directive, ensure that each symbol in the section is at the beginning of a block of code. Implicit dependencies between blocks of code may result in the removal of needed code from the executable. For example, the following section contains three symbols, but execution of the code at _plus_three
ends at the blr
statement at the bottom of the code block:
.text |
.globl _plus_three |
_plus_three: |
addi r3, r3, 1 |
.globl _plus_two |
_plus_two: |
addi r3, r3, 1 |
.globl _plus_one |
_plus_one: |
addi r3, r3, 1 |
blr |
If you use the .subsections_via_symbols directive on this code and _plus_two and _plus_one are not called by any other code, the static link editor would not add _plus_two and _plus_one to the executable. In that case, _plus_three would not return the correct value because part of its implementation would be missing. In addition, because the blr instruction belongs to the _plus_one block, the program may crash when _plus_three is executed if _plus_one is dead-stripped, as execution would continue into whatever block follows.
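One way to avoid the fall-through dependency described above is to make every block self-contained, so that each symbol starts a block that computes its own result and returns. The sketch below is one possible rewrite of the example, not the only one; it keeps the original symbol names and PowerPC instructions.

```asm
        .subsections_via_symbols
        .text
        .globl  _plus_three
_plus_three:
        addi    r3, r3, 3       ; complete on its own
        blr
        .globl  _plus_two
_plus_two:
        addi    r3, r3, 2
        blr
        .globl  _plus_one
_plus_one:
        addi    r3, r3, 1
        blr
```

With this layout, stripping any unused block leaves the remaining blocks intact.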
.no_dead_strip
SYNOPSIS
.no_dead_strip symbol_name |
The .no_dead_strip
directive tells the assembler that the symbol specified by symbol_name must not be dead-stripped. For example, the following code prevents _my_version_string
from being dead-stripped:
.no_dead_strip _my_version_string |
.cstring |
_my_version_string: |
.asciz "Version 1.1" |
Miscellaneous Directives
This section describes additional directives that don’t fit into any of the previous sections.
.abort
SYNOPSIS
.abort [ "abort_string" ] |
The .abort
directive causes the assembler to ignore further input and quit processing. No files are created. The directive could be used, for example, in a pipe-interconnected version of a compiler—the first major syntax error would cause the compiler to issue this directive, saving unnecessary work in assembling code that would have to be discarded anyway.
The optional abort_string
is printed as part of the error message when the
.abort
directive is encountered.
EXAMPLE
#ifndef VAR |
.abort "You must define VAR to assemble this file." |
#endif |
.abs
SYNOPSIS
.abs symbol_name , expression |
This directive sets the value of symbol_name to 1
if expression is an absolute expression; otherwise, it sets the value to zero.
EXAMPLE
.macro var |
.abs is_abs,$0 |
.if is_abs==0 |
.abort "must be absolute" |
.endif |
.endmacro |
.dump and .load
SYNOPSIS
.dump filename |
.load filename |
These directives let you dump and load the absolute symbols and macro definitions for faster loading and faster assembly.
These work like this:
.include "big_file_1" |
.include "big_file_2" |
.include "big_file_3" |
... |
.include "big_file_N" |
.dump "symbols.dump" |
The .dump
directive writes out all the N_ABS symbols and macros. You can later use the .load
directive to load all the N_ABS symbols and macros faster than you could with .include
:
.load "symbols.dump" |
One useful side effect of loading symbols this way is that they aren’t written out to the object file.
.file and .line
SYNOPSIS
.file file_name |
.line line_number |
The .file
directive causes the assembler to report error messages as if it were processing the file file_name.
The .line
directive causes the assembler to report error messages as if it were processing the line line_number. The next line after the .line
directive is assumed to be line_number.
The assembler turns C preprocessor comments of the form:
# line_number file_name level |
into:
.line line_number; .file file_name |
EXAMPLE
.line 6 |
nop | this is line 6 |
.if, .elseif, .else, and .endif
SYNOPSIS
.if expression |
.elseif expression |
.else |
.endif |
These directives are used to delimit blocks of code that are to be assembled conditionally, depending on the value of an expression. A block of conditional code may be nested within another block of conditional code. expression must be an absolute expression.
For each .if
directive:
there must be a matching .endif
there may be as many intervening .elseif’s as desired
there may be no more than one intervening .else before the trailing .endif
Labels or multiple statements must not be placed on the same line as any of these directives; otherwise, statements including these directives are not recognized and produce errors or incorrect conditional assembly.
EXAMPLE
.if a==1 |
.long 1 |
.elseif a==2 |
.long 2 |
.else |
.long 3 |
.endif |
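Nesting can be sketched as follows. The symbols DEBUG and LEVEL are hypothetical and are defined with .set so that both expressions are absolute. Comments are kept off the conditional directives themselves, and the ; comments assume the PowerPC comment character.

```asm
        .set    DEBUG,1
        .set    LEVEL,2

        .if DEBUG==1
        .if LEVEL==2
        .long   2               ; assembled: both conditions hold
        .else
        .long   1
        .endif
        .else
        .long   0
        .endif
```

With the values shown, only the .long 2 statement is assembled.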
.include
SYNOPSIS
.include "filename" |
The .include
directive causes the named file to be included at the current point in the assembly. The -Idir
option to the assembler specifies alternative paths to be used in searching for the file if it isn’t found in the current directory.
EXAMPLE
.include "macros.h" |
.machine
SYNOPSIS
.machine arch_type |
The .machine
directive specifies the target architecture of the assembly file. arch_type can be any architecture type you can specify in the -arch
option of the assembler driver. See Assembler Options for more information.
.macro, .endmacro, .macros_on, and .macros_off
SYNOPSIS
.macro |
.endmacro |
.macros_on |
.macros_off |
These directives allow you to define simple macros (once a macro is defined, however, you can’t redefine it). For example:
.macro var |
instruction_1 $0,$1 |
instruction_2 $2 |
. . . |
instruction_N |
.long $n |
.endmacro |
$d (where d is a single decimal digit, 0 through 9) represents each argument; there can be at most 10 arguments. $n is replaced by the actual number of arguments the macro is invoked with.
When you use a macro, arguments are separated by a comma (except inside matching parentheses—for example, xxx(1,3,4),yyy
contains only two arguments). You could use the macro defined above as follows:
var #0,@sp,4 |
This would be expanded to:
instruction_1 #0,@sp |
instruction_2 4 |
. . . |
instruction_N |
.long 3 |
The directives .macros_on
and .macros_off
allow macros to be written that override an instruction or directive while still using the instruction or directive. For example:
.macro .long |
.macros_off |
.long $0,$0 |
.macros_on |
.endmacro |
If you don’t specify an argument, the macro substitutes nothing (see .abs).
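As a concrete sketch of $n and the positional arguments, the hypothetical macro word_list below emits its argument count followed by its first two arguments. The comment is kept outside the macro body and assumes the PowerPC comment character.

```asm
; word_list: emit the argument count, then the first two arguments
.macro  word_list
        .long   $n
        .long   $0
        .long   $1
.endmacro

        word_list 10,20
```

Following the substitution rules above, this invocation would expand to .long 2, .long 10, and .long 20.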
PowerPC-Specific Directives
The following directives are specific to the PowerPC architecture.
.flag_reg
SYNOPSIS
.flag_reg reg_number |
This causes the uses of the reg_number general register to get flagged as warnings. This is intended for use in macros.
.greg
SYNOPSIS
.greg symbol_name, expression... |
This directive sets symbol_name to 1
when expression is a general register or zero otherwise. It is intended for use in macros.
.no_ppc601
SYNOPSIS
.no_ppc601 |
This causes PowerPC 601 instructions to be flagged as errors. This is the same as if the -no_ppc601
option is specified.
.noflag_reg
SYNOPSIS
.noflag_reg reg_number |
This turns off the flagging of the uses of the reg_number general register so they don’t get flagged as warnings. This is intended for use in macros.
Additional Processor-Specific Directives
The following processor-specific directives are synonyms for other standard directives described earlier in this chapter; although they are listed here for completeness, their use isn’t recommended. Wherever possible, you should use the standard directive instead.
The following are i386-specific directives:
i386 Directive | Standard Directive |
---|---|
|
|
|
|
|
|
|
|
|
|
| (ignored) |
| (ignored) |
| (ignored) |
| (ignored) |
| (ignored) |
Copyright © 2003, 2009 Apple Inc. All Rights Reserved. Updated: 2009-01-07