Assembler Directives

This chapter describes assembler directives (also known as pseudo operations, or pseudo-ops), which allow control over the actions of the assembler.

Directives for Designating the Current Section

The assembler supports designation of arbitrary sections with the .section and .zerofill directives (descriptions appear below). Only those sections specified by a directive in the assembly file appear in the resulting object file (including implicit .text directives; see Built-in Directives). Sections appear in the object file in the order their directives first appear in the assembly file. When object files are linked by the link editor, the output objects have their sections in the order the sections first appear in the object files that are linked. See the ld(1) OS X man page for more details.

Associated with each section in each segment is an implicit location counter, which begins at zero and is incremented by 1 for each byte assembled into the section. There is no way to explicitly reference a particular location counter, but the directives described here can be used to “activate” the location counter for a section, making it the current location counter. As a result, the assembler begins assembling into the section associated with that location counter.
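
The behavior of the location counters can be pictured with a small model. The following Python sketch is only an illustration, not the assembler's implementation; the class and method names are hypothetical. It keeps one counter per (segment, section) pair, switches the active counter when a section is designated, and records sections in order of first appearance.

```python
class Assembler:
    """Toy model: one implicit location counter per (segment, section)."""

    def __init__(self):
        self.counters = {}   # (segname, sectname) -> bytes assembled so far
        self.current = None  # the section whose counter is active

    def section(self, segname, sectname):
        # The first use of a section creates its counter at zero. Dict
        # insertion order doubles as the object-file section order.
        self.current = (segname, sectname)
        self.counters.setdefault(self.current, 0)

    def emit(self, data):
        # Each assembled byte increments the active counter by 1.
        self.counters[self.current] += len(data)


asm = Assembler()
asm.section("__TEXT", "__text")
asm.emit(b"\x00" * 8)
asm.section("__DATA", "__data")
asm.emit(b"\x01\x02\x03\x04")
asm.section("__TEXT", "__text")  # switching back resumes the old counter
asm.emit(b"\x00" * 4)

print(asm.counters)
```

Returning to (__TEXT,__text) does not reset its counter; assembly continues where that section left off.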

.section

SYNOPSIS

.section  segname , sectname [[[ , type ] , attribute ] , sizeof_stub ]

The .section directive causes the assembler to begin assembling into the section given by segname and sectname. A section created with this directive contains initialized data or instructions and is referred to as a content section. type and attribute may be specified as described under Section Types and Attributes. If type is symbol_stubs, then the sizeof_stub field must be given as the size in bytes of the symbol stubs contained in the section.

.zerofill

SYNOPSIS

.zerofill  segname , sectname [ , symbolname , size [ , align_expression ]]

The .zerofill directive causes symbolname to be created as uninitialized data in the section given by segname and sectname, with a size in bytes given by size. A power of 2 between 0 and 15 may be given for align_expression to indicate what alignment should be forced on symbolname, which is placed on the next expression boundary having the given alignment. See .align for details.
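
As a rough illustration of the placement rule, the following Python sketch computes where a zero-filled symbol would land. The function name and the idea of tracking only the section size are assumptions made for the example.

```python
def zerofill(section_size, size, align_expression=0):
    """Place a zero-filled symbol; return (symbol_offset, new_section_size)."""
    if not 0 <= align_expression <= 15:
        raise ValueError("align_expression must be between 0 and 15")
    boundary = 1 << align_expression
    # Round the current end of the section up to the next boundary.
    offset = (section_size + boundary - 1) & ~(boundary - 1)
    return offset, offset + size


# A hypothetical 8-byte symbol with 2^3 alignment in a section that
# already holds 5 bytes.
print(zerofill(5, 8, 3))
```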

Section Types and Attributes

A content section has a type, which informs the link editor about special processing needed for the items in that section. The most common form of special processing is for sections containing literals (strings, constants, and so on) where only one copy of the literal is needed in the output file and the same literal can be used by all references in the input files.

A section’s attributes record supplemental information about the section that the link editor may use in processing that section. For example, the pure_instructions attribute indicates that a section contains only valid machine instructions.

A section’s type and attribute are recorded in a Mach-O file as the flags field in the section header, using constants defined in the header file mach-o/loader.h. The following sections describe the various types and attributes by the names used to identify them in a .section directive. The name of the related constant is also given in parentheses following the identifier.
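
The following Python sketch shows one way to picture how a type and a set of attributes share the flags field. The numeric values and the split into a low 8-bit type and high 24-bit attributes are quoted from memory of mach-o/loader.h and should be treated as assumptions to check against the header. The helper names make_flags and split_flags are hypothetical.

```python
# Values as commonly listed in mach-o/loader.h; verify against your header.
SECTION_TYPE = 0x000000FF        # mask for the section type
SECTION_ATTRIBUTES = 0xFFFFFF00  # mask for the section attributes

S_REGULAR = 0x0
S_CSTRING_LITERALS = 0x2
S_SYMBOL_STUBS = 0x8
S_ATTR_PURE_INSTRUCTIONS = 0x80000000
S_ATTR_SELF_MODIFYING_CODE = 0x04000000


def make_flags(section_type, attributes=0):
    """Combine a section type with attribute bits into one flags word."""
    return (section_type & SECTION_TYPE) | (attributes & SECTION_ATTRIBUTES)


def split_flags(flags):
    """Return (type, attributes) from a section header's flags field."""
    return flags & SECTION_TYPE, flags & SECTION_ATTRIBUTES


text_flags = make_flags(S_REGULAR, S_ATTR_PURE_INSTRUCTIONS)
stub_flags = make_flags(
    S_SYMBOL_STUBS, S_ATTR_PURE_INSTRUCTIONS | S_ATTR_SELF_MODIFYING_CODE
)
print(hex(text_flags), hex(stub_flags))
```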

Type Identifiers

The following sections describe section type identifiers.

regular (S_REGULAR)

A regular section may contain any kind of data and gets no special processing from the link editor. This is the default section type. Examples of regular sections include program instructions or initialized data.

cstring_literals (S_CSTRING_LITERALS)

A cstring_literals section contains null-terminated literal C language character strings. The link editor places only one copy of each literal into the output file’s section and relocates references to different copies of the same literal to the one copy in the output file. There can be no relocation entries for a section of this type, and all references to literals in this section must be inside the address range for the specific literal being referenced. The last byte in a section of this type must be a null byte, and the strings can’t contain null bytes in their bodies. An example of a cstring_literals section is one for the literal strings that appear in the body of an ANSI C function where the compiler chooses to make such strings read only.
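
The merging of string literals can be sketched as follows. This Python model is a simplification, not the link editor's algorithm; the function name and the returned offset table are hypothetical.

```python
def merge_cstrings(*sections):
    """Merge cstring_literals contents, keeping one copy of each string.

    Returns (merged_bytes, offsets), where offsets maps each literal to
    its offset in the merged section.
    """
    merged = bytearray()
    offsets = {}
    for data in sections:
        if not data:
            continue
        if not data.endswith(b"\0"):
            raise ValueError("last byte of the section must be a null byte")
        # Drop the final terminator so split() yields no trailing entry.
        for literal in data[:-1].split(b"\0"):
            if literal not in offsets:
                offsets[literal] = len(merged)
                merged += literal + b"\0"
    return bytes(merged), offsets


# Contents of the section in two hypothetical object files.
first = b"hello\0world\0"
second = b"world\0bye\0"
merged, offsets = merge_cstrings(first, second)
print(merged, offsets)
```

References to either copy of "world" would be relocated to the single copy kept in the output.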

4byte_literals (S_4BYTE_LITERALS)

A 4byte_literals section contains 4-byte literal constants. The link editor places only one copy of each literal into the output file’s section and relocates references to different copies of the same literal to the one copy in the output file. There can be no relocation entries for a section of this type, and all references to literals in this section must be inside the address range for the specific literal being referenced. An example of a 4byte_literals section is one in which single-precision floating-point constants are stored for a RISC machine (these would normally be stored as immediates in CISC machine code).

8byte_literals (S_8BYTE_LITERALS)

An 8byte_literals section contains 8-byte literal constants. The link editor places only one copy of each literal into the output file’s section and relocates references to different copies of the same literal to the one copy in the output file. There can be no relocation entries for a section of this type, and all references to literals in this section must be inside the address range for the specific literal being referenced. An example of an 8byte_literals section is one in which double-precision floating-point constants are stored for a RISC machine (these would normally be stored as immediates in CISC machine code).
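
Merging of fixed-size literals can be pictured the same way for both 4-byte and 8-byte sections. The Python sketch below is illustrative only: the function name is hypothetical, comparison is bit-for-bit, and the big-endian packing is an assumption chosen to match PowerPC.

```python
import struct


def merge_fixed_literals(size, *sections):
    """Keep one copy of each size-byte literal; return (merged, remap).

    remap[i] lists, for input section i, the new offset of each literal.
    """
    merged = bytearray()
    seen = {}
    remap = []
    for data in sections:
        if len(data) % size:
            raise ValueError("section length must be a multiple of the literal size")
        new_offsets = []
        for start in range(0, len(data), size):
            literal = bytes(data[start:start + size])
            if literal not in seen:
                seen[literal] = len(merged)
                merged += literal
            new_offsets.append(seen[literal])
        remap.append(new_offsets)
    return bytes(merged), remap


# Double-precision constants from two hypothetical object files.
obj1 = struct.pack(">2d", 1.0, 2.5)
obj2 = struct.pack(">2d", 2.5, 3.0)
merged, remap = merge_fixed_literals(8, obj1, obj2)
print(len(merged), remap)
```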

literal_pointers (S_LITERAL_POINTERS)

A literal_pointers section contains 4-byte pointers to literals in a literal section. The link editor places only one copy of a pointer into the output file’s section for each pointer to a literal with the same contents. The link editor also relocates references to each literal pointer to the one copy in the output file. There must be exactly one relocation entry for each literal pointer in this section, and all references to literals in this section must be inside the address range for the specific literal being referenced. The relocation entries can be external relocation entries referring to undefined symbols if those symbols identify literals in another object file. An example of a literal_pointers section is one containing selector references generated by the Objective-C compiler.

symbol_stubs (S_SYMBOL_STUBS)

A symbol_stubs section contains symbol stubs, which are sequences of machine instructions (all the same size) used for lazily binding undefined function calls at runtime. If a call to an undefined function is made, the compiler outputs a call to a symbol stub instead, and tags the stub with an indirect symbol that indicates what symbol the stub is for. On transfer to a symbol stub, a program executes instructions that eventually reach the code for the indirect symbol associated with that stub. Here’s a sample of assembly code based on a function func() containing only a call to the undefined function foo():

    .text
    .align 2
    .globl _func
 _func:
    b L_foo$stub
    .symbol_stub
 L_foo$stub:                            ;
    .indirect_symbol _foo               ;
    lis r11,ha16(L_foo$lazy_ptr)        ;
    lwz r12,lo16(L_foo$lazy_ptr)(r11)   ; the symbol stub
    mtctr r12                           ;
    addi r11,r11,lo16(L_foo$lazy_ptr)   ;
    bctr                                ;
    .lazy_symbol_pointer
 L_foo$lazy_ptr:                        ;
    .indirect_symbol _foo               ; the symbol pointer
    .long dyld_stub_binding_helper      ; to be replaced by _foo's address

The symbol-stub sections in the IA-32 architecture—instead of using a stub and a lazy pointer—use one branch instruction that specifies the target. This is the corresponding IA-32 assembly code:

    .text
    .align 2
    .globl _func
_func:
    pushl   %ebp
    movl    %esp, %ebp
    subl    $8, %esp
    call    L_foo$stub
    leave
    ret
    .symbol_stub
L_foo$stub:
     .indirect_symbol _foo
     hlt ; hlt ; hlt ; hlt ; hlt

In the assembly code, _func branches to L_foo$stub, which is responsible for finding the definition of the function foo(). On PPC (and PPC64), L_foo$stub jumps to the contents of L_foo$lazy_ptr. This value is initially the address of the dyld_stub_binding_helper code, which, when executed, overwrites the contents of L_foo$lazy_ptr with the address of the real function, _foo, and then jumps to _foo.

On IA-32, the branch instruction points to the dynamic linker. The first time the stub is called, the dynamic linker modifies the instruction so that it jumps to the real function in subsequent calls.
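
The PowerPC lazy-binding sequence can be imitated in a few lines of Python. This is a conceptual model only: the dictionary stands in for the lazy pointer, and binding_helper stands in for dyld_stub_binding_helper. All names in the sketch are hypothetical.

```python
def foo():
    return "foo called"


DEFINITIONS = {"_foo": foo}  # symbols the dynamic linker can resolve
lazy_ptr = {}                # plays the role of L_foo$lazy_ptr
bind_count = 0


def binding_helper(symbol):
    """Stand-in for the binding helper: bind, then jump to the target."""
    global bind_count
    bind_count += 1
    target = DEFINITIONS[symbol]
    lazy_ptr[symbol] = target  # overwrite the lazy pointer
    return target()


def stub(symbol):
    """Stand-in for L_foo$stub: jump through the lazy pointer."""
    return lazy_ptr[symbol]()


# Initially the lazy pointer refers to the helper, not to _foo.
lazy_ptr["_foo"] = lambda: binding_helper("_foo")

first = stub("_foo")   # goes through the helper, which binds the pointer
second = stub("_foo")  # goes straight to foo
print(first, second, bind_count)
```

Only the first call pays the binding cost; later calls go through the same stub but reach the function directly.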

The indirect symbol entries for _foo provide information to the static and dynamic linkers for binding the symbol stub. Each symbol stub and lazy pointer entry must have exactly one such indirect symbol, associated with the first address in the stub or pointer entry. See .indirect_symbol for more information.

The static link editor places only one copy of each stub into the output file’s section for a particular indirect symbol, and relocates all references to the stubs with the same indirect symbol to the stub in the output file. Further, the static link editor eliminates a stub if it determines that the target is in the same linkage unit and doesn’t need redirecting at runtime. No global symbols can be defined in symbol_stubs sections.

On PPC, the stub can refer only to itself, one lazy symbol pointer (referring to the same indirect symbol as the stub), and the dyld_stub_binding_helper() function.

lazy_symbol_pointers (S_LAZY_SYMBOL_POINTERS)

A lazy_symbol_pointers section contains 4-byte symbol pointers that eventually contain the value of the indirect symbol associated with the pointer. These pointers are used by symbol stubs to lazily bind undefined function calls at runtime. A lazy symbol pointer initially contains an address in the symbol stub of instructions that cause the symbol pointer to be bound to the function definition (in the example in symbol_stubs (S_SYMBOL_STUBS), the lazy pointer L_foo$lazy_ptr initially contains the address for dyld_stub_binding_helper but gets overwritten with the address for _foo). The dynamic link editor binds the indirect symbol associated with the lazy symbol pointer by overwriting it with the value of the symbol.

The static link editor places a copy of a lazy pointer in the output file only if the corresponding symbol stub is in the output file. Only the corresponding symbol stub can make a reference to a lazy symbol pointer, and no global symbols can be defined in this type of section. There must be one indirect symbol associated with each lazy symbol pointer. An example of a lazy_symbol_pointers section is one in which the compiler has generated calls to undefined functions, each of which can be bound lazily at the time of the first call to the function.

non_lazy_symbol_pointers (S_NON_LAZY_SYMBOL_POINTERS)

A non_lazy_symbol_pointers section contains 4-byte symbol pointers that contain the value of the indirect symbol associated with a pointer that may be set at any time before any code makes a reference to it. These pointers are used by the code to reference undefined symbols. Initially these pointers have no interesting value but get overwritten by the dynamic link editor with the value of the symbol for the associated indirect symbol before any code can make a reference to it.

The static link editor places only one copy of each non-lazy pointer for its indirect symbol into the output file and relocates all references to the pointer with the same indirect symbol to the pointer in the output file. The static link editor further can fill in the pointer with the value of the symbol if a definition of the indirect symbol for that pointer is present in the output file. No global symbols can be defined in this type of section. There must be one indirect symbol associated with each non-lazy symbol pointer. An example of a non_lazy_symbol_pointers section is one in which the compiler has generated code to indirectly reference undefined symbols to be bound at runtime—this preserves the sharing of the machine instructions by allowing the dynamic link editor to update references without writing on the instructions.

Here's an example of assembly code referencing an element in an undefined structure. The corresponding C code would be:

    struct s {
        int member1, member2;
    };
    extern struct s bar;
    int func()
    {
        return(bar.member2);
    }
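
As a side check on the structure layout: the code generated for func() has to load member2 from a fixed displacement past the start of bar. A minimal Python sketch using ctypes computes that displacement for the host platform. The class S is a hypothetical mirror of struct s; where int is 4 bytes the displacement is 4.

```python
import ctypes


class S(ctypes.Structure):
    # Mirrors: struct s { int member1, member2; };
    _fields_ = [("member1", ctypes.c_int), ("member2", ctypes.c_int)]


# Offset of member2 from the start of the structure, in bytes.
member2_offset = S.member2.offset
print(member2_offset)
```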

The PowerPC assembly code might look like this:

    .text
    .align 2
    .globl _func
 _func:
    lis r3,ha16(L_bar$non_lazy_ptr)
    lwz r2,lo16(L_bar$non_lazy_ptr)(r3)
    lwz r3,4(r2)
    blr
    .non_lazy_symbol_pointer
 L_bar$non_lazy_ptr:
    .indirect_symbol _bar
    .long 0

mod_init_funcs (S_MOD_INIT_FUNC_POINTERS)

A mod_init_funcs section contains 4-byte pointers to functions that are to be called just after the module containing the pointer is bound into the program by the dynamic link editor. The static link editor does no special processing for this section type except for disallowing section ordering. This is done to maintain the order the functions are called (which is the order their pointers appear in the original module). There must be exactly one relocation entry for each pointer in this section. An example of a mod_init_funcs section is one in which the compiler has generated code to call C++ constructors for modules that get dynamically bound at runtime.

mod_term_funcs (S_MOD_TERM_FUNC_POINTERS)

A mod_term_funcs section contains 4-byte pointers to functions that are to be called just before the module containing the pointer is unloaded by the dynamic link editor or the program is terminated. The static link editor does no special processing for this section type except for disallowing section ordering. This is done to maintain the order the functions are called (which is the order their pointers appear in the original module). There must be exactly one relocation entry for each pointer in this section. An example of a mod_term_funcs section is one in which the compiler has generated code to call C++ destructors for modules that get dynamically bound at runtime.

coalesced (S_COALESCED)

A coalesced section can contain any instructions or data and is used when the same symbol may be defined in more than one of the object files being linked together. The static link editor keeps the data associated with the coalesced symbol from the first object file it links and silently discards the data from other object files. An example of a coalesced section is one in which the compiler has generated code for implicit instantiations of C++ templates.
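
The "first definition wins" rule can be sketched as follows. This Python model is illustrative; the function name, the file names, and the symbol name are hypothetical.

```python
def link_coalesced(object_files):
    """Keep the first definition of each coalesced symbol.

    object_files is a list of (filename, {symbol: data}) pairs in link order.
    """
    kept = {}
    for filename, symbols in object_files:
        for symbol, data in symbols.items():
            # setdefault stores a value only if the symbol is not yet present,
            # so later definitions are silently discarded.
            kept.setdefault(symbol, (filename, data))
    return kept


inputs = [
    ("a.o", {"_max_int": b"\x01\x02"}),
    ("b.o", {"_max_int": b"\x01\x02", "_helper": b"\x03"}),
]
result = link_coalesced(inputs)
print(result)
```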

Attribute Identifiers

The following sections describe attribute identifiers.

none (0)

No attributes for this section. This is the default section attribute.

S_ATTR_SOME_INSTRUCTIONS

This attribute is set by the assembler whenever it assembles a machine instruction in a section. There is no directive associated with it, since you cannot set it yourself. The dynamic link editor uses it together with S_ATTR_EXT_RELOC and S_ATTR_LOC_RELOC, which are set by the static link editor, to determine that it must flush the cache and perform other processor-related functions when it relocates instructions by writing on them.

no_dead_strip (S_ATTR_NO_DEAD_STRIP)

The no_dead_strip section attribute specifies that a particular section must not be dead-stripped. See Directives for Dead-Code Stripping for more information.

no_toc (S_ATTR_NO_TOC)

The no_toc section attribute means that the global symbols in this section are not to be used in the table of contents of a static library as produced by the program ranlib(1). This is normally used with a coalesced section when it is expected that each object file has a definition of the symbols that it uses.

live_support (S_ATTR_LIVE_SUPPORT)

The live_support section attribute specifies that a section’s blocks must not be dead-stripped if they reference code that is live, but the reference is undetectable. See Directives for Dead-Code Stripping for more information.

pure_instructions (S_ATTR_PURE_INSTRUCTIONS)

The pure_instructions attribute means that this section contains nothing but machine instructions. This attribute would be used for the (__TEXT,__text) section of OS X compilers and sections that have a section type of symbol_stubs.

strip_static_syms (S_ATTR_STRIP_STATIC_SYMS)

The strip_static_syms section attribute means that the static symbols in this section can be stripped from linked images that are used with the dynamic linker when debugging symbols are also stripped. This is normally used with a coalesced section that has private extern symbols, so that after linking, once the private extern symbols have been turned into static symbols, they can be stripped to save space in the linked image.

self_modifying_code (S_ATTR_SELF_MODIFYING_CODE)

The self_modifying_code section attribute identifies a section with code that can be modified by the dynamic linker. For example, IA-32 symbol stubs are implemented as branch instructions that initially point to the dynamic linker but are modified by the dynamic linker to point to the real symbol.

Built-in Directives

The directives described here are simply built-in equivalents for .section directives with specific arguments.

Designating Sections in the __TEXT Segment

The directives listed below cause the assembler to begin assembling into the indicated section of the __TEXT segment. Note that the underscore before __TEXT, __text, and the rest of the segment names is actually two underscore characters.

    Directive           Section
    .text               (__TEXT,__text)
    .const              (__TEXT,__const)
    .static_const       (__TEXT,__static_const)
    .cstring            (__TEXT,__cstring)
    .literal4           (__TEXT,__literal4)
    .literal8           (__TEXT,__literal8)
    .literal16          (__TEXT,__literal16)
    .constructor        (__TEXT,__constructor)
    .destructor         (__TEXT,__destructor)
    .fvmlib_init0       (__TEXT,__fvmlib_init0)
    .fvmlib_init1       (__TEXT,__fvmlib_init1)
    .symbol_stub        (__TEXT,__symbol_stub1 or __TEXT,__jump_table)
    .picsymbol_stub     (__TEXT, __picsymbolstub1 or __TEXT, __picsymbol_stub)

The following sections describe the sections in the __TEXT segment and the types of information that should be assembled into each of them.

.text

This is equivalent to .section __TEXT,__text,regular,pure_instructions when the default -dynamic flag is in effect and equivalent to .section __TEXT,__text,regular when the -static flag is specified.

The compiler places only machine instructions in the (__TEXT,__text) section (no read-only data, jump tables, or anything else). As a result, the entire (__TEXT,__text) section is pure instructions, and tools that operate on object files, as well as the runtime, can take advantage of this to locate the instructions of the program without being confused by data that could have been mixed in. To make this work, all runtime support code linked into the program must also obey this rule (all OS X library code follows this rule).

.const

This is equivalent to .section __TEXT,__const

The compiler places all data declared const and all jump tables it generates for switch statements in this section.

.static_const

This is equivalent to .section __TEXT,__static_const

This is not currently used by the compiler. It was added to the assembler so that the compiler could separate global and static const data into separate sections if it wished to.

.cstring

This is equivalent to .section __TEXT,__cstring, cstring_literals

This section is marked with the section type cstring_literals, which the link editor recognizes. The link editor merges the like literal C strings in all the input object files to one unique C string in the output file. Therefore this section must contain only C strings (a C string is a sequence of bytes that ends in a null byte, \0, and does not contain any other null bytes except its terminator). The compiler places literal C strings found in the code that are not initializers and do not contain any embedded nulls in this section.

.literal4

This is equivalent to .section __TEXT,__literal4,4byte_literals

This section is marked with the section type 4byte_literals, which the link editor recognizes. The link editor can then merge the like 4-byte literals in all the input object files to one unique 4-byte literal in the output file. Therefore, this section must contain only 4-byte literals. This is typically intended for single-precision floating-point constants, and the compiler uses this section for that purpose. On some architectures it is more efficient to place these constants in line as immediates as part of the instruction.

.literal8

This is equivalent to .section __TEXT,__literal8,8byte_literals

This section is marked with the section type 8byte_literals, which the link editor recognizes. The link editor can then merge the like 8-byte literals in all the input object files to one unique 8-byte literal in the output file. Therefore, this section must contain only 8-byte literals. This is typically intended for double-precision floating-point constants, and the compiler uses this section for that purpose. On some architectures it is more efficient to place these constants in line as immediates as part of the instruction.

.literal16

This is equivalent to .section __TEXT,__literal16,16byte_literals

This section is marked with the section type 16byte_literals, which the link editor recognizes. The link editor can then merge the like 16-byte literals in all the input object files to one unique 16-byte literal in the output file. Therefore, this section must contain only 16-byte literals. This is typically intended for vector constants, and the compiler uses this section for that purpose.

.constructor

This is equivalent to .section __TEXT,__constructor

.destructor

This is equivalent to .section __TEXT,__destructor

The .constructor and .destructor sections are used by the C++ runtime system, and are reserved exclusively for the C++ compiler.

.fvmlib_init0

This is equivalent to .section __TEXT,__fvmlib_init0

.fvmlib_init1

This is equivalent to .section __TEXT,__fvmlib_init1

The .fvmlib_init0 and .fvmlib_init1 sections are used by the obsolete fixed virtual memory shared library initialization. The compiler doesn't place anything in these sections, as they are reserved exclusively for the obsolete shared library mechanism.

.symbol_stub

This section is of type symbol_stubs and has the attribute pure_instructions. The compiler places symbol stubs in this section for undefined functions that are called in the module. This is the standard symbol stub section for non-position-independent code.

Symbol stubs are implemented differently on PPC (and PPC64) and on IA-32. The following sections describe each implementation.

PowerPC .symbol_stub

On PowerPC (PPC and PPC64), .symbol_stub is equivalent to .section __TEXT,__symbol_stub1, symbol_stubs, pure_instructions, 20.

The standard symbol stub on PPC and PPC64 is 20 bytes and has an alignment of 4 bytes (.align 2). For example, a stub for the symbol _foo would be (using a lazy symbol pointer L_foo$lazy_ptr):

        .symbol_stub
L_foo$stub:
        .indirect_symbol _foo
        lis     r11,ha16(L_foo$lazy_ptr)
        lwz     r12,lo16(L_foo$lazy_ptr)(r11)
        mtctr   r12
        addi    r11,r11,lo16(L_foo$lazy_ptr)
        bctr
 
        .lazy_symbol_pointer
L_foo$lazy_ptr:
        .indirect_symbol _foo
        .long   dyld_stub_binding_helper

IA-32 symbol stubs

On IA-32, symbol stubs directly use .section __IMPORT,__jump_table, symbol_stubs, self_modifying_code + pure_instructions, 5. The built-in directive .symbol_stub is no longer used.

On IA-32 this section has an additional attribute, self_modifying_code, which specifies that the code in this section can be modified at runtime. At runtime, the dynamic linker uses this feature in IA-32 stubs to change the branch instruction in the stub so that it jumps to the real symbol instead of its initial target, the dynamic linker itself. This is an example of a symbol stub for the _foo symbol:

        .section __IMPORT,__jump_table,symbol_stubs,self_modifying_code+pure_instructions,5
L_foo$stub:
        .indirect_symbol _foo
        hlt ; hlt ; hlt ; hlt ; hlt

.picsymbol_stub

On PowerPC, this directive translates to .section __TEXT, __picsymbolstub1, symbol_stubs, pure_instructions, NBYTES.

This section is of type symbol_stubs and has the attribute pure_instructions. The compiler places symbol stubs in this section for undefined functions that are called in the module. This is the standard symbol stub section for position-independent code. The value of NBYTES is dependent on the target architecture.

The standard position-independent symbol stub for the PowerPC is 36 bytes and has an alignment of 4 bytes (.align 2). For example, a stub for the symbol _foo would be (using a lazy symbol pointer L_foo$lazy_ptr):

    .picsymbol_stub
 L_foo$stub:
    .indirect_symbol _foo
    mflr r0
    bcl 20,31,L0$_foo
 L0$_foo:
    mflr r11
    addis r11,r11,ha16(L_foo$lazy_ptr - L0$_foo)
    mtlr r0
    lwz r12,lo16(L_foo$lazy_ptr - L0$_foo)(r11)
    mtctr r12
    addi r11,r11,lo16(L_foo$lazy_ptr - L0$_foo)
    bctr

Designating Sections in the __DATA Segment

These directives cause the assembler to begin assembling into the indicated section of the __DATA segment:

    Directive                   Section
    .data                       (__DATA,__data)
    .static_data                (__DATA,__static_data)
    .non_lazy_symbol_pointer    (__DATA,__nl_symbol_ptr)
    .lazy_symbol_pointer        (__DATA,__la_symbol_ptr)
    .dyld                       (__DATA,__dyld)
    .mod_init_func              (__DATA,__mod_init_func)
    .mod_term_func              (__DATA,__mod_term_func)
    .const_data                 (__DATA,__const)

The following sections describe the sections in the __DATA segment and the types of information that should be assembled into each of them.

.data

This is equivalent to .section __DATA, __data

The compiler places all non-const initialized data (even initialized to zero) in this section.

.static_data

This is equivalent to .section __DATA, __static_data

This is not currently used by the compiler. It was added to the assembler so that the compiler could separate global and static data symbols into separate sections if it wished to.

.const_data

This is equivalent to .section __DATA, __const, regular.

This section is of type regular and has no attributes. This section is used when dynamic code is being compiled for const data that must be initialized.

.lazy_symbol_pointer

This is equivalent to .section __DATA, __la_symbol_ptr,lazy_symbol_pointers

This section is of type lazy_symbol_pointers and has no attributes. The compiler places a lazy symbol pointer in this section for each symbol stub it creates for undefined functions that are called in the module. (See .symbol_stub for examples.) This section has an alignment of 4 bytes (.align 2).

.non_lazy_symbol_pointer

This is equivalent to .section __DATA, __nl_symbol_ptr,non_lazy_symbol_pointers

This section is of type non_lazy_symbol_pointers and has no attributes. The compiler places a non-lazy symbol pointer in this section for each undefined symbol referenced by the module (except for function calls). This section has an alignment of 4 bytes (.align 2).

.mod_init_func

This is equivalent to .section __DATA, __mod_init_func, mod_init_funcs

This section is of type mod_init_funcs and has no attributes. The C++ compiler places a pointer to a function in this section for each function it creates to call the constructors (if the module has them).

.mod_term_func

This is equivalent to .section __DATA, __mod_term_func, mod_term_funcs

This section is of type mod_term_funcs and has no attributes. The C++ compiler places a pointer to a function in this section for each function it creates to call the destructors (if the module has them).

.dyld

This is equivalent to .section __DATA, __dyld,regular

This section is of type regular and has no attributes. This section is used by the dynamic link editor. The compiler doesn’t place anything in this section, as it is reserved exclusively for the dynamic link editor.

Designating Sections in the __OBJC Segment

These directives cause the assembler to begin assembling into the indicated section of the __OBJC segment (or the __TEXT segment):

    Directive               Section
    .objc_class             (__OBJC,__class)
    .objc_meta_class        (__OBJC,__meta_class)
    .objc_cat_cls_meth      (__OBJC,__cat_cls_meth)
    .objc_cat_inst_meth     (__OBJC,__cat_inst_meth)
    .objc_protocol          (__OBJC,__protocol)
    .objc_string_object     (__OBJC,__string_object)
    .objc_cls_meth          (__OBJC,__cls_meth)
    .objc_inst_meth         (__OBJC,__inst_meth)
    .objc_cls_refs          (__OBJC,__cls_refs)
    .objc_message_refs      (__OBJC,__message_refs)
    .objc_symbols           (__OBJC,__symbols)
    .objc_category          (__OBJC,__category)
    .objc_class_vars        (__OBJC,__class_vars)
    .objc_instance_vars     (__OBJC,__instance_vars)
    .objc_module_info       (__OBJC,__module_info)
    .objc_class_names       (__TEXT,__cstring)
    .objc_meth_var_types    (__TEXT,__cstring)
    .objc_meth_var_names    (__TEXT,__cstring)
    .objc_selector_strs     (__OBJC,__selector_strs)

All sections in the __OBJC segment, including old sections that are no longer used and future sections that may be added, are exclusively reserved for the Objective-C compiler’s use.

Directives for Moving the Location Counter

This section describes directives that advance the location counter to a location higher in memory. They have the additional effect of setting the intervening memory to some value.

.align

SYNOPSIS

.align    align_expression [ , 1byte_fill_expression [,max_bytes_to_fill]]
.p2align  align_expression [ , 1byte_fill_expression [,max_bytes_to_fill]]
.p2alignw align_expression [ , 2byte_fill_expression [,max_bytes_to_fill]]
.p2alignl align_expression [ , 4byte_fill_expression [,max_bytes_to_fill]]
.align32  align_expression [ , 4byte_fill_expression [,max_bytes_to_fill]]

The align directives advance the location counter to the next align_expression boundary, if it isn't currently on such a boundary. align_expression is a power of 2, given as an exponent between 0 and 15 (for example, .align 3 requests 2^3, or 8-byte, alignment). The fill expression, if specified, must be absolute. The space between the current value of the location counter and the desired value is filled with the fill expression (or with zeros, if no fill expression is specified). For the 2-byte and 4-byte forms, the space from the current value of the location counter up to the next multiple of the fill expression's width is filled with zeros first; the fill expression is then used until the desired alignment is reached. max_bytes_to_fill is the maximum number of bytes that the align directive is allowed to fill. If the alignment can't be reached in max_bytes_to_fill bytes or fewer, the directive has no effect. If there is no fill expression and the section has the pure_instructions attribute, or contains some instructions, the nop opcode is used as the fill expression.

EXAMPLE

.align 3
one:    .double 0r1.0
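
The padding rules can be summarized in a short Python sketch. It is a model under stated assumptions, not the assembler's code: the function name and parameters are hypothetical, multi-byte fill values are laid out big-endian (as on PowerPC), and the nop fill used for instruction sections is not modeled.

```python
def align_padding(location, align_expression, fill=0, fill_width=1,
                  max_bytes_to_fill=None, byteorder="big"):
    """Return the padding bytes an align directive would emit at location."""
    if not 0 <= align_expression <= 15:
        raise ValueError("align_expression must be between 0 and 15")
    boundary = 1 << align_expression
    needed = -location % boundary  # bytes to the next boundary
    if max_bytes_to_fill is not None and needed > max_bytes_to_fill:
        return b""                 # the directive has no effect
    # Zeros first, up to a multiple of the fill width ...
    lead = min(-location % fill_width, needed)
    pattern = (fill % (1 << (8 * fill_width))).to_bytes(fill_width, byteorder)
    # ... then whole copies of the fill pattern up to the boundary.
    body = pattern * ((needed - lead) // fill_width)
    return b"\0" * lead + body


print(align_padding(5, 3))                               # like .align 3
print(align_padding(5, 3, fill=0x1234, fill_width=2))    # like .p2alignw 3,0x1234
print(align_padding(5, 4, max_bytes_to_fill=4))          # limit exceeded
```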

.org

SYNOPSIS

.org  expression [ , fill_expression ]

The .org directive sets the location counter to expression, which must be a currently known absolute expression. This directive can only move the location counter up in address. The fill expression, if specified, must be absolute. The space between the current value of the location counter and the desired value is filled with the low-order byte of the fill expression (or with zeros, if fill_expression isn’t specified).

EXAMPLE

.org 0x100,0xff
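
The effect of the example can be modeled in Python by treating the section as a growing byte array. The function name is hypothetical, and allowing an expression equal to the current location (a no-op) is an assumption of the sketch.

```python
def org(section, expression, fill=0):
    """Advance a bytearray section to offset expression, as .org does."""
    if expression < len(section):
        raise ValueError(".org can only move the location counter up")
    # Only the low-order byte of the fill expression is used.
    section.extend(bytes([fill & 0xFF]) * (expression - len(section)))
    return section


code = bytearray(b"\x01\x02\x03\x04")
org(code, 0x100, 0xFF)
print(len(code), code[4:8])
```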

Directives for Generating Data

The directives described in this section generate data. (Unless specified otherwise, the data goes into the current section.) In some respects, they are similar to the directives explained in Directives for Moving the Location Counter—they do have the effect of moving the location counter—but this isn’t their primary purpose.

.ascii and .asciz

SYNOPSIS

.ascii  [ “string” ] [ , “string” ] ...
.asciz  [ “string” ] [ , “string” ] ...

These directives translate character strings into their ASCII equivalents for use in the source program. Each directive takes zero or more comma-separated strings surrounded by quotation marks. Each string can contain any character or escape sequence that can appear in a character string; the newline character cannot appear, but it can be represented by the escape sequence \012 or \n:

  • The .ascii directive generates a sequence of ASCII characters.

  • The .asciz directive is similar to the .ascii directive, except that it automatically terminates the sequence of ASCII characters with the null character (\0), necessary when generating strings usable by C programs.

If no strings are specified, the directive is ignored.

EXAMPLE

.ascii "Can't open the DSP.\0"
.asciz "%s has changes.\tSave them?"

.byte, .short, .long, and .quad

SYNOPSIS

.byte  [ expression ] [ , expression ] ...
.short  [ expression ] [ , expression ] ...
.long  [ expression ] [ , expression ] ...
.quad [ expression ] [ , expression ] ...

These directives reserve storage locations in the current section and initialize them with specified values. Each directive takes zero or more comma-separated absolute expressions and generates a sequence of bytes for each expression. The expressions are truncated to the size generated by the directive:

  • .byte generates 1 byte per expression.

  • .short generates 2 bytes per expression.

  • .long generates 4 bytes per expression.

  • .quad generates 8 bytes per expression.

EXAMPLE

.byte  74,0112,0x4A,0x4a,'J                             | the same byte
.short 64206,0175316,0xface                             | the same short
.long  -1234,037777775456,0xfffffb2e                    | the same long
.quad  -1234,01777777777777777775456,0xfffffffffffffb2e | the same quad

.comm

SYNOPSIS

.comm  name, size

The .comm directive creates a common symbol named name of size bytes. If the symbol isn’t defined elsewhere, its type is “common.”

The link editor allocates storage for common symbols that aren’t otherwise defined. Enough space is left after the symbol to hold the maximum size (in bytes) seen for each symbol in the (__DATA,__common) section.

The link editor aligns each such symbol (based on its size aligned to the next greater power of two) to the maximum alignment of the (__DATA,__common) section. For information about how to change the maximum alignment, see the description of -sectalign in the ld(1) OS X man page.

EXAMPLE

.comm _global_uninitialized,4

.fill

SYNOPSIS

.fill repeat_expression , fill_size , fill_expression

The .fill directive advances the location counter by repeat_expression times fill_size bytes:

  • fill_size is in bytes, and must have the value 1, 2, or 4

  • repeat_expression must be an absolute expression greater than zero

  • fill_expression may be any absolute expression (it gets truncated to the fill size)

EXAMPLE

.fill 69,4,0xfeadface   | put out 69 0xfeadface’s
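
To make the size arithmetic and truncation rule concrete, here is a small sketch with hypothetical labels. Comments use a leading #, which is assumed to be accepted by the target assembler.

```asm
        .data
# Three repetitions of a 2-byte unit advance the location counter by
# 3 * 2 = 6 bytes; each unit holds 0x1234.
halfword_table:
        .fill   3,2,0x1234
# Eight repetitions of a 1-byte unit advance it by 8 bytes. The value 0x1ff
# is truncated to the fill size, so each byte holds 0xff.
byte_padding:
        .fill   8,1,0x1ff
```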

.lcomm

SYNOPSIS

.lcomm  name, size [ , align ]

The .lcomm directive creates a symbol named name of size bytes in the (__DATA,__bss) section. It contains zeros at execution. The name isn’t declared as global, and hence is unknown outside the object module.

The optional align expression, if specified, causes the location counter to be rounded up to an align power-of-two boundary before assigning the location counter to the value of name.

EXAMPLE

.lcomm abyte,1     | or:  .lcomm abyte,1,0
.lcomm padding,7
.lcomm adouble,8   | or:  .lcomm adouble,8,3

These are the same as:

.zerofill __DATA,__bss,abyte,1
.zerofill __DATA,__bss,padding,7
.zerofill __DATA,__bss,adouble,8

.single and .double

SYNOPSIS

.single  [ number ] [ , number ] ...
.double  [ number ] [ , number ] ...

These directives reserve storage locations in the current section and initialize them with specified values. Each directive takes zero or more comma-separated decimal floating-point numbers:

  • .single takes IEEE single-precision floating point numbers. It reserves 4 bytes for each number and initializes them to the value of the corresponding number.

  • .double takes IEEE double-precision floating point numbers. It reserves 8 bytes for each number and initializes them to the value of the corresponding number.

EXAMPLE

.single 3.33333333333333310000e-01
.double 0.00000000000000000000e+00
.single +Infinity
.double -Infinity
.single NaN

.space

SYNOPSIS

.space  num_bytes [ , fill_expression ]

The .space directive advances the location counter by num_bytes, where num_bytes is an absolute expression greater than zero. The fill expression, if specified, must be absolute. The space between the current value of the location counter and the desired value is filled with the low-order byte of the fill expression (or with zeros, if fill_expression isn’t specified).

EXAMPLE

ten_ones:
          .space 10,1

Directives for Dealing With Symbols

This section describes directives that have an effect on symbols and the symbol table.

.globl

SYNOPSIS

.globl  symbol_name

The .globl directive makes symbol_name external. If symbol_name is otherwise defined (by .set or by appearance as a label), it acts within the assembly exactly as if the .globl statement was not given; however, the link editor may be used to combine this object module with other modules referring to this symbol.

EXAMPLE

.globl abs
      .set abs,1
 
      .globl var
var:  .long 2

.indirect_symbol

SYNOPSIS

.indirect_symbol symbol_name

The .indirect_symbol directive creates an indirect symbol with symbol_name and associates the current location with the indirect symbol. An indirect symbol must be defined immediately before each item in a symbol_stubs, lazy_symbol_pointers, or non_lazy_symbol_pointers section. The static and dynamic linkers use symbol_name to identify the symbol associated with the item following the directive.
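
EXAMPLE

The following is a sketch of a non-lazy symbol pointer section, in the style compilers typically emit. It assumes a 32-bit target (so each pointer is a .long); the section name __nl_symbol_ptr and the symbols _foo and _bar are illustrative rather than required by the directive. Comments use a leading #, which is assumed to be accepted by the target assembler.

```asm
        .section __DATA,__nl_symbol_ptr,non_lazy_symbol_pointers
# Each pointer-sized item is immediately preceded by the indirect symbol
# it refers to.
L_foo$non_lazy_ptr:
        .indirect_symbol _foo
        .long   0
L_bar$non_lazy_ptr:
        .indirect_symbol _bar
        .long   0
```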

.reference

SYNOPSIS

.reference  symbol_name

The .reference directive causes symbol_name to be an undefined symbol present in the output file’s symbol table. This is useful in referencing a symbol without generating any bytes to do it (used, for example, by the Objective-C runtime system to reference superclass objects).

EXAMPLE

.reference  .objc_class_name_Object

.weak_reference

SYNOPSIS

.weak_reference  symbol_name

The .weak_reference directive causes symbol_name to be a weak undefined symbol present in the output file’s symbol table. This is used by the compiler when referencing a symbol with the weak_import attribute.

EXAMPLE

.weak_reference  .objc_class_name_Object

.lazy_reference

SYNOPSIS

.lazy_reference  symbol_name

The .lazy_reference directive causes symbol_name to be a lazy undefined symbol present in the output file’s symbol table. This is useful when referencing a symbol without generating any bytes to do it (used, for example, by the Objective-C runtime system with the dynamic linker to reference superclass objects but allow the runtime to bind them on first use).

EXAMPLE

.lazy_reference  .objc_class_name_Object

.weak_definition

SYNOPSIS

.weak_definition symbol_name

The .weak_definition directive causes symbol_name to be a weak definition. symbol_name can be defined only in a coalesced section. This is used by the C++ compiler to support template instantiation. The compiler uses a coalesced section with the .weak_definition directive for implicitly instantiated templates, and a regular section (.text, .data, and so on) for an explicit template instantiation.
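
EXAMPLE

A minimal sketch of a weak definition in a coalesced section follows. The section name __textcoal_nt and the symbol _shared_helper are assumptions chosen for illustration; comments use a leading #, which is assumed to be accepted by the target assembler.

```asm
        .section __TEXT,__textcoal_nt,coalesced,pure_instructions
# Make the symbol external, then mark its definition as weak.
        .globl  _shared_helper
        .weak_definition _shared_helper
_shared_helper:
        nop
```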

.private_extern

SYNOPSIS

.private_extern symbol_name

The .private_extern directive makes symbol_name a private external symbol. When the link editor combines this module with other modules (and the -keep_private_externs command-line option is not specified), it turns the symbol from global to static. If both the .private_extern and .globl assembler directives are used on the same symbol, the effect is as if only the .private_extern directive were used.
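
EXAMPLE

The sketch below marks a hypothetical routine, _internal_helper, as private external. Comments use a leading #, which is assumed to be accepted by the target assembler.

```asm
        .text
# Other object files can refer to this symbol when they are linked
# together, but the link editor makes it static in its output unless
# -keep_private_externs is given.
        .private_extern _internal_helper
_internal_helper:
        nop
```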

.stabs, .stabn, and .stabd

SYNOPSIS

.stabs  n_name , n_type , n_other , n_desc , n_value
.stabn  n_type , n_other , n_desc , n_value
.stabd  n_type , n_other , n_desc

These directives are used to place symbols in the symbol table for the symbolic debugger (a “stab” is a symbol table entry).

  • .stabs specifies all the fields in a symbol table entry. n_name is the name of a symbol; if the symbol name is null, the .stabn directive may be used instead.

  • .stabn is similar to .stabs, except that it uses a NULL ("") name.

  • .stabd is similar to .stabn, except that it uses the value of the location counter (.) as the n_value field.

In each case, the n_type field is assumed to contain a 4.3BSD-like value for the N_TYPE bits (defined in mach-o/stab.h). For .stabs and .stabn, the n_sect field of the Mach-O file’s nlist is set to the section number of the symbol for the specified n_value parameter. For .stabd, the n_sect field is set to the current section number for the location counter. The nlist structure is defined in mach-o/nlist.h.

EXAMPLE

.stabs  "hello.c",100,0,0,Ltext
.stabn  192,0,0,LBB2
.stabd  68,0,15

.desc

SYNOPSIS

.desc  symbol_name , absolute_expression

The .desc directive sets the n_desc field of the specified symbol to absolute_expression.

EXAMPLE

.desc _environ, 0x10 ; set the REFERENCED_DYNAMICALLY bit

.set

SYNOPSIS

.set  symbol_name , absolute_expression

The .set directive creates the symbol symbol_name and sets its value to absolute_expression. This is the same as using symbol_name=absolute_expression.

EXAMPLE

.set one,1
two = 2

.lsym

SYNOPSIS

.lsym  symbol_name , expression

The .lsym directive creates a unique and otherwise unreferenceable symbol for the symbol_name, expression pair in the symbol table. The symbol created is a static symbol with a type of absolute (N_ABS). Some Fortran 77 compilers use this mechanism to communicate with the debugger.

Directives for Dead-Code Stripping

Dead-code stripping is the process by which the static link editor removes unused code and data blocks from executable files. This process helps reduce the overall size of executables, which in turn improves performance by reducing the memory footprint of the executable. It also allows programs to link successfully in the situation where unused code refers to an undefined symbol, something that would normally result in a link error. For more information on dead-code stripping, see Linking in Xcode 2.1 User Guide.

The following sections describe the dead-code stripping directives.

.subsections_via_symbols

SYNOPSIS

.subsections_via_symbols

The .subsections_via_symbols directive tells the static link editor that the sections of the object file can be divided into individual blocks. These blocks are then stripped if they are not used by other code. This directive applies to all section declarations in the assembly file and should be placed outside any section declarations, as shown here:

.subsections_via_symbols
 
; Section declarations...

When using this directive, ensure that each symbol in the section is at the beginning of a block of code. Implicit dependencies between blocks of code may result in the removal of needed code from the executable. For example, the following section contains three symbols, but execution of the code at _plus_three ends at the blr statement at the bottom of the code block:

.text
 .globl _plus_three
 _plus_three:
 addi r3, r3, 1
 .globl _plus_two
 _plus_two:
 addi r3, r3, 1
 .globl _plus_one
 _plus_one:
 addi r3, r3, 1
 blr

If you use the .subsections_via_symbols directive on this code and _plus_two and _plus_one are not called by any other code, the static link editor would not add _plus_two and _plus_one to the executable. In that case, _plus_three would not return the correct value because part of its implementation would be missing. In addition, if _plus_one is dead-stripped, the program may crash when _plus_three is executed, as it would continue executing into the following block.
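
One way to avoid this problem is to make every symbol the start of a self-contained block, so that no block falls through into the next. The following PowerPC sketch restructures the example above under that assumption; it is an illustration rather than the only possible fix. Comments use a leading #, which is assumed to be accepted by the target assembler.

```asm
        .text
# Each routine does all of its own work and returns with its own blr,
# so any of the three blocks can be stripped without affecting the others.
        .globl  _plus_three
_plus_three:
        addi    r3, r3, 3
        blr
        .globl  _plus_two
_plus_two:
        addi    r3, r3, 2
        blr
        .globl  _plus_one
_plus_one:
        addi    r3, r3, 1
        blr

        .subsections_via_symbols
```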

.no_dead_strip

SYNOPSIS

.no_dead_strip symbol_name

The .no_dead_strip directive tells the assembler that the symbol specified by symbol_name must not be dead-stripped. For example, the following code prevents _my_version_string from being dead-stripped:

.no_dead_strip _my_version_string
.cstring
_my_version_string:
.ascii "Version 1.1"

Miscellaneous Directives

This section describes additional directives that don’t fit into any of the previous sections.

.abort

SYNOPSIS

.abort  [  "abort_string" ]

The .abort directive causes the assembler to ignore further input and quit processing. No files are created. The directive could be used, for example, in a pipe-interconnected version of a compiler—the first major syntax error would cause the compiler to issue this directive, saving unnecessary work in assembling code that would have to be discarded anyway.

The optional abort_string is printed as part of the error message when the .abort directive is encountered.

EXAMPLE

#ifndef VAR
    .abort "You must define VAR to assemble this file."
#endif

.abs

SYNOPSIS

.abs  symbol_name , expression

This directive sets the value of symbol_name to 1 if expression is an absolute expression; otherwise, it sets the value to zero.

EXAMPLE

.macro var
.abs is_abs,$0
.if is_abs==0
.abort "must be absolute"
.endif
.endmacro

.dump and .load

SYNOPSIS

.dump filename
.load filename

These directives let you dump and load the absolute symbols and macro definitions for faster loading and faster assembly.

These work like this:

.include "big_file_1"
.include "big_file_2"
.include "big_file_3"
...
.include "big_file_N"
.dump    "symbols.dump"

The .dump directive writes out all the N_ABS symbols and macros. You can later use the .load directive to load all the N_ABS symbols and macros faster than you could with .include:

.load "symbols.dump"

One useful side effect of loading symbols this way is that they aren’t written out to the object file.

.file and .line

SYNOPSIS

.file  file_name
.line  line_number

The .file directive causes the assembler to report error messages as if it were processing the file file_name.

The .line directive causes the assembler to report error messages as if it were processing the line line_number. The next line after the .line directive is assumed to be line_number.

The assembler turns C preprocessor comments of the form:

# line_number file_name level

into:

.line line_number; .file file_name

EXAMPLE

.line 6
nop       | this is line 6

.if, .elseif, .else, and .endif

SYNOPSIS

.if expression
.elseif expression
.else
.endif

These directives are used to delimit blocks of code that are to be assembled conditionally, depending on the value of an expression. A block of conditional code may be nested within another block of conditional code. expression must be an absolute expression.

For each .if directive:

  • there must be a matching .endif

  • there may be as many intervening .elseif’s as desired

  • there may be no more than one intervening .else before the trailing .endif

Labels or multiple statements must not be placed on the same line as any of these directives; otherwise, statements including these directives are not recognized and produce errors or incorrect conditional assembly.

EXAMPLE

.if a==1
.long 1
.elseif a==2
.long 2
.else
.long 3
.endif
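
Conditional blocks can also be nested, as noted above. The following sketch uses two hypothetical symbols, debug and verbose, and keeps each conditional directive on a line of its own. Comments use a leading #, which is assumed to be accepted by the target assembler.

```asm
        .set    debug,1
        .set    verbose,0
        .data
.if debug==1
.if verbose==1
        .long   2
.else
# This branch is assembled: debug is 1 and verbose is 0.
        .long   1
.endif
.else
        .long   0
.endif
```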

.include

SYNOPSIS

.include "filename"

The .include directive causes the named file to be included at the current point in the assembly. The -Idir option to the assembler specifies alternative paths to be used in searching for the file if it isn’t found in the current directory.

EXAMPLE

.include  "macros.h"

.machine

SYNOPSIS

.machine arch_type

The .machine directive specifies the target architecture of the assembly file. arch_type can be any architecture type you can specify in the -arch option of the assembler driver. See Assembler Options for more information.
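
EXAMPLE

The sketch below assumes a PowerPC assembler for which ppc970 is an accepted -arch value; the symbol _use_g5 is hypothetical. Comments use a leading #, which is assumed to be accepted by the target assembler.

```asm
# Assemble the rest of the file for the PowerPC 970.
        .machine ppc970
        .text
        .globl  _use_g5
_use_g5:
        blr
```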

.macro, .endmacro, .macros_on, and .macros_off

SYNOPSIS

.macro
.endmacro
.macros_on
.macros_off

These directives allow you to define simple macros (once a macro is defined, however, you can’t redefine it). For example:

.macro  var
instruction_1  $0,$1
instruction_2  $2
 . . .
instruction_N
.long $n
.endmacro

$d (where d is a single decimal digit, 0 through 9) represents each argument—there can be at most 10 arguments. $n is replaced by the actual number of arguments the macro is invoked with.

When you use a macro, arguments are separated by a comma (except inside matching parentheses—for example, xxx(1,3,4),yyy contains only two arguments). You could use the macro defined above as follows:

var  #0,@sp,4

This would be expanded to:

instruction_1 #0,@sp
instruction_2 4
 . . .
instruction_N
.long 3

The directives .macros_on and .macros_off allow macros to be written that override an instruction or directive while still using the instruction or directive. For example:

.macro .long
.macros_off
.long $0,$0
.macros_on
.endmacro

If you don’t specify an argument, the macro substitutes nothing (see .abs).
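
As a concrete sketch of argument substitution, the hypothetical macro pair below emits its first two arguments followed by the argument count. Comments use a leading #, which is assumed to be accepted by the target assembler.

```asm
# Emit both arguments as 32-bit values, followed by the number of
# arguments the macro was invoked with.
.macro  pair
        .long   $0
        .long   $1
        .long   $n
.endmacro

        .data
# Expands to: .long 10, then .long 20, then .long 2
        pair    10,20
```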

PowerPC-Specific Directives

The following directives are specific to the PowerPC architecture.

.flag_reg

SYNOPSIS

.flag_reg reg_number

This directive causes each use of general register reg_number to be flagged with a warning. It is intended for use in macros.

.greg

SYNOPSIS

.greg symbol_name, expression...

This directive sets symbol_name to 1 when expression is a general register or zero otherwise. It is intended for use in macros.

.no_ppc601

SYNOPSIS

.no_ppc601

This directive causes PowerPC 601 instructions to be flagged as errors. It has the same effect as specifying the -no_ppc601 option.

.noflag_reg

SYNOPSIS

.noflag_reg reg_number

This directive turns off the flagging of uses of general register reg_number, so that they no longer produce warnings. It is intended for use in macros.
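
EXAMPLE

The following sketch pairs .flag_reg with .noflag_reg. It assumes reg_number is written as a plain register number (here 2, for r2). Comments use a leading #, which is assumed to be accepted by the target assembler.

```asm
        .text
# Warn about any use of general register 2 from here on.
        .flag_reg 2
        addi    r2, r2, 1
# Stop flagging register 2; the next use assembles without a warning.
        .noflag_reg 2
        addi    r2, r2, 1
        blr
```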

Additional Processor-Specific Directives

The following processor-specific directives are synonyms for other standard directives described earlier in this chapter; although they are listed here for completeness, their use isn’t recommended. Wherever possible, you should use the standard directive instead.

The following are i386-specific directives:

i386 Directive    Standard Directive

.ffloat           .single
.dfloat           .double
.tfloat           [expression]: 80-bit IEEE extended precision floating-point
.word             .short
.value            .short
.ident            (ignored)
.def              (ignored)
.optim            (ignored)
.version          (ignored)
.ln               (ignored)