Byte-Swapping Strategies

The strategy for swapping bytes depends on the format of the data; there is no universal routine that can take care of all byte ordering differences. Any program that needs to swap data must know the data type, the source data endian order, and the host endian order.

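This section lists byte-swapping strategies, organized alphabetically, for the following data: constants, custom Apple event data, custom resource data, floating-point values, integers, network-related data, OSType-to-string conversions, Unicode text files, and values in an array.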

Constants

Constants that are part of a compiled executable are in host byte order. You need to swap bytes for a constant only if it is part of data that is not maintained natively or if the constant travels between hosts. In most cases you can either swap bytes ahead of time or let the preprocessor perform any needed math by using shifts or other simple operators.
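
As a minimal sketch of letting the preprocessor perform the math, the following macro reverses the bytes of a 32-bit constant using only shifts and masks. The names SWAP32_CONST and kSwappedMagic are hypothetical, chosen for illustration; they are not part of any system header. The sketch assumes a C11 compiler for static_assert.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Hypothetical helper, not a system macro: reverses the four bytes of a
   32-bit constant with shifts and masks only, so the result is itself a
   constant expression that the compiler evaluates at build time. */
#define SWAP32_CONST(x)                                   \
    ((uint32_t)((((uint32_t)(x) & 0x000000FFu) << 24) |   \
                (((uint32_t)(x) & 0x0000FF00u) <<  8) |   \
                (((uint32_t)(x) & 0x00FF0000u) >>  8) |   \
                (((uint32_t)(x) & 0xFF000000u) >> 24)))

/* Because the macro is a constant expression, it can be used in a
   static initializer and checked at compile time. */
static const uint32_t kSwappedMagic = SWAP32_CONST(0x11223344u);

static_assert(SWAP32_CONST(0x11223344u) == 0x44332211u,
              "SWAP32_CONST must reverse the byte order");
```

Because no function call is involved, the swapped value costs nothing at run time.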

If you are defining and populating a structure that must use data of a specific endian format in memory, use the OSSwapConst macros and the OSSwap*Const variants defined in the libkern/OSByteOrder.h header file. These macros can be used from high-level applications.

Custom Apple Event Data

An Apple event is a high-level event that conforms to the Apple Event Interprocess Messaging Protocol. The Apple Event Manager sends Apple events between applications on the same computer or between applications on remote computers. You can define your own Apple event data types, and send and receive Apple events using the Apple Event Manager API.

Mac OS X manages system-defined Apple event data types for you, handling them appropriately for the currently executing code. You don't need to perform any special tasks. When the data that your application extracts from an Apple event is system-defined, the system swaps the data for you before giving the event to your application to process. You will want to treat system-defined data types from Apple events as native endian. Similarly, if you put native-endian data into an Apple event that you are sending, and it is a system-defined data type, the receiver will be able to interpret the data in its own native endian format.

However, you must account for byte-ordering differences for the custom Apple event data types that you define. You can accomplish this in one of the following ways:

Custom Resource Data

In Mac OS X, the preferred way to supply resources is to provide files in your application bundle that define resources such as image files, sounds, localized text, and archived user-interface definitions. The resource data types discussed in this section are those defined in Resource Manager-style files supported by Carbon. The Resource Manager was created prior to Mac OS X. If your application uses Resource Manager-style resource files, you should consider moving towards Mac OS X–style resources in your application bundle instead.

Resources typically include data that describes menus, windows, controls, dialogs, sounds, fonts, and icons. Although the system defines a number of standard resource types (such as 'moov', used to specify a QuickTime movie, and 'MENU', used to define menus) you can also create your own private resource types for use in your application. You use the Resource Manager API to define resource data types and to get and set resource data.

Mac OS X keeps track of resources in memory and allows your application to read or write resources. Applications and system software interpret the data for a resource according to its resource type. Although you'll typically let the operating system read resources (such as your application icon) for you, you can also call Resource Manager functions directly to read and write resources.

Mac OS X manages the system-defined resources for you, handling them appropriately for the currently executing code. That is, if your application runs on an Intel-based Macintosh, Mac OS X swaps bytes so that your application icon, menus, and other standard resources appear correctly. You don't need to perform any special tasks. But if you define your own private resource data types for use in your application, you need to account for byte-ordering differences between architectures when you read or write resource data from disk.

You can use either of the following strategies to handle custom Resource Manager-style resource data. Notice that these are the same strategies used to handle custom Apple event data:

Note: If you are revising old code that marks resources with a preload bit, you should remove the preload bit from any resources that must be byte swapped. In Mac OS X, the preload bit is almost always unnecessary. If you cannot remove the preload bit, you should swap the resource data after you read the resource. You will not be able to use a flipper callback to swap bytes automatically because in Mac OS X a preload bit causes the resources to be read before any of the application code runs.

Floating-Point Values

Core Foundation defines a set of functions and two special data types to help you work with floating-point values. These functions allow you to encode 32- and 64-bit floating-point values in such a way that they can later be decoded and byte swapped if necessary. Listing 3-2 shows how to encode a 64-bit floating-point number, and Listing 3-3 shows how to decode a 32-bit floating-point number.

Listing 3-2  Encoding a 64-bit floating-point value

double d = 3.0;
CFSwappedFloat64 swappedDouble;
// Encode the floating-point value.
swappedDouble = CFConvertFloat64HostToSwapped(d);
// Call the appropriate routine to write swappedDouble to disk,
// send it to another process, etc.
write(myFile, &swappedDouble, sizeof(swappedDouble));

The data types CFSwappedFloat32 and CFSwappedFloat64 contain floating-point values in a canonical representation. A CFSwappedFloat data type is not itself a floating-point value and should not be used directly as one. You can, however, send one to another process, save it to disk, or send it over a network. Because the conversion functions translate to and from the canonical format, there is no need for explicit swapping. Bytes are swapped for you during the format conversion if necessary.

Listing 3-3  Decoding a 32-bit floating-point value

float f;
CFSwappedFloat32 swappedFloat;
// Call the appropriate routine to read swappedFloat from disk,
// receive it from another process, etc.
read(myFile, &swappedFloat, sizeof(swappedFloat));
// Decode the floating-point value, swapping bytes if necessary.
f = CFConvertFloat32SwappedToHost(swappedFloat);

The NSByteOrder.h header file defines functions that are comparable to the Core Foundation functions discussed here.
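
Where Core Foundation is not available, the idea of a canonical representation can be sketched in portable C. This is an illustration only, not a description of how the Core Foundation functions are implemented. The helpers encode_double_be and decode_double_be are hypothetical names, and the sketch assumes the host stores doubles in 64-bit IEEE 754 format with the same byte order as its integers.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(uint64_t),
              "this sketch assumes a 64-bit double");

/* Hypothetical helper: stores the bit pattern of d into out[0..7],
   most significant byte first, whatever the host byte order is. */
static void encode_double_be(double d, unsigned char out[8]) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);   /* copy the bits, do not convert */
    for (int i = 0; i < 8; i++) {
        out[i] = (unsigned char)(bits >> (56 - 8 * i));
    }
}

/* Hypothetical helper: the inverse of encode_double_be. */
static double decode_double_be(const unsigned char in[8]) {
    uint64_t bits = 0;
    double d;
    for (int i = 0; i < 8; i++) {
        bits = (bits << 8) | in[i];
    }
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

As with the CFSwappedFloat types, the encoded byte array is meant for storage or transmission and should not be used as a floating-point value until it is decoded.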

Integers

The system library byte-access functions, such as OSReadLittleInt16 and OSWriteLittleInt16, provide generic byte swapping. These functions swap bytes if the native endian format is different from the endian format of the destination. They are defined in the libkern/OSByteOrder.h header file.

Note:  The OSReadXXX and OSWriteXXX functions provide higher performance than the OSSwapXXX functions or any other functions in the higher-level frameworks.
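
To illustrate what a byte-access function of this kind does, here is a minimal portable sketch. The names read_le16 and write_le16 are hypothetical, and the code is not the implementation found in libkern/OSByteOrder.h. It assembles the value one byte at a time, so it gives the same result on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reads a 16-bit little-endian value located
   offset bytes past base. Working byte by byte makes the result
   independent of host byte order and of alignment. */
static uint16_t read_le16(const void *base, size_t offset) {
    assert(base != NULL);
    const unsigned char *p = (const unsigned char *)base + offset;
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Hypothetical helper: writes value in little-endian order
   offset bytes past base. */
static void write_le16(void *base, size_t offset, uint16_t value) {
    assert(base != NULL);
    unsigned char *p = (unsigned char *)base + offset;
    p[0] = (unsigned char)(value & 0xFF);   /* low byte first */
    p[1] = (unsigned char)(value >> 8);     /* then high byte */
}
```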

Core Foundation provides three optimized primitive functions for swapping bytes: CFSwapInt16, CFSwapInt32, and CFSwapInt64. All of the other swapping functions use these primitives to accomplish their work. In general you don’t need to use these primitives directly.

Although the primitive swapping functions swap unconditionally, the higher-level swapping functions are defined in such a way that they do nothing when swapping bytes is not required—in other words, when the source and host byte orders are the same. For the integer types, these functions take the forms CFSwapXXXBigToHost, CFSwapXXXLittleToHost, CFSwapXXXHostToBig, and CFSwapXXXHostToLittle, where XXX is a data type such as Int32. For example, on a little-endian machine you use the function CFSwapInt16BigToHost to read a 16-bit integer value from a network whose data is in network byte order (big-endian). Listing 3-4 demonstrates this process.

Listing 3-4  Swapping a 16-bit integer from big-endian to host-endian

SInt16  bigEndian16;
SInt16  swapped16;
// Swap a 16-bit value read from the network.
swapped16 = CFSwapInt16BigToHost(bigEndian16);

Suppose the integers are in the fields of a data structure. Listing 3-5 demonstrates how to swap bytes.

Listing 3-5  Swapping integers from little-endian to host-endian

// Swap the bytes of the values if necessary.
aStruct.int1 = CFSwapInt32LittleToHost(aStruct.int1);
aStruct.int2 = CFSwapInt32LittleToHost(aStruct.int2);

The code swaps bytes only if necessary. On a big-endian host, the functions used in the code sample swap the bytes in each field. On a little-endian host the functions do nothing, and the compiler can optimize the calls away.

Network-Related Data

Network-related data typically uses big-endian format (also known as network byte order), so you may need to swap bytes when communicating between the network and an Intel-based Macintosh computer. You probably never had to adjust your PowerPC code when you transmitted data to, or received data from, the network. On an Intel-based Macintosh computer you must look closely at your networking code and ensure that you always send network-related data in the appropriate byte order. You must also handle data received from the network appropriately, swapping the bytes of values to the endian format appropriate to the host microprocessor.

You can use the POSIX functions htonl and htons (host to network) and ntohl and ntohs (network to host) to convert between network byte order and host byte order. (Other byte-swapping functions, such as those defined in the OSByteOrder.h and CFByteOrder.h header files, can also be useful for handling network data.)

These functions are documented in Mac OS X Man Pages.

The sin_addr.s_addr and sin_port fields of a sockaddr_in structure should always be in network byte order. You can find out the appropriate endian format of any argument to a BSD networking function by reading the man page documentation.
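
As a minimal sketch of this rule, the following helper fills in an IPv4 socket address from values held in host byte order. The name fill_sockaddr_in and its parameters are hypothetical; htons and htonl are the standard POSIX conversion functions.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: builds an IPv4 socket address from host-order
   values, converting both fields to network byte order. */
static void fill_sockaddr_in(struct sockaddr_in *sa,
                             uint32_t host_order_addr,
                             uint16_t host_order_port) {
    assert(sa != NULL);
    memset(sa, 0, sizeof *sa);
    sa->sin_family = AF_INET;
    sa->sin_port = htons(host_order_port);         /* 16-bit: htons */
    sa->sin_addr.s_addr = htonl(host_order_addr);  /* 32-bit: htonl */
}
```

On a big-endian host the conversion functions return their argument unchanged, so the same source works on both architectures.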

When advertising a service on the network, you use getsockname to get the local TCP or UDP port that your socket is bound to, and then pass my_sockaddr.sin_port unchanged, without any byte swapping, to the DNSServiceRegister function.

In CoreFoundation code, you can use the same approach. Use the CFSocketCopyAddress function as shown below, and then pass my_sockaddr.sin_port unchanged, without any byte swapping, to the DNSServiceRegister function.

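CFDataRef addr = CFSocketCopyAddress(myCFSocketRef);
struct sockaddr_in my_sockaddr;
memmove(&my_sockaddr, CFDataGetBytePtr(addr), sizeof(my_sockaddr));
// The address data was returned by a Copy function, so release it.
CFRelease(addr);
// sin_port is already in network byte order; pass it unchanged.
DNSServiceRegister( ... , my_sockaddr.sin_port, ...);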

When browsing and resolving, the process is similar. The DNSServiceResolve function and the BSD Sockets calls such as gethostbyname and getaddrinfo all return IP addresses and ports already in the correct byte order so that you can assign them directly to your struct sockaddr_in and call connect to open a TCP connection. If you byte-swap the address or port, then your program will not work.

The important point is that when you use the DNSServiceDiscovery API with the BSD Sockets networking APIs, you do not need to swap anything. Your code will work correctly on both PowerPC and Intel-based Macintosh computers as well as on Linux, Solaris, and Windows.

OSType-to-String Conversions

You can use the functions UTCreateStringForOSType and UTGetOSTypeFromString to convert an OSType data type to or from a CFString object (CFStringRef data type). These functions are discussed in Uniform Type Identifiers Overview and defined in the UTType.h header file, which is part of the Launch Services framework.

When you use four-character literals, keep in mind that "abcd" != 'abcd'. Rather 'abcd' == 0x61626364. You must treat 'abcd' as an integer and not string data, as 'abcd' is a shortcut for a 32-bit integer. (A FourCharCode data type is a UInt32 data type.) The compiler does not swap this for you. You can use the shift operator if you need to deal with individual characters.

For example, if you currently print an OSType or FourCharCode type using the standard C printf-style semantics, use

printf("%c%c%c%c", (char) (val >> 24), (char) (val >> 16),
                    (char) (val >> 8), (char) val);

instead of the following:

printf("%4.4s", (const char*) &val);

Unicode Text Files

Mac OS X often uses UTF-16 to encode Unicode; a UniChar data type is a double-byte value. As with any multibyte data, Unicode characters are sensitive to the byte ordering method used by the microprocessor. A byte order mark written to the beginning of a file informs the program reading the data which byte ordering method was used to write the data. The Unicode standard states that in the absence of a byte order mark (BOM) the data in a Unicode data file is to be taken as big-endian. Although a BOM is not mandatory, you should make use of it to ensure that a file written on one architecture can be read from the other architecture. The program can then act accordingly to make sure the byte ordering of the Unicode text is compatible with the host.

Table 3-1 lists the standard byte order marks for UTF-8, UTF-16, and UTF-32. (Note that the UTF-8 BOM is not used for endian issues, but only as a tag to indicate that the file is UTF-8.)

Table 3-1  Byte order marks

Byte order mark    Encoding form
EF BB BF           UTF-8
FF FE              UTF-16/UCS-2, little endian
FE FF              UTF-16/UCS-2, big endian
FF FE 00 00        UTF-32/UCS-4, little endian
00 00 FE FF        UTF-32/UCS-4, big endian
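
For code that must inspect raw bytes itself, the table can be turned into a small lookup routine. This is a minimal sketch; detect_bom and the bom_kind enumeration are hypothetical names, not part of any system API.

```c
#include <assert.h>
#include <stddef.h>

enum bom_kind {
    BOM_NONE,
    BOM_UTF8,
    BOM_UTF16_LE,
    BOM_UTF16_BE,
    BOM_UTF32_LE,
    BOM_UTF32_BE
};

/* Hypothetical helper: reports which byte order mark, if any, begins
   the buffer p of length len. */
static enum bom_kind detect_bom(const unsigned char *p, size_t len) {
    assert(p != NULL || len == 0);
    /* Test the 4-byte marks first, because the UTF-32 little-endian
       mark (FF FE 00 00) begins with the UTF-16 one (FF FE). */
    if (len >= 4 && p[0] == 0xFF && p[1] == 0xFE && p[2] == 0x00 && p[3] == 0x00)
        return BOM_UTF32_LE;
    if (len >= 4 && p[0] == 0x00 && p[1] == 0x00 && p[2] == 0xFE && p[3] == 0xFF)
        return BOM_UTF32_BE;
    if (len >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
        return BOM_UTF8;
    if (len >= 2 && p[0] == 0xFF && p[1] == 0xFE)
        return BOM_UTF16_LE;
    if (len >= 2 && p[0] == 0xFE && p[1] == 0xFF)
        return BOM_UTF16_BE;
    return BOM_NONE;
}
```

Note that a little-endian UTF-16 file whose first character after the mark is U+0000 is indistinguishable from UTF-32 by this test alone; the sketch resolves that case in favor of UTF-32.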

In practice, when your application reads a file, it does not need to look for a byte order mark nor does it need to swap bytes as long as you follow these steps to read a file:

  1. Map the file using mmap to get a pointer to the contents of the file (or string).

    Reading the entire file into memory ensures the best performance and is a prerequisite for the next step.

  2. Generate a CFString object by calling the function CFStringCreateWithBytes with the isExternalRepresentation parameter set to true, or call the function CFStringCreateWithExternalRepresentation to generate a CFString, passing in an encoding of kCFStringEncodingUnicode (for UTF-16) or kCFStringEncodingUTF8 (for UTF-8).

    Either function interprets a BOM and swaps bytes as necessary. Note that a BOM should not be used in memory; its use is solely for data transmission (files, pasteboard, and so forth).

In summary, with respect to Unicode files, your application performs best when you follow these guidelines:

For more information, see “UTF & BOM,” available from the Unicode website:

http://www.unicode.org/faq/utf_bom.html

The Apple Event Manager provides text constants that you can use to specify the type of your data. As of Mac OS X v10.4, only two text constants are recommended:

The constant typeUnicodeText indicates utxt text data, in native byte ordering format, with an optional BOM. This constant does not specify an explicit Unicode encoding or byte order definition.

The Scrap Manager provides the flavor type constant kScrapFlavorTypeUTF16External which specifies Unicode text in 16-bit external representation with optional byte order mark (BOM).

Values in an Array

The routine in Listing 3-6 shows an approach that you can use to swap the bytes of values in an array. On a big-endian system, the compiler optimizes away the entire function; you don’t need to use #ifdef statements to swap these sorts of arrays.

Listing 3-6  A routine for swapping the bytes of the values in an array

static inline void SwapUInt32ArrayBigToHost(UInt32 *array, UInt32 count) {
    UInt32 i;  // Unsigned, to match the type of count.

    for (i = 0; i < count; i++) {
        array[i] = CFSwapInt32BigToHost(array[i]);
    }
}




Last updated: 2007-02-26





Copyright © 2007 Apple Inc.