Extracting Metadata from Files

Spotlight makes files searchable by giving an application the means to save metadata about a file’s content. In OS X, this metadata can be searched on disk-based storage, both local and network.

For Spotlight to search a file, it must have access to the file’s metadata. Although some file-system metadata (modification dates, display name, path name) is available for all files, most of the interesting data is embedded inside the file. To gather this embedded information into a searchable format, you must provide a Spotlight importer.

What Is a Spotlight Importer?

A Spotlight importer is a small plug-in bundle that you create to extract information from files created by your application.

Spotlight importers parse your file format for relevant information and assign that information to the appropriate metadata keys. Using the standard metadata keys provided by Apple (see File Metadata Attributes Reference) lets Spotlight index the file’s content so that users can find it with the same searches that work for other file types. Xcode includes a Spotlight project template that provides the required CFPlugin support, as well as templates for the required schema file.

Spotlight importers typically reside within your application’s bundle in the subdirectory MyApp.app/Contents/Library/Spotlight. Importers not related to a specific application can also be installed in ~/Library/Spotlight, /Library/Spotlight, and Framework/PlugIn. Apple-provided importers reside in /System/Library/Spotlight.

Associating a Spotlight Importer with Files

Spotlight importers are associated with file types by specifying the uniform type identifiers (UTIs) from which they extract data. For more information on UTIs, see Uniform Type Identifiers Overview.

The supported UTIs are specified in the importer’s Info.plist file, contained within the plug-in bundle. An importer can support a single file type or multiple file types. The function in the importer that is called for each file is passed the UTI of the file and can adjust its extraction method as appropriate.
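
The per-file dispatch described above can be sketched in portable C. This is a minimal illustration under stated assumptions, not the actual plug-in entry point: a real importer receives Core Foundation types (a string for the UTI and a mutable dictionary for the attributes), whereas this sketch uses plain C strings and a small fixed-size list. The UTIs, the AttrList type, the helper functions, and the "first line is the title" file layout are all hypothetical. The key names are the string forms of the standard kMDItemTitle and kMDItemTextContent attributes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical UTIs, for illustration only. */
#define UTI_NOTE  "com.example.myapp.note"
#define UTI_TASKS "com.example.myapp.tasklist"

#define MAX_ATTRS 4
#define MAX_VALUE 128

/* Portable stand-in for the attribute dictionary an importer fills in. */
typedef struct {
    const char *key;
    char value[MAX_VALUE];
} Attr;

typedef struct {
    Attr items[MAX_ATTRS];
    size_t count;
} AttrList;

/* Append a key with the first `len` bytes of `value` (truncated to fit). */
bool attr_set(AttrList *attrs, const char *key, const char *value, size_t len) {
    if (attrs->count >= MAX_ATTRS) return false;
    if (len >= MAX_VALUE) len = MAX_VALUE - 1;
    Attr *a = &attrs->items[attrs->count++];
    a->key = key;
    memcpy(a->value, value, len);
    a->value[len] = '\0';
    return true;
}

/* Return the value stored for `key`, or NULL if the key is absent. */
const char *attr_get(const AttrList *attrs, const char *key) {
    for (size_t i = 0; i < attrs->count; i++) {
        if (strcmp(attrs->items[i].key, key) == 0) return attrs->items[i].value;
    }
    return NULL;
}

/* Called once per file with the file's UTI and its already-read contents.
   Returns false when the type is unsupported or the list is full. */
bool import_contents(AttrList *attrs, const char *uti, const char *contents) {
    assert(attrs != NULL && uti != NULL && contents != NULL);
    size_t first_line = strcspn(contents, "\n");

    if (strcmp(uti, UTI_NOTE) == 0) {
        /* Notes: the first line is the title; the whole text is searchable. */
        return attr_set(attrs, "kMDItemTitle", contents, first_line)
            && attr_set(attrs, "kMDItemTextContent", contents, strlen(contents));
    }
    if (strcmp(uti, UTI_TASKS) == 0) {
        /* Task lists: only the title is indexed in this sketch. */
        return attr_set(attrs, "kMDItemTitle", contents, first_line);
    }
    return false; /* not a type this importer declares */
}
```

Branching on the UTI inside one function mirrors the way a single importer can serve several file types while still choosing a different extraction strategy for each.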

Guidelines

All critical metadata should be extractable from the data file itself. The system store of metadata should be considered volatile.

Creating intermediary cache files that were then processed by the Spotlight importer was a common workaround for Core Data-based applications. It is no longer necessary, because Spotlight now supports Core Data documents as described in Core Data Spotlight Integration Programming Guide.

A Spotlight importer must run entirely without interaction. You should not attempt to present any user interface or expect that the window server is running.

You should not expect your application to be running when your metadata importer is called. Importers can be called at any time to extract metadata from a file. Your metadata importer should be able to extract the information without any assistance from the application that created the file.

Keep security in mind when considering the metadata to write to a file. For example, writing a user’s account name or password to a metadata field (even a search-only field) is a bad idea.
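
One way to apply the security guideline is to filter a document’s fields before any of them are written as metadata. The following portable C sketch assumes a hypothetical document format with named fields; the field names and both functions are invented for illustration and are not part of any Spotlight API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical field names that must never be exported as metadata. */
static const char *const SENSITIVE_FIELDS[] = {
    "account_name", "password", "auth_token"
};

/* Return true if `name` is on the deny list above. */
bool is_sensitive_field(const char *name) {
    size_t n = sizeof SENSITIVE_FIELDS / sizeof SENSITIVE_FIELDS[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(name, SENSITIVE_FIELDS[i]) == 0) return true;
    }
    return false;
}

/* Copy the names of the fields that are safe to export into `out`,
   which must have room for `count` entries. Returns how many were kept. */
size_t exportable_fields(const char *const *fields, size_t count,
                         const char **out) {
    assert(fields != NULL && out != NULL);
    size_t kept = 0;
    for (size_t i = 0; i < count; i++) {
        if (!is_sensitive_field(fields[i])) out[kept++] = fields[i];
    }
    return kept;
}
```

A deny list keeps the sketch short. An allow list that names only the fields known to be safe is the more conservative design, because a newly added field is then excluded by default.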