ADC Home > Reference Library > Technical Notes > Legacy Documents > Apple Applications >
Legacy Document
Important: This document is part of the Legacy section of the ADC Reference Library. This information should not be used for new development.
Current information on this Reference Library topic can be found here:
|
OverviewText Extractors improve the accuracy of indexing and summarizing files. As an example, consider the HTML file shown in Listing 1.
Listing 1. A sample HTML file. Without knowing the HTML format, every word above would get indexed, so searching for "body" in Find by Content would find the above document, but when a user opened the file in her web browser, she would not see "body". Similarly, summarizing a HTML document would show HTML Tags in the summary. The HTML Text Extractor knows the format of an HTML file so it will skip the HTML Tags and return just the text that a user would see viewing the document. For the above example, "This is a sample document." would be the only text that is indexed. Text Extractor Plug-ins DefinedText Extractor Plug-ins are Code Fragments that have the following characteristics:
A Text Extractor Plug-in's resource file may contain one or more
Registering the MIME Types a Plug-in can UnderstandClients of Text Extractors need to map documents to a MIME
type. To
help clients determine the document types a plug-in understands,
a plug-in can include one or more
Listing 2. A sample When creating indexes, Find By Content uses calls to Internet
Config
to discover the file's MIME type. Once a file's MIME type has been
discovered, it then uses the a Text Extractor Plug-in capable of
extracting
text from the file (based on the MIME types the extractor
advertises it
can decode in its MIME types reported by Text Extractor Plug-ins must be of the format type "/" subtype otherwise the extractor will be ignored. Also, an extractor's initialization function should verify (and correct, if necessary) that any entries in Internet Config's file mapping database referencing the extractor's type and creator specify the same mime type as the Text Extractor Plug-in. Structures Used By Plug-insFind By Content provides a number of routines and callbacks that can be used by Text Extractor Plug-ins. These callbacks provide access to memory allocation and file input. The following sections describe the structures used by Find By Content to provide these callbacks and the callbacks themselves. Application developers wanting to call Text Extractor Plug-ins from their own code will want to create and initialize these structures themselves. Examples of how to do this can be found later in the Calling a Text Extractor Plug-in from an Application section below. The IAPluginInitBlock StructureThe
Listing 3. Declaration of the
result - a pointer to a block of storage or
Applications developers wanting to call Text Extractor Plug-ins from
inside of their own applications will have to initialize this structure
and define the necessary callbacks themselves. An example showing how to
set up a The
|
/* IADocAccessorRecord structure definition. */ typedef struct IADocAccessorRecord* IADocAccessorPtr; struct IADocAccessorRecord { /* docAccessor is an opaque type used by Find By Content to track the file. It is not possible for plug-ins to access this information. */ IADocAccessorRef docAccessor; IADocAccessorOpenUPP OpenDoc; IADocAccessorCloseUPP CloseDoc; IADocAccessorReadUPP ReadDoc; IASetDocAccessorReadPositionUPP SetReadPosition; IAGetDocAccessorReadPositionUPP GetReadPosition; IAGetDocAccessorEOFUPP GetEOF; }; typedef struct IADocAccessorRecord IADocAccessorRecord; /* Routine Prototypes. */ OSStatus CallIADocumentAccessorOpen(IADocAccessorRef inAccessor); OSStatus CallIADocumentAccessorClose(IADocAccessorRef inAccessor); OSStatus CallIADocumentAccessorRead(IADocAccessorRef inAccessor, void* buffer, UInt32* ioSize); OSStatus CallIASetDocumentAccessorReadPosition(IADocAccessorRef inAccessor, SInt32 inMode, SInt32 inOffset); OSStatus CallIAGetDocumentAccessorReadPosition(IADocAccessorRef inAccessor, SInt32* outPostion); OSStatus CallIAGetDocumentAccessorEOF(IADocAccessorRef inAccessor, SInt32* outEOF); /* macros corresponding to the routine prototypes above */ #define CallIADocumentAccessorOpen(accessor) \ InvokeIADocAccessorOpenUPP((accessor)->docAccessor, \ (accessor)->OpenDoc) #define CallIADocumentAccessorClose(accessor) \ InvokeIADocAccessorCloseUPP((accessor)->docAccessor,\ (accessor)->CloseDoc) #define CallIADocumentAccessorRead(accessor, buffer, size) \ InvokeIADocAccessorReadUPP((accessor)->docAccessor, (buffer),\ (size), (accessor)->ReadDoc) #define CallIASetDocumentAccessorReadPosition(accessor, mode, offset) \ InvokeIASetDocAccessorReadPositionUPP((accessor)->docAccessor,\ (mode), (offset), (accessor)->SetReadPosition) #define CallIAGetDocumentAccessorReadPosition(accessor,\ outPosition) \ InvokeIAGetDocAccessorReadPositionUPP((accessor)->docAccessor,\ (outPosition), (accessor)->GetReadPosition) #define CallIAGetDocumentAccessorEOF(accessor, outEOF) \ InvokeIAGetDocAccessorEOFUPP((accessor)->docAccessor, \ (outEOF), accessor)->GetEOF) |
Listing 4. Declaration of the
IADocAccessorRecord
structure and prototypes that can be used
for calling the routines referenced in the structure.
The IADocAccessorRecord
defined in Listing 4 provides
plug-ins with all the necessary resources for accessing files. Plug-ins
should not
make calls to the File Manager directly. Instead, they should perform
all file input operations necessary for accessing a file through
these callbacks. Fields and callbacks defined in this structure
are discussed below.
CallIADocumentAccessorOpen
OSStatus CallIADocumentAccessorOpen( IADocAccessorRef inAccessor); |
inAccessor
- a pointer to the
IADocAccessorRecord
passed to the IAOpenDocument
routine.
result - errIANoErr
if the operation
was successful,
some other error code if the operation failed.
CallIADocumentAccessorOpen
is a callback procedure provided
in the IADocAccessorRecord
structure that can be called by plug-ins to open a file for input.
CallIADocumentAccessorOpen
opens the document
for reading. Plug-ins should call this routine to open
the document for reading before making any of the input
calls described below.
CallIADocumentAccessorClose
OSStatus CallIADocumentAccessorClose( IADocAccessorRef inAccessor); |
inAccessor
- a pointer to the
IADocAccessorRecord
passed to the IAOpenDocument
routine.
inAccessor
must be
in the open state when this routine is called.
result - errIANoErr
if the operation
was successful,
some other error code if the operation failed.
CallIADocumentAccessorClose
is a callback procedure provided
in the IADocAccessorRecord
structure that can be called by plug-ins to close a file that
was opened by a call to CallIADocumentAccessorOpen
.
CallIADocumentAccessorClose
should be called to
close a file opened by a call to CallIADocumentAccessorOpen
.
CallIADocumentAccessorRead
OSStatus CallIADocumentAccessorRead( IADocAccessorRef inAccessor, void* buffer, UInt32* ioSize); |
inAccessor
- a pointer to the IADocAccessorRecord
passed to the IAOpenDocument
routine. inAccessor
must be in the open state when this routine is called.
buffer
- a pointer to a buffer where the data should be
stored.
ioSize
- a pointer to a 32-bit integer containing the number of
bytes to be read. When the routine returns, this value will have been updated
to the actual number of bytes read.
result - errIANoErr
if the operation was successful,
some other error code if the operation failed.
CallIADocumentAccessorRead
is a callback procedure provided
in the IADocAccessorRecord
structure that can be called by plug-ins to read data from a file.
CallIADocumentAccessorRead
reads *ioSize
bytes
from the file starting at the current read file position. On return,
*ioSize
will reflect the actual number of bytes read and
the routine's result will indicate the success of the call. If
this callback returns an eofErr
error, be sure to check
the value stored in *ioSize
as it is possible that some
bytes may have been read into the buffer before the end of the
file was encountered. Calls to CallIADocumentAccessorRead
advance the read position for the file past the bytes that have
been read - the next call to CallIADocumentAccessorRead
begins where the last one left off.
CallIASetDocumentAccessorReadPosition
OSStatus CallIASetDocumentAccessorReadPosition( IADocAccessorRef inAccessor, SInt32 inMode, SInt32 inOffset); |
inAccessor
- a pointer to the IADocAccessorRecord
passed to the IAOpenDocument
routine. inAccessor
must be in the open state when this routine is called.
inMode
- contains one of the following positioning
constants:
kIAFromStartMode
-inOffset
contains a value to be interpreted as an offset from the start of the file.kIAFromCurrMode
-inOffset
contains a value to be interpreted as an offset the current read position.kIAFromEndMode
-inOffset
contains a value to be interpreted as an offset from the end of the file.
inOffset
- contains a 32-bit signed integer used to offset
the current read position.
result - errIANoErr
if the operation was successful,
some other error code if the operation failed.
CallIASetDocumentAccessorReadPosition
is a callback
procedure provided in the IADocAccessorRecord
structure that can be called by plug-ins to set the position
where the next read will take place when
CallIADocumentAccessorRead
is called.
CallIASetDocumentAccessorReadPosition
can be used
to set the position where the next call to
CallIADocumentAccessorRead
will begin reading bytes from the
file. When a file is first opened, its read position is set to the
beginning of the file.
CallIAGetDocumentAccessorReadPosition
OSStatus CallIAGetDocumentAccessorReadPosition( IADocAccessorRef inAccessor, SInt32* outPostion); |
inAccessor
- a pointer to the IADocAccessorRecord
passed to the IAOpenDocument
routine. inAccessor
must be in the open state when this routine is called.
outPostion
- a pointer to a 32-bit value that is set to
the current read position's offset from the beginning of the file.
result - errIANoErr
if the operation was successful,
some other error code if the operation failed.
CallIAGetDocumentAccessorReadPosition
is a callback
procedure provided in the IADocAccessorRecord
structure that can be called by plug-ins to determine the position where
the next read will take place when CallIADocumentAccessorRead
is called.
CallIAGetDocumentAccessorReadPosition
returns the
location where the next read operation will take place in
*outPostion
. The value returned is an offset from
the beginning of the file.
CallIAGetDocumentAccessorEOF
OSStatus CallIAGetDocumentAccessorEOF( IADocAccessorRef inAccessor, SInt32* outEOF); |
inAccessor
- a pointer to the IADocAccessorRecord
passed to the IAOpenDocument
routine. inAccessor
must be in the open state when this routine is called.
outEOF
- a pointer to a 32-bit value that is set to
the number of bytes in the file.
result - errIANoErr
if the operation was successful,
some other error code if the operation failed.
CallIAGetDocumentAccessorReadPosition
is a callback
procedure provided in the IADocAccessorRecord
structure that can be called by plug-ins to determine length of the input
file.
CallIAGetDocumentAccessorEOF
can be used to discover
the length of a file. On return, *outEOF
is set to the total
number of bytes in the file.
Applications developers wanting to call Text Extractor Plug-ins from
inside of their own applications will have to initialize this structure
and define the necessary callbacks themselves. An example showing how
to set up a IADocAccessorRecord
structure can be found in
the Setting up the
IADocAccessorRecord
structure section later in this
document.
Routines a Text Extractor Must Define
This section describes the routines that must be exported by all Text Extractor Plug-ins. This section provides a detailed description of each routine along with some discussion any important issues related to each routine.
IAPluginInit
OSStatus IAPluginInit( IAPluginInitBlockPtr initBlock, IAPluginRef *outPluginRef); |
initBlock
- contains a pointer to a
IAPluginInitBlock
structure.
outPluginRef
- is a pointer to a 32-bit value that will
be passed to other plug-in routines while the plug-in is open. A
plug-in may set this value in its IAPluginInit
routine
and it will remain unchanged until IAPluginTerm
is called.
result - errIANoErr
if the operation was successful,
some other error code if the operation failed.
IAPluginInit
is a routine that must be provided
in the plug-in's code fragment.
After the plug-in's code fragment has been prepared for execution,
the plug-in's IAPluginInit
routine is called. This routine
provides an opportunity for a plug-in to perform any necessary initialization
operations it may require.
The callbacks in the
IAPluginInitBlock
pointed to by the
initBlock
parameter remain valid while the plug-in is open (until
IAPluginTerm
is called)
and may be called from any of the plug-in's other routines. The value
stored in *outPluginRef
is dedicated for the plug-in's use and
may be used to store persistent state information that is to remain intact
between calls to the plug-in (this value is not saved after the plug-in
has been closed).
For an example illustrating how this routine could be implemented refer to Listing 6.
IAPluginTerm
OSStatus IAPluginTerm(IAPluginRef inPluginRef); |
inPluginRef
- a 32-bit value dedicated for the plug-in's
use. This value will be the same as the value the *outPluginRef
parameter was set to in the IAPluginInit
call.
result - errIANoErr
if the operation was successful,
some other error code if the operation failed.
IAPluginTerm
is a routine that must be provided
in the plug-in's code fragment.
Before a plug-in's Code Fragment Manager connection is closed, the
plug-in's IAPluginTerm
routine is called. This routine
provides opportunity for the plug-in to perform any necessary
cleanup operations required such as deallocating storage, closing
resource files, et cetera. After this routine has been called, there
will be no other calls made to the plug-in until the next time it
is opened by a call to IAPluginInit
.
For an example illustrating how this routine could be implemented refer to Listing 7.
IAGetExtractorVersion
OSStatus IAGetExtractorVersion( IAPluginRef inPluginRef, UInt32 outPluginVersion); |
inPluginRef
- a 32-bit value dedicated for the plug-in's
use. This value will be the same as the value the *outPluginRef
parameter was set to in the IAPluginInit
call.
outPluginVersion
- a pointer to a 32-bit value. Your
routine should set this value to kIAExtractorCurrentVersion
.
result - errIANoErr
if the operation was successful,
some other error code if the operation failed.
IAGetExtractorVersion
is a routine that must be provided
in the plug-in's code fragment.
In this routine, a plug-in should set the value
*outPluginVersion
to the version of the Text Extractor Plug-in interface it was compiled against.
The constant kIAExtractorCurrentVersion
, defined in
"IAExtractor.h",
contains the current version of the Text Extractor Plug-in interface.
For an example illustrating how this routine could be implemented refer to Listing 8.
IACountSupportedDocTypes
OSStatus IACountSupportedDocTypes( IAPluginRef inPluginRef, UInt32* outCount); |
inPluginRef
- a 32-bit value dedicated for the plug-in's
use. This value will be the same as the value the
*outPluginRef
parameter was set to in the IAPluginInit
call.
outCount
- a pointer to a 32-bit integer. The plug-in should
set this integer to the number of document types that it knows how to
handle.
result - errIANoErr
if the operation was successful,
some other error code if the operation failed.
IACountSupportedDocTypes
is a routine that must be provided
in the plug-in's code fragment.
This routine should set *outCount
to the number of document
types the plug-in is able to handle. The value stored in
*outCount
is interpreted as the maximum valid index that can
be provided as an index in
IAGetIndSupportedDocType
calls.
For an example illustrating how this routine could be implemented refer to Listing 9.
IAGetIndSupportedDocType
OSStatus IAGetIndSupportedDocType( IAPluginRef inPluginRef, UInt32 inIndex, char** outMIMEType); |
inPluginRef
- a 32-bit value dedicated for the plug-in's
use. This value will be the same as the value the *outPluginRef
parameter was set to in the IAPluginInit
call.
inIndex
- a 32-bit integer value indicating the index
of the document type to return. Index values range between 1 and
the maximum index value returned by IACountSupportedDocTypes
.
*outMIMEType
- a pointer value of type char*
.
A plug-in should set this value to point to a string
containing the MIME type string. The storage for this string
belongs to the plug-in - if it was allocated by the plug-in,
then the plug-in must deallocate it.
result - errIANoErr
if the operation was successful,
some other error code if the operation failed.
IAGetIndSupportedDocType
is a routine that must be provided
in the plug-in's code fragment.
The routine IAGetIndSupportedDocType
sets
*outMIMEType
to point to a string containing
the nth MIME type the plug-in is able to understand. Index
values that may be provided in the inIndex
parameter
range from 1 (not zero) through the maximum value as reported
by the IACountSupportedDocTypes
call.
For an example illustrating how this routine could be implemented refer to Listing 10.
IAOpenDocument
OSStatus IAOpenDocument( IAPluginRef inPluginRef, IADocAccessorPtr inAccessor, IADocRef* outDoc); |
inPluginRef
- a 32-bit value dedicated for the plug-in's
use. This value will be the same as the value the *outPluginRef
parameter was set to in the IAPluginInit
call.
inAccessor
- a pointer to a IAPluginInitBlock
containing callbacks necessary for reading information from a file.
outDoc
- a pointer to a 32-bit value available
for the plug-in to use for storing information specific to the document.
Normally plug-ins will store a pointer to necessary state
variables specific to the document in this parameter.
result - errIANoErr
if the operation was successful,
some other error code if the operation failed.
IAOpenDocument
is a routine that must be provided
in the plug-in's code fragment.
IAOpenDocument
is called before a plug-in is used to extract
text from a new document. This routine provides opportunity for the
plug-in to perform any initialization operations required before
it begins reading text from a file. Any state variables or data
buffers required for processing the file should be stored in a
block of memory and a pointer to that block should be stored in
*outDoc
. This value will be passed to the routines
IAGetNextTextRun
,
and IAGetTextRunInfo
while
the document is open, and then to IACloseDocument
once all the required text has been extracted from the document.
Both the IAPluginInitBlock
pointed to by the inAccessor
parameter and the value stored in
*outDoc
will remain valid until
IACloseDocument
is called.
For an example illustrating how this routine could be implemented refer to Listing 11.
IACloseDocument
OSStatus IACloseDocument( IADocRef inDoc); |
inDoc
- The document reference value created
by the plug-in the IAOpenDocument
call containing state variables or data buffers required for processing the
file.
result - errIANoErr
if the operation was successful,
some other error code if the operation failed.
IACloseDocument
is a routine that must be provided
in the plug-in's code fragment.
IACloseDocument
is called after all textual information
required
has been extracted from the document. In this call, the plug-in should
dispose of any state variables or buffers that were created specifically
for the file referenced by the inDoc
parameter.
For an example illustrating how this routine could be implemented refer to Listing 12.
IAGetNextTextRun
OSStatus IAGetNextTextRun( IADocRef inDoc, void* buffer, UInt32* ioSize); |
inDoc
- The document reference value created
by the plug-in the IAOpenDocument
call containing state variables or data buffers required for processing the
file.
buffer
- a pointer to a block of memory.
ioSize
- a pointer to a 32-bit integer value. when the
routine is called, this value will equal the number of bytes available
in the memory area pointed to by buffer
parameter. After
copying some text to this memory buffer, the plug-in should set this
value to the actual number of bytes copied.
result - errIANoErr
if the operation was successful,
some other error code if the operation failed.
IAGetNextTextRun
is a routine that must be provided
in the plug-in's code fragment.
The IAGetNextTextRun
routine should copy text from the document
into the memory buffer pointed to by the buffer
parameter
until that buffer is full, or the plug-in runs out of text. If the
language encoding changes from one language to another while text
is being decoded, the plug-in mark that location in the text stream
by returning the result code errIAEndOfTextRun
.
When the plug-in reaches the end of the text in the file, it should
return a result code of noErr
and it should set
*ioSize
to zero indicating there is no more text to be read from the file.
For an example illustrating how this routine could be implemented refer to Listing 13.
IAGetTextRunInfo
OSStatus IAGetTextRunInfo( IADocRef inDoc, char** outEncoding, char** outLanguage); |
inDoc
- The document reference value created
by the plug-in the IAOpenDocument
call containing state variables or data buffers required for processing the
file.
outEncoding
- a pointer to a variable to type
char*. This is an optional parameter, and may be set to NULL
if the caller is not interested in this value. The plug-in should
store a pointer to a string in the variable pointed to by this
parameter that contains the Internet name for the current
character encoding for text being extracted from the file.
outLanguage
- a pointer to a variable to type
char*. This is an optional parameter, and may be set to NULL
if the caller is not interested in this value. The plug-in should
store a pointer to a string in the variable pointed to by this
parameter that contains the language name for text being extracted
from the file. The language corresponds to the internet standard
defined in ISO-639.
result - errIANoErr
if the operation was successful,
some other error code if the operation failed.
IAGetTextRunInfo
is a routine that must be provided
in the plug-in's code fragment.
IAGetTextRunInfo
returns information about the character
encoding and the language of the text for the last buffer returned by
IAGetNextTextRun
.
Both parameters are optional and may or may not be present
depending on the caller's requirements. If a parameter is not
required, then it will be set to NULL
.
If the plug-in allocates a pointer to a string and stores
that pointer either in *outEncoding
or in
*outLanguage
, then it is the plug-in's responsibility to
deallocate that storage.
If either value is not known, the plug-in may store the value
NULL
in either *outEncoding
or in
*outLanguage
.
This value instructs the caller that the current character encoding or
language is not known by the plug-in.
A pointer to a string containing the Internet name for the
character encoding is returned in the *outEncoding
parameter.
Encoding is the internet name for an encoding (i.e., "iso-8859-1",
"x-mac-roman", "euc-jp", ...).
For an example illustrating how this routine could be implemented refer to Listing 14.
An Example Plug-in
The following annotated example illustrates how to create a Text Extractor Plug-in for the "text/plain" MIME type. As the function of this plug-in is to pass text from the file to the caller, its implementation is very simple. Developers can easily modify this example to extract text from their own file formats.
/* File: PlainTextExtractor.c Text Extractor plug-in example/shell. */ /* The file IAExtractor.h contains defines and structures necessary for creating a Text Extractor Plug-in. */ #include "IAExtractor.h" /* This constant is used in the example as a data value stored in the reference value maintained by the caller for the plug-in. It's not necessary to create a plug-in, but it's useful for illustration. */ enum { kPlainTextExtractorRefType = 'text' }; /* This macro is used or verifying the reference value remains unchanged in the example. */ #define VerifyType(x) ((UInt32)(x)==(UInt32)kPlainTextExtractorRefType) |
Listing 5. File header & imports for Text Extractor Plug-ins.
The only important aspect of the above is the header file being included. Here, the file "IAExtractor.h" containing the necessary constant and structure definitions is included.
/* IAPluginInit example implementation.*/ OSStatus IAPluginInit( IAPluginInitBlockPtr initBlock, IAPluginRef* outPluginRef) { /* validate parameters. */ if (outPluginRef == NULL) return errIAParamErr; /* initialize the reference value. Plug-ins that require memory allocation should cache initBlock info here. */ *outPluginRef = (IAPluginRef)kPlainTextExtractorRefType; /* Return with no error. */ return errIANoErr; } |
Listing 6. IAPluginInit
example.
The IAPluginInit
is
the first call
made to the plug-in. During this call, the plug-in should set
up any variables or tables required. Also, if the plug-in will
require
any of the callbacks found in the IAPluginInitBlock
pointed to by the initBlock
parameter later during
its execution,
then it should save a copy of this pointer.
/* IAPluginTerm example implementation.*/ OSStatus IAPluginTerm(IAPluginRef inPluginRef) { /* validate parameters */ if (!VerifyType(inPluginRef)) return errIAParamErr; /* do other tear-down operations here... */ .... /* Return with no error. */ return errIANoErr; } |
Listing 7. IAPluginTerm
example.
Normally, the IAPluginTerm
routine will
be used to deallocate any storage allocated by the plug-in, close any
resource files, and other cleanup tasks that need to be
performed.
/* IAGetExtractorVersion example implementation.*/ OSStatus IAGetExtractorVersion( IAPluginRef inPluginRef, UInt32* outPluginVersion) { /* validate parameters */ if (!VerifyType(inPluginRef) || !outPluginVersion ) return errIAParamErr; /* set return value to the interface version this code was compiled with. */ *outPluginVersion = kIAExtractorCurrentVersion; /* Return with no error. */ return errIANoErr; } |
Listing 8. IAGetExtractorVersion
example.
The value kIAExtractorCurrentVersion
will always
contain the current
version for the declarations included in the file
"IAExtractor.h"
. For
the current implementation this value is set to
kIAExtractorVersion1
.
/* IACountSupportedDocTypes example implementation.*/ OSStatus IACountSupportedDocTypes( IAPluginRef inPluginRef, UInt32* outCount) { /* validate parameters*/ if (!VerifyType(inPluginRef) || ! outCount) return errIAParamErr; /* count is max value to be passed to IAGetIndSupportedDocType as index */ *outCount = 1; /* Return with no error. */ return errIANoErr; } |
Listing 9. IACountSupportedDocTypes
example.
In this example, we only support one document type - plain text documents.
/* IAGetIndSupportedDocType example implementation.*/ OSStatus IAGetIndSupportedDocType( IAPluginRef inPluginRef, UInt32 inIndex, char **outMIMEType) { /* set up local variables */ static char* supportedDocType = "text/plain"; /* validate parameters */ if (!VerifyType(inPluginRef) || !outMIMEType || inIndex != 1) return errIAParamErr; /* set return value */ *outMIMEType = supportedDocType; /* return successfully */ return errIANoErr; } |
Listing 10. IAGetIndSupportedDocType
example.
In the above declaration of IAGetIndSupportedDocType
the MIME type string is stored as a static variable among the
plug-in's globals.
/* IAOpenDocument example implementation.*/ OSStatus IAOpenDocument( IAPluginRef inPluginRef, IADocAccessorPtr inDocAccessor, IADocRef* outDoc) { /* local variables */ OSStatus err; /* verify parameters */ if (!VerifyType(inPluginRef) || !inDocAccessor || !outDoc) return errIAParamErr; /* call our opening routine */ err = CallIADocumentAccessorOpen(inDocAccessor); if (err != errIANoErr) return err; /* IADocRef is defined by plug-in, in our case we are just reading directly from the accessor so we are defining the opaque type IADocRef to be an IADocAccessorPtr. */ *outDoc = (IADocRef)inDocAccessor; /* return successfully */ return errIANoErr; } |
Listing 11. IAOpenDocument
example.
In the IAOpenDocument
call shown above,
the
plug-in calls back through the IAPluginInitBlock
record pointed to by the inDocAccessor
parameter and
before
caching a copy of inDocAccessor
in the document
reference
parameter (*outDoc
). This value is used to refer to
the document
in the next few listings.
/* IACloseDocument example implementation.*/ OSStatus IACloseDocument(IADocRef inDoc) { /* local variables */ IADocAccessorPtr docAccessor; OSStatus err; /* verify parameters */ if (inDoc == NULL) return errIAParamErr; /* Cast IADocRef to what we defined it to be in IAOpenDocument in this case a IADocAccessorPtr */ docAccessor = (IADocAccessorPtr)inDoc; /* use the callback to close the file */ err = CallIADocumentAccessorClose(docAc cessor); /* return status of last close */ return err; } |
Listing 12. IACloseDocument
example.
In the IACloseDocument
call shown above,
the
plug-in calls back through the IAPluginInitBlock
structure to close the file. The pointer to the IAPluginInitBlock
structure is coerced from the inDoc
parameter where
a copy was saved during the
IAOpenDocument
call
shown in
Listing 11.
/* IAGetNextTextRun example implementation. */ OSStatus IAGetNextTextRun( IADocRef inDoc, void* buffer, UInt32* size) { /* local variables */ IADocAccessorPtr docAccessor; OSStatus err; /* verify parameters */ if (!inDoc) return errIAParamErr; /* Cast IADocRef to what we defined it to be in IAOpenDocument (in this case a IADocAccessorPtr). */ docAccessor = (IADocAccessorPtr)inDoc; /* callback to read from the file. */ err = CallIADocumentAccessorRead(docAcce ssor, buffer, size); /* return result o read operation */ return err; } |
Listing 13. IAGetNextTextRun
example.
In the IACloseDocument
call shown above,
the
plug-in calls back through the IAPluginInitBlock
structure to read data bytes from the file. The pointer to the
IAPluginInitBlock
structure is coerced from the inDoc
parameter where
a copy was saved during the
IAOpenDocument
call
shown in
Listing 11.
/* IAGetTextRunInfo example implementation. */ OSStatus IAGetTextRunInfo( IADocRef inDoc, char** outEncoding, char** outLanguage) { /* we don't know the encoding or language of the file so set to NULL. */ if (outEncoding != NULL) *outEncoding = NULL; if (outLanguage != NULL) *outLanguage = NULL; /* local variables */ return errIANoErr; } |
Listing 14. IAGetTextRunInfo example.
In this example, we return NULL
, indicating that both
the text encoding and the language are unknown.
Calling a Text Extractor Plug-in from an Application
Following is an example of how a client may use a Text Extractor Plug-in to extract the text of a document. Applications may use these routines or some variant of them to call Text Extractor Plug-ins to extract text from virtually any document type.
The steps below show how to set up the plug-in's code fragment, set up the callback structures, and finally how to call the plug-in to perform the text extraction. This example does not show how to find or determine the correct plug-in for a particular document.
Setting up a Text Extractor Plug-in
First, we begin by setting up the plug-in's code fragment for execution and storing pointers to the routines we want to call in a structure we will use to access the plug-in. Listing 15 contains the routines and declarations used to perform this task.
/* The following typedefs correspond to the routines exported by Text Extractor Plug-ins. In this example, we use these for calling the plug-in from our code. */ typedef OSStatus (*PluginInitCallPtr)( IAPluginInitBlockPtr initBlock, IAPluginRef* outPluginRef); typedef OSStatus (*PluginTermCallPtr)( IAPluginRef inPluginRef); typedef OSStatus (*GetExtractorVersionCallPtr)( IAPluginRef inPluginRef, UInt32* outPluginVersion); typedef OSStatus (*CountSupportedDocTypesCallPtr)( IAPluginRef inPluginRef, UInt32* outCount); typedef OSStatus (*GetIndSupportedDocTypeCallPtr)( IAPluginRef inPluginRef, UInt32 inIndex, char** outMIMEType); typedef OSStatus (*OpenDocumentCallPtr)( IAPluginRef inPluginRef, IADocAccessorPtr inDoc, IADocRef* outDoc); typedef OSStatus (*CloseDocumentCallPtr)(IADocRef inDoc); typedef OSStatus (*GetTextRunInfoCallPtr)(IADocRef inDoc, char** outEncoding, char** outLanguage); typedef OSStatus (*GetNextTextRunCallPtr)( IADocRef inDoc, void* buffer, UInt32* size); /* ExtractorRec is used for storing information about the plug-in's code fragment itself. it contains pointers to the fragment's routines, and the fragment's CFM connection id number. */ typedef struct { CFragConnectionID connID; PluginInitCallPtr PluginInit; PluginTermCallPtr PluginTerm; GetExtractorVersionCallPtr GetExtractorVersion; CountSupportedDocTypesCallPtr CountSupportedDocTypes; GetIndSupportedDocTypeCallPtr GetIndSupportedDocType; OpenDocumentCallPtr OpenDocument; CloseDocumentCallPtr CloseDocument; GetNextTextRunCallPtr GetNextTextRun; GetTextRunInfoCallPtr GetTextRunInfo; } ExtractorRec, *ExtractorRecPtr; /* OpenExtractor loads the code fragment belonging to the Text Extractor Plug-in referred to by the file system specification record referred to by its spec parameter. If successful, it returns a pointer to a structure containing pointers to the plug-in's routines. */ static OSStatus OpenExtractor(FSSpec *spec, ExtractorRecPtr *extractor) { ExtractorRecPtr extr; Str63 fragName; Ptr mainAddr; Str255 errName; Boolean fragmentExists; /* tracks contents of fragConnID */ CFragConnectionID fragConnID; CFragSymbolClass symbolClass; OSStatus err; /* set up locals to a known state */ extr = NULL; fragmentExists = false; /* allocate the storage for saving information about the plug-in. */ extr = (ExtractorRecPtr) NewPtrClear(sizeof(ExtractorRec)); if (extr == NULL) { err = memFullErr; goto bail; } /* set up the plug-in's code fragment for use. */ err = GetDiskFragment(spec, 0, kCFragGoesToEOF, fragName, kPrivateCFragCopy, &fragConnID, &mainAddr, errName); if (err != noErr) goto bail; fragmentExists = true; extr->connID = fragConnID; /* save pointers to the routines we want to call. */ err = FindSymbol(fragConnID, "\pIAPluginInit", (Ptr*) &extr->PluginInit, &symbolClass); if (err != noErr) goto bail; err = FindSymbol(fragConnID, "\pIAPluginTerm", (Ptr*) &extr->PluginTerm, &symbolClass); if (err != noErr) goto bail; err = FindSymbol(fragConnID, "\pIAGetExtractorVersion", (Ptr*) &extr->GetExtractorVersion, &symbolClass); if (err != noErr) goto bail; err = FindSymbol(fragConnID, "\pIACountSupportedDocTypes", (Ptr*) &extr->CountSupportedDocTypes, &symbolClass); if (err != noErr) goto bail; err = FindSymbol(fragConnID, "\pIAGetIndSupportedDocType", (Ptr*) &extr->GetIndSupportedDocType, &symbolClass); if (err != noErr) goto bail; err = FindSymbol(fragConnID, "\pIAOpenDocument", (Ptr*) &extr->OpenDocument, &symbolClass); if (err != noErr) goto bail; err = FindSymbol(fragConnID, "\pIACloseDocument", (Ptr*) &extr->CloseDocument, &symbolClass); if (err != noErr) goto bail; err = FindSymbol(fragConnID, "\pIAGetNextTextRun", (Ptr*) &extr->GetNextTextRun, &symbolClass); if (err != noErr) goto bail; err = FindSymbol(fragConnID, "\pIAGetTextRunInfo", (Ptr*) &extr->GetTextRunInfo, &symbolClass); if (err != noErr) goto bail; /* return successfully */ *extractor = extr; return noErr; bail: if (fragmentExists) CloseConnection(&fragConnID); if (extr != NULL) DisposePtr((Ptr) extr); return err; } /* CloseExtractor unloads the plug-in's code fragment and releases storage allocated when it was opened. */ static void CloseExtractor(ExtractorRecPtr extr) { /* close the code fragment manager connection to the plug-in's file. */ CloseConnection(&extr->connID); /* release the memory we were using to track the plug-in's code fragment. */ DisposePtr((Ptr) extr); } |
Listing 15. Routines for setting up a Text Extractor Plug-in's code fragment for execution.
The prototypes provided in Listing 15 allow us to call back to
the plug-in.
Pointers to these routines are stored in the ExtractorRec
structure.
Setting up the IAPluginInitBlock
structure
Routines for setting up a IAPluginInitBlock
structure are provided in Listing 16. Here, callbacks used by the
plug-in
are referenced in the structure saving routine descriptors referring to
them in the structure.
/* routines exported in the IAPluginInitBlock record. Here we have defined our own set of routines that call through to the Mac OS memory manager. */ static void* MyIAAlloc(UInt32 inSize) { return (void*) NewPtr(inSize); } static void MyIAFreeProc(void* object) { DisposePtr((Ptr) object); } static UInt8 MyIAIdleProc(void) { return 0; } /* NewIAPluginInitBlock allocates a new init block record containing memory allocation routines and idle routines that can be called by a plug-in. If an error occurs, the function returns NULL. */ static OSStatus NewIAPluginInitBlock(IAPluginInitBlockPtr *iapBlock) { IAPluginInitBlockPtr iBlock; OSStatus err; iBlock = NULL; iBlock = (IAPluginInitBlockPtr) NewPtrClear(sizeof(IAPluginInitBlock)); if (iBlock == NULL) { err = memFullErr; goto bail; } iBlock->Alloc = NewIAAllocProc(MyIAAlloc); if (iBlock->Alloc == NULL) { err = memFullErr; goto bail; } iBlock->Free = NewIAFreeProc(MyIAFreeProc); if (iBlock->Free == NULL) { err = memFullErr; goto bail; } iBlock->Idle = NewIAIdleProc(MyIAIdleProc); if (iBlock->Idle == NULL) { err = memFullErr; goto bail; } *iapBlock = iBlock; return noErr; bail: if (iBlock != NULL) { if (iBlock->Alloc != NULL) DisposeRoutineDescriptor((UniversalProcPtr) iBlock->Alloc); if (iBlock->Free != NULL) DisposeRoutineDescriptor((UniversalProcPtr) iBlock->Free); if (iBlock->Idle != NULL) DisposeRoutineDescriptor((UniversalProcPtr) iBlock->Idle); DisposePtr((Ptr) iBlock); } return err; } /* DisposeIAPluginInitBlock releases the memory occupied by the init block record allocated in NewIAPluginInitBlock. */ static void DisposeIAPluginInitBlock(IAPluginInitBlockPtr iBlock) { DisposeRoutineDescriptor((UniversalProcPtr) iBlock->Alloc); DisposeRoutineDescriptor((UniversalProcPtr) iBlock->Free); DisposeRoutineDescriptor((UniversalProcPtr) iBlock->Idle); DisposePtr((Ptr) iBlock); } |
Listing 16. Routines for allocating and initializing an
IAPluginInitBlock
structure.
The routines provided in Listing 16 allocate and deallocate the
IAPluginInitBlock
structure to use routines that call the Memory Manager.
Setting up the IADocAccessorRecord
structure
The routines and declarations provided in Listing 17 illustrate how to set up the file access callbacks for a plug-in. Here, we allocate the callback structure and another structure for keeping track off the file itself.
/* MyDocumentReference contains information used by the caller to track the input source being used by the plug-in. In this example, we are using a Mac OS file. A pointer to this structure will be passed back to our file io routines. */ typedef struct { FSSpec spec; /* a copy of the file specification record */ Boolean docOpen; /* true when document is open */ short refnum; /* file reference number */ } MyDocumentReference, *MyDocRefPtr; /* in this example, we will fill the fields of the IADocAccessorRecord with routine descriptors referring to routines that call through to the Mac OS file system. These routines are defined below. */ static OSStatus MyIADocAccessorOpenProc( IADocAccessorRef inAccessor) { MyDocRefPtr refptr; OSErr err; refptr = (MyDocRefPtr) inAccessor; err = FSpOpenDF(&refptr->spec, fsRdPerm, &refptr->refnum); if (err == noErr) refptr->docOpen = true; return (OSStatus) err; } static OSStatus MyIADocAccessorCloseProc( IADocAccessorRef inAccessor) { MyDocRefPtr refptr; refptr = (MyDocRefPtr) inAccessor; if ( ! refptr->docOpen) return errIAParamErr; FSClose(refptr->refnum); refptr->docOpen = false; return errIANoErr; } static OSStatus MyIADocAccessorReadProc( IADocAccessorRef inAccessor, void* buffer, UInt32* ioSize) { MyDocRefPtr refptr; OSErr err; refptr = (MyDocRefPtr) inAccessor; if ( ! refptr->docOpen) return errIAParamErr; /* read the data */ err = FSRead(refptr->refnum, (long *) ioSize, buffer); /* special case for end of file errors */ if (err == eofErr && *ioSize != 0) err = noErr; return (OSStatus) err; } static OSStatus MyIASetDocAccessorReadPositionProc( IADocAccessorRef inAccessor, SInt32 inMode, SInt32 inOffset) { MyDocRefPtr refptr; OSErr err; refptr = (MyDocRefPtr) inAccessor; if ( ! refptr->docOpen) return errIAParamErr; switch (inMode) { case kIAFromStartMode: err = SetFPos(refptr->refnum, fsFromStart, inOffset); break; case kIAFromCurrMode: err = SetFPos(refptr->refnum, fsFromMark, inOffset); break; case kIAFromEndMode: err = SetFPos(refptr->refnum, fsFromLEOF, inOffset); break; default: err = errIAParamErr; break; } return (OSStatus) err; } static OSStatus MyIAGetDocAccessorReadPositionProc( IADocAccessorRef inAccessor, SInt32* outPostion) { MyDocRefPtr refptr; OSErr err; refptr = (MyDocRefPtr) inAccessor; if ( ! refptr->docOpen) return errIAParamErr; err = GetFPos(refptr->refnum, outPostion); return (OSStatus) err; } static OSStatus MyIAGetDocAccessorEOFProc( IADocAccessorRef inAccessor, SInt32* outEOF) { MyDocRefPtr refptr; OSErr err; refptr = (MyDocRefPtr) inAccessor; if ( ! refptr->docOpen) return errIAParamErr; err = GetEOF(refptr->refnum, outEOF); return (OSStatus) err; } /* NewIADocAccessorRec initializes a IADocAccessorRecord with routine descriptors referring to routines that call through to the Mac OS file system. It stores a record containing information about the file in the docAccessor field of the IADocAccessorRecord record. If an error occurs, th function returns NULL. */ static OSStatus NewIADocAccessorRec( FSSpec *targetFile, IADocAccessorPtr *docAccRec) { IADocAccessorPtr docAcc; MyDocRefPtr refptr; OSStatus err; refptr = NULL; refptr = (MyDocRefPtr) NewPtrClear(sizeof(MyDocumentReference)); if (refptr == NULL) goto bail; refptr->spec = *targetFile; refptr->docOpen = false; refptr->refnum = 0; docAcc = (IADocAccessorPtr) NewPtrClear(sizeof(IADocAccessorRecord)); if (docAcc == NULL) { err = memFullErr; goto bail; } docAcc->docAccessor = (IADocAccessorRef) refptr; docAcc->OpenDoc = NewIADocAccessorOpenProc(MyIADocAccessorOpenProc); if (docAcc->OpenDoc == NULL) { err = memFullErr; goto bail; } docAcc->CloseDoc = NewIADocAccessorCloseProc(MyIADocAccessorCloseProc); if (docAcc->CloseDoc == NULL) { err = memFullErr; goto bail; } docAcc->ReadDoc = NewIADocAccessorReadProc(MyIADocAccessorReadProc); if (docAcc->ReadDoc == NULL) { err = memFullErr; goto bail; } docAcc->SetReadPosition = NewIASetDocAccessorReadPositionProc( MyIASetDocAccessorReadPositionProc); if (docAcc->SetReadPosition == NULL) { err = memFullErr; goto bail; } docAcc->GetReadPosition = NewIAGetDocAccessorReadPositionProc( MyIAGetDocAccessorReadPositionProc); if (docAcc->GetReadPosition == NULL) { err = memFullErr; goto bail; } docAcc->GetEOF = NewIAGetDocAccessorEOFProc( MyIAGetDocAccessorEOFProc); if (docAcc->GetEOF == NULL) { err = memFullErr; goto bail; } *docAccRec = docAcc; return noErr; bail: if (refptr != NULL) DisposePtr((Ptr) refptr); if (docAcc != NULL) { if (docAcc->OpenDoc != NULL) DisposeRoutineDescriptor((UniversalProcPtr) docAcc->OpenDoc); if (docAcc->CloseDoc != NULL) DisposeRoutineDescriptor((UniversalProcPtr) docAcc->CloseDoc); if (docAcc->ReadDoc != NULL) DisposeRoutineDescriptor((UniversalProcPtr) docAcc->ReadDoc); if (docAcc->SetReadPosition != NULL) DisposeRoutineDescriptor((UniversalProcPtr) docAcc->SetReadPosition); if (docAcc->GetReadPosition != NULL) DisposeRoutineDescriptor((UniversalProcPtr) docAcc->GetReadPosition); if (docAcc->GetEOF != NULL) DisposeRoutineDescriptor((UniversalProcPtr) docAcc->GetEOF); DisposePtr((Ptr)docAcc); } return err; } /* DisposeIADocAccessorRec releases a IADocAccessorRecord allocated by NewIADocAccessorRec. All o the sub fields are deallocated, and, if the file is open, it is closed before the structure is deallocated. */ static void DisposeIADocAccessorRec(IADocAccessorPtr docAcc) { MyDocRefPtr refptr; /* destroy the document reference */ refptr = (MyDocRefPtr) docAcc->docAccessor; /* make sure the file is closed - incase we're aborting */ if (refptr->docOpen) FSClose(refptr->refnum); DisposePtr((Ptr) refptr); /* release the accessor structure */ DisposeRoutineDescriptor((UniversalProcPtr) docAcc->OpenDoc); DisposeRoutineDescriptor((UniversalProcPtr) docAcc->CloseDoc); DisposeRoutineDescriptor((UniversalProcPtr) docAcc->ReadDoc); DisposeRoutineDescriptor((UniversalProcPtr) docAcc->SetReadPosition); DisposeRoutineDescriptor((UniversalProcPtr) docAcc->GetReadPosition); DisposeRoutineDescriptor((UniversalProcPtr) docAcc->GetEOF); DisposePtr((Ptr) docAcc); } |
Listing 17. Routines for allocating and initializing a
IADocAccessorRecord
.
In Listing 17, we use File Manager calls to access the file.
For tracking information
used by the File Manager, we store a pointer to a private structure
containing that
information in the docAccessor
field of the IADocAccessorRecord
.
Calling a Text Extractor Plug-in
The routine provided in Listing 18 calls the Text Extractor Plug-in to gather textual information from a file. The text gathered from the file is passed back to the caller through a routine the caller provides as a parameter.
/* kETBufferSize determines the size of the buffer allocated for retrieving chunks of text. */ #define kETBufferSize (1024*1) /* TextSinkProc is a call back routine provided by the caller. Text will be passed to this routine as it is extracted from the file. */ typedef OSErr (*TextSinkProc)(void* text, long length, long refcon); /* ExtractTextFromFile calls the Text Extractor Plug-in referred to by *theExtractor to extract text from the file referred to by *targetFile. While extracting text, the text will be sent to the TextSinkProc provided by the textsink parameter. refcon is a value passed through to the TextSinkProc in its refcon parameter. */ static OSErr ExtractTextFromFile( FSSpec *targetFile, FSSpec *theExtractor, TextSinkProc textsink, long refcon) { ExtractorRecPtr extractor; IAPluginInitBlockPtr initblock; IADocAccessorPtr accRec; UInt32 pluginVersion; Boolean exInited, docOpen; IADocRef docRef; Ptr etBuffer; UInt32 bytecount; OSStatus err; IAPluginRef pluginRef; /* set up locals to a known state */ extractor = NULL; initblock = NULL; accRec = NULL; exInited = false; docOpen = false; etBuffer = NULL; /* initialize the plug-in */ err = OpenExtractor(theExtractor, &extractor); if (err != noErr) goto bail; /* initialize the callbacks used by the plug-in for basic memory tasks. */ err = NewIAPluginInitBlock(&initblock); if (err != noErr) goto bail; /* call the plug-in's initialization routine. */ err = extractor->PluginInit(initblock, &pluginRef); if (err != noErr) goto bail; exInited = true; /* query the plug-in to find out if we're using the interface we're using is in sync with the interface it was built to use. */ err = extractor->GetExtractorVersion(pluginRef, &pluginVersion); if (err != noErr) goto bail; if (pluginVersion != kIAExtractorVersion1) { err = errIAParamErr; goto bail; } /* initialize the callbacks used by the plug-in for file input with our document. */ err = NewIADocAccessorRec(targetFile, &accRec); if (err != noErr) goto bail; /* allocate a memory buffer for reading */ etBuffer = NewPtr(kETBufferSize); if (etBuffer == NULL) { err = memFullErr; goto bail; } /* call the plug-in and ask it to open the document for input. */ err = extractor->OpenDocument(pluginRef, accRec, &docRef); if (err != noErr) goto bail; docOpen = true; /* Here, we loop until the plug-in returns no more bytes */ while (true) { /* attempt to fill the entire buffer with text. */ bytecount = kETBufferSize; err = extractor->GetNextTextRun(docRef, etBuffer, &bytecount); /* if some other error occurs, such as eofErr... we exit... */ if (err != noErr) goto bail; /* errIAEndOfTextRun is returned when the language encoding changes. in this case, we do nothing, but in some cases we may wish to do some additional processing. */ if (err == errIAEndOfTextRun) { /* we don't check the bytecount here because conceivably errIAEndOfTextRun could be returned with a zero sized buffer simply to indicate the beginning of a new character encoding range in cases where the last call read all of the characters from the last encoding run. */ /* normal termination occurs when zero bytes are returned. */ } else if (bytecount == 0) break; /* at this point, we have a chunk of text from the from the document. Here, we pass it back to the caller's sink. */ err = textsink(etBuffer, bytecount, refcon); if (err != noErr) goto bail; } /* at this point, all of the text in the document has been read. Now, we close down the document by asking the plug-in to close, disposing of the memory buffer, and then disposing the file input callback structure. DisposeIADocAccessorRec is defined in Listing 17. */ extractor->CloseDocument(docRef); docOpen = false; DisposePtr(etBuffer); etBuffer = NULL; DisposeIADocAccessorRec(accRec); accRec = NULL; /* After closing the document, the plug-in is released. This is done by calling the plug-in's termination procedure, releasing the memory allocation callbacks (DisposeIAPluginInitBlock is defined in Listing 16) and then releasing the plug-in's code fragment (CloseExtractor is defined in Listing 15). */ extractor->PluginTerm(pluginRef); exInited = false; DisposeIAPluginInitBlock(initblock); initblock = NULL; CloseExtractor(extractor); extractor = NULL; /* return success */ return noErr; bail: /* error handling code. note, ordering of the recovery statements is important. */ if (docOpen) extractor->CloseDocument(docRef); if (etBuffer != NULL) DisposePtr(etBuffer); if (accRec != NULL) DisposeIADocAccessorRec(accRec); if (exInited) extractor->PluginTerm(pluginRef); if (initblock != NULL) DisposeIAPluginInitBlock(initblock); if (extractor != NULL) CloseExtractor(extractor); return err; } |
Listing 18. Sample routine for that calls a Text Extractor Plug-in.
The routine provided in Listing 18 performs the actual text extraction by calling the plug-in's routines directly. In this example, no attention is paid to the language encoding or character encoding, but this example could easily be modified to return this information. This routine uses structures and calls routines defined in Listing 15, Listing 16, and Listing 17.
Index of Code Listings
The following code listings are provided in this document. Listings 5 through 14 define the content of the sample plug-in, and listings 15 through 18 illustrate how to call a plug-in from an application.
- Listing 1. A sample HTML file.
- Listing 2. A sample
'mimp'
resource for PDF files. - Listing 3.
Declaration of
the
IAPluginInitBlock
structure and prototypes that can be used for calling the routines referenced in the structure. - Listing 4. Declaration of
the
IADocAccessorRecord
structure and prototypes that can be used for calling the routines referenced in the structure. - Listing 5. File header & imports for Text Extractor Plug-ins.
- Listing 6.
IAPluginInit
example. - Listing 7.
IAPluginTerm
example. - Listing
8.
IAGetExtractorVersion
example. - Listing
9.
IACountSupportedDocTypes
example. - Listing
10.
IAGetIndSupportedDocType
example. - Listing 11.
IAOpenDocument
example. - Listing 12.
IACloseDocument
example. - Listing 13.
IAGetNextTextRun
example. - Listing 14.
IAGetTextRunInfo
example. - Listing 15. Routines for setting up a Text Extractor Plug-in's code fragment for execution.
- Listing 16. Routines for
allocating and initializing an
IAPluginInitBlock
structure. - Listing
17. Routines for allocating and initializing a
IADocAccessorRecord
. - Listing 18. Sample routine for that calls a Text Extractor Plug-in.
References
Technote TN1141, "Extending and Controlling Sherlock."
Technote TN1180, "Sherlock's Find By Content Library."
RFC1521, "MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies." N. Borenstein, N. Freed. September 1993.
Downloadables
Acrobat version of this Note (124K). |