Important: The information in this document is Not Recommended and should not be used for new development.

Current information on this Reference Library topic can be found here:

Searching Volumes - Solutions and Problems

This Technical Note discusses the PBCatSearch function and explains why it should be used. It also provides simple algorithms for searching both MFS and HFS volumes and discusses the problems with indexed search routines.

This note includes information on PBCatSearch and notes the problems with indexed search routines. Thanks to John Norstad at Northwestern University for pointing out some of the shortcomings of the indexed search routines. Thanks to the System 7 engineering team for adding PBCatSearch.

[Oct 01 1988]






Introduction

It may be necessary to search the volume hierarchy for files or directories with specific characteristics. Generally speaking, your application should avoid searching entire volumes because searching can be a very time-consuming process on a large volume. Your application should rely instead on files being in specific directories (the same directory as the application, or in one of the system-related folders that can be found with FindFolder) or on having the user find files with Standard File.

Searching MFS Volumes

Under MFS, indexed calls to PBGetFInfo return information about all files on a given volume. Under HFS, the same technique returns information only about files in the current directory. Here's a short code snippet showing how to use PBGetFInfo to list all files on an MFS volume:

    PROCEDURE EnumMFS (theVRefNum: Integer);
    { search the MFS volume specified by theVRefNum }
       VAR
          pb: ParamBlockRec;
          itemName: Str255;
          index: Integer;
          err: OSErr;
    BEGIN
       WITH pb DO
          BEGIN
             ioNamePtr := @itemName;
             ioVRefNum := theVRefNum;
             ioFVersNum := 0;
          END;
       index := 1;
       REPEAT
          pb.ioFDirIndex := index;
          err := PBGetFInfoSync(@pb);
          IF err = noErr THEN
             BEGIN
                { do something useful with the file information in pb }
             END;
          index := index + 1;
       UNTIL err <> noErr;
    END;

As noted in Macintosh Technical Note #66, a directory signature of $D2D7 means a volume is an MFS volume, while a directory signature of $4244 means the volume is an HFS volume.
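The signature test above can be sketched in C as follows. This is only an illustration: the names VolumeKind and ClassifyVolumeSignature are hypothetical, and the sketch assumes your program has already obtained the volume's signature word by the means described in Technical Note #66.

```c
/* Directory signatures quoted in this Note. */
enum {
    kMFSSignature = 0xD2D7,   /* MFS volume */
    kHFSSignature = 0x4244    /* HFS volume */
};

/* Hypothetical result type, introduced for this sketch only. */
typedef enum { kVolumeUnknown, kVolumeMFS, kVolumeHFS } VolumeKind;

/* Map a signature word to the kind of volume it identifies. */
VolumeKind ClassifyVolumeSignature(unsigned short sigWord)
{
    if (sigWord == kMFSSignature)
        return kVolumeMFS;
    if (sigWord == kHFSSignature)
        return kVolumeHFS;
    return kVolumeUnknown;    /* neither signature: do not guess */
}
```

A caller would use the MFS enumeration shown earlier for kVolumeMFS, one of the HFS techniques described below for kVolumeHFS, and refuse to search for kVolumeUnknown.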

Searching HFS Volumes

Fast, Reliable Searches Using PBCatSearch

The fastest and most reliable way to search an HFS volume's catalog is with the File Manager's PBCatSearch function. PBCatSearch returns a list of FSSpec records to files or directories that match the search criteria specified by your application. However, PBCatSearch is not available on all volumes or under all versions of the File Manager. Volumes that support PBCatSearch can be identified using the PBHGetVolParms function. (See the following code.) Versions of the File Manager that support PBCatSearch can be identified with the gestaltFSAttr Gestalt selector and gestaltFullExtFSDispatching bit as shown in the following code:

    FUNCTION HasCatSearch (vRefNum: Integer): Boolean;
    { See if volume specified by vRefNum supports PBCatSearch }
       VAR
          pb: HParamBlockRec;
          infoBuffer: GetVolParmsInfoBuffer;
          attrib: LongInt;

    BEGIN
       HasCatSearch := FALSE; { default to no PBCatSearch support }
       IF GestaltAvailable THEN { See Inside Macintosh Volume VI, Chapter 3 }
          IF Gestalt(gestaltFSAttr, attrib) = noErr THEN
             IF BTst(attrib, gestaltFullExtFSDispatching) THEN
                BEGIN { this version of the File Manager can call PBCatSearch }
                   WITH pb DO
                      BEGIN
                         ioNamePtr := NIL;
                         ioVRefNum := vRefNum;
                         ioBuffer := @infoBuffer;
                         ioReqCount := sizeof(infoBuffer);
                      END;
                   IF PBHGetVolParmsSync(@pb) = noErr THEN
                      IF BTST(infoBuffer.vMAttrib, bHasCatSearch) THEN
                         HasCatSearch := TRUE; { volume supports PBCatSearch }
                END;
    END;

Note:
File servers that support the AppleTalk Filing Protocol (AFP) version 2.1 support PBCatSearch. That includes volumes and directories shared by System 7 File Sharing and by the AppleShare 3.0 file server. Although AFP version 2.1 supports PBCatSearch, the fsSBNegate bit is not supported in the ioSearchBits field. Using PBCatSearch to ask the file server to perform the search is usually faster than using the recursive indexed search described in the next section.


PBCatSearch should be used if it is available because it is usually much faster than a recursive search. For example, the search time for finding all files and directories on a recent Developer CD was around 18 seconds with PBCatSearch. It took 6 minutes and 36 seconds with a recursive indexed search. How long do you want the users of your application to wait?

PBCatSearch can be used to collect a list of FSSpec records to all items on a volume by setting ioSearchBits in the parameter block to 0.

Recursive Indexed Searches Using PBGetCatInfo

When PBCatSearch is not available, an application must resort to a recursive indexed search. A recursive indexed search has two potential problems: it can use up a lot of stack space, and the volume directory structure can change during the search in the multi-user/multiprocess Macintosh environment. The example code in this Note addresses the stack space problem but, for reasons explained later, does not address problems caused by multiple users or processes changing the volume directory structure during a recursive search.

The default stack space on the Macintosh can be as small as 8K; therefore, the recursive indexed search example shown in this Note encloses the actual recursive routine in a shell that can hold most of the variables needed, which dramatically reduces the size of the stack frame. This example uses only 26 bytes of stack space each time the routine recurses. That is, it could search 100 levels deep (pretty unlikely) and use only 2600 bytes of stack space.
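The arithmetic behind those figures can be written out as a small C sketch. The function names are hypothetical, and the second function is deliberately optimistic: it assumes the whole stack is available to the search, which is never true in a real application because other stack frames are already in use.

```c
/* Stack bytes consumed by recursing `depth` levels at `bytesPerLevel` each. */
long StackBytesForDepth(long depth, long bytesPerLevel)
{
    return depth * bytesPerLevel;
}

/* Upper bound on recursion depth if `stackBytes` were entirely free. */
long MaxDepthForStack(long stackBytes, long bytesPerLevel)
{
    return stackBytes / bytesPerLevel;
}
```

With the 26-byte frame quoted above, StackBytesForDepth(100, 26) gives the 2600 bytes mentioned in the text, and an 8K stack (8192 bytes) bounds the depth at MaxDepthForStack(8192, 26), or 315 levels, before any other stack use is counted.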

Please notice that when the routine comes back from recursing, it has to clear the nonlocal variable err to noErr, since the reason the routine came back from recursing is that PBGetCatInfo returned an error:

    EnumerateCatalog(myCPB.ioDrDirID);
    err := noErr; {clear error return on way back}

Please notice also that you must set myCPB.ioDrDirID each time you call PBGetCatInfo, because if PBGetCatInfo gets information about a file, it returns ioFlNum (the file number) in the same location that ioDrDirID previously occupied.

Be sure to check bit 4, the fifth least significant bit, of the file attributes byte (ioFlAttrib) to see whether you've got a file or a folder. The following routine uses MPW Pascal's BTST function to check that bit. If you use the Toolbox bit manipulation routines (e.g., BitTst), remember that they number the bits in reverse order from standard 68000 notation.
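The difference between the two bit-numbering schemes can be illustrated with a portable C sketch. The three functions below are hypothetical stand-ins written for this illustration, not Toolbox calls; TestBitToolboxStyle only imitates the numbering used by BitTst, which counts from the high-order bit of the byte pointed to.

```c
/* 68000 notation: bit 0 is the least significant bit of the byte. */
int TestBit68K(unsigned char value, int bit)
{
    return (value >> bit) & 1;
}

/* Toolbox-style numbering: bit 0 is the most significant bit of the
   first byte at the given address, and numbers grow toward the LSB. */
int TestBitToolboxStyle(const unsigned char *bytes, long bitNum)
{
    return (bytes[bitNum / 8] >> (7 - (bitNum % 8))) & 1;
}

/* Directory test on a copy of ioFlAttrib: bit 4 is mask 0x10. */
int IsDirectory(unsigned char flAttrib)
{
    return (flAttrib & 0x10) != 0;
}
```

For an attributes byte of 0x10, TestBit68K(attrib, 4) is 1, while the Toolbox-style test finds the same bit at position 7 - 4 = 3. Testing position 4 with the Toolbox-style numbering would examine a different bit and report 0.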

Here is the routine in MPW Pascal:

    PROCEDURE EnumerShell (vRefNumToSearch: Integer; { the vRefNum to search }
                           dirIDToSearch: LongInt);  { the dirID to search }
       VAR
          itemName: Str63;
          myCPB: CInfoPBRec;
          err: OSErr;

       {-----}

       PROCEDURE EnumerateCatalog (dirIDToSearch: LongInt);
          CONST
             ioDirFlgBit = 4;
          VAR
             index: Integer;
       BEGIN { EnumerateCatalog }
          index := 1;
          REPEAT
             WITH myCPB DO
                BEGIN
                   ioFDirIndex := index;
                   ioDrDirID := dirIDToSearch; { we need to do this every }
                                               { time through }
                   filler2 := 0; { Clear the ioACUser byte if search is  }
                                 { interested in it. Nonserver volumes }
                                 { won't clear it for you and the value  }
                                 { returned is meaningless. }
                END;
             err := PBGetCatInfo(@myCPB, FALSE);
             IF err = noErr THEN
                IF BTST(myCPB.ioFlAttrib, ioDirFlgBit) THEN
                   BEGIN { we have a directory }

                      { do something useful with the directory information }
                      { in myCPB }

                      EnumerateCatalog(myCPB.ioDrDirID);
                      err := noErr; {clear error return on way back}
                   END
                ELSE
                   BEGIN { we have a file }

                      { do something useful with the file information }
                      { in myCPB }

                   END;
             index := index + 1;
          UNTIL (err <> noErr);
       END;  { EnumerateCatalog }

       {-----}

    BEGIN { EnumerShell }
       WITH myCPB DO
          BEGIN
             ioNamePtr := @itemName;
             ioVRefNum := vRefNumToSearch;
          END;
       EnumerateCatalog(dirIDToSearch);
    END; { EnumerShell }

In MPW C:

    /* the following variables are globals */
    HFileInfo      gMyCPB;           /* for the PBGetCatInfo call */
    Str63          gItemName;        /* place to hold file name */
    OSErr          gErr;             /* the usual */

/*---------------------------------------------------------------------*/

    void EnumerateCatalog (long int dirIDToSearch)
    {  /* EnumerateCatalog */

       short int          index=1;
       do
       {
          gMyCPB.ioFDirIndex= index;
          gMyCPB.ioDirID= dirIDToSearch; /* we need to do this every time    */
                                         /* through, since GetCatInfo        */
                                         /* returns ioFlNum in this field */
          gMyCPB.filler2= 0; /* Clear the ioACUser byte if search is         */
                             /* interested in it. Nonserver volumes won't  */
                             /* clear it for you and the value returned is */
                             /* meaningless. */
          gErr= PBGetCatInfo(&gMyCPB,false);
          if (gErr == noErr)
          {
             if ((gMyCPB.ioFlAttrib & ioDirMask) != 0)
             {  /* we have a directory */

                /* do something useful with the directory information */
                /* in gMyCPB */

                EnumerateCatalog(gMyCPB.ioDirID); /* recurse */
                gErr = noErr; /* clear error return on way back */
             }
             else
             {  /* we have a file */

                /* do something useful with the file information */
                /* in gMyCPB */

             }
          }
          ++index;
       } while (gErr == noErr);
    }  /* EnumerateCatalog */

/*---------------------------------------------------------------------*/

    void EnumerShell(short int vRefNumToSearch, long int dirIDToSearch)

    {  /* EnumerShell */
       gMyCPB.ioNamePtr = gItemName;
       gMyCPB.ioVRefNum = vRefNumToSearch;
       EnumerateCatalog(dirIDToSearch);
    }  /* EnumerShell */

Please make sure that you are running under HFS before you use this routine (see Technical Note #66). You can search the entire volume by specifying a starting directory ID of fsRtDirID, the root directory constant. You can do partial searches of a volume by specifying a starting directory ID other than fsRtDirID.
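To experiment with the control flow of the routines above without a Macintosh, the following self-contained C sketch replaces the File Manager with a small in-memory catalog. Everything prefixed with Mock or g, and the catalog contents themselves, are hypothetical and exist only for this illustration. MockGetCatInfo stands in for an indexed PBGetCatInfo call and returns -43 (fnfErr) when the index runs past the last entry of a directory.

```c
#include <stddef.h>

enum { kNoErr = 0, kFnfErr = -43, kRootDirID = 2 };  /* 2 is fsRtDirID */

typedef struct {
    long parentID;     /* directory that contains this entry */
    long id;           /* directory ID, or file number for a file */
    int isDir;         /* nonzero for a directory */
    const char *name;
} MockEntry;

/* Made-up catalog: 3 directories and 4 files. */
static const MockEntry gCatalog[] = {
    { 2, 16, 1, "Development" },
    { 2, 17, 1, "Games" },
    { 2, 18, 0, "Read Me" },
    { 16, 19, 0, "Notes" },
    { 16, 20, 1, "Sources" },
    { 20, 21, 0, "main.c" },
    { 17, 22, 0, "Chess" }
};
#define kCatalogCount (sizeof gCatalog / sizeof gCatalog[0])

/* Shared state kept outside the recursive routine, as in the Note. */
static const MockEntry *gItem;
static int gErr;
static int gFileCount;
static int gDirCount;

/* Return the index-th (1-based) entry of directory dirID in gItem. */
int MockGetCatInfo(long dirID, int index)
{
    size_t i;
    int seen = 0;
    for (i = 0; i < kCatalogCount; i++) {
        if (gCatalog[i].parentID == dirID && ++seen == index) {
            gItem = &gCatalog[i];
            return kNoErr;
        }
    }
    return kFnfErr;    /* index is past the end of this directory */
}

void EnumerateMock(long dirID)
{
    int index = 1;     /* the only per-level local variable */
    do {
        gErr = MockGetCatInfo(dirID, index);
        if (gErr == kNoErr) {
            if (gItem->isDir) {
                gDirCount++;
                EnumerateMock(gItem->id);  /* recurse into the directory */
                gErr = kNoErr;  /* clear error return on way back */
            } else {
                gFileCount++;
            }
        }
        ++index;
    } while (gErr == kNoErr);
}

/* Reset the counters and walk the whole mock volume. */
void CountMockCatalog(void)
{
    gFileCount = 0;
    gDirCount = 0;
    EnumerateMock(kRootDirID);
}
```

Calling CountMockCatalog() leaves 4 in gFileCount and 3 in gDirCount. If the line that clears gErr after the recursive call is removed, the walk stops as soon as it returns from the first subdirectory, which is the pitfall the Note warns about.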

Searching in a Multi-user/Multiprocess Environment

Volumes can be shared by multiple users accessing a file server or multiple processes running on a single Macintosh. Each user or process with access to such a shared volume may be able to make changes to the volume's catalog at any time. Changes in a volume's catalog in the middle of a search can cause two problems:

  • Files and directories renamed or moved by another user or process can be entirely missed or found multiple times by a search routine.
  • A search routine can easily lose track of its position within the hierarchical directory structure when files or directories are created, deleted, or renamed by another user or process.

Searching a volume with a single call to PBCatSearch ensures that all parts of the volume are searched without another user or process changing the volume's catalog. However, a single call to PBCatSearch may not be possible or practical, either because of the number of matches you expect or because you want to set a time limit on the search so that the user can cancel a long search. When the catalog of a volume is changed by another user or process in a way that might affect the current search, PBCatSearch returns catChangedErr (-1304) and no matches. The search can be continued with the CatPositionRec returned with the catChangedErr error, but at the risk of missing catalog entries or finding duplicate catalog entries.
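When a search is split across several PBCatSearch calls, the caller has to decide what to do after each result code. The C sketch below shows one possible decision table. The SearchAction type and the function name are hypothetical. The value -1304 comes from this Note. The sketch also assumes that noErr means the call stopped early (match buffer full or time limit reached) with more of the catalog still to search, and that eofErr (-39) means the entire catalog has been searched.

```c
/* Result codes, renamed with a k prefix to keep the sketch self-contained. */
enum {
    kNoErr         = 0,      /* assumed: call stopped early, more to search */
    kEofErr        = -39,    /* assumed: the whole catalog has been searched */
    kCatChangedErr = -1304   /* catalog changed in a way that affects the search */
};

/* Hypothetical set of next steps for a multi-call search. */
typedef enum {
    kKeepSearching,    /* call again with the returned CatPositionRec */
    kSearchComplete,   /* nothing left to search */
    kCatalogChanged,   /* restart, or continue and accept misses/duplicates */
    kSearchFailed      /* any other error: stop and report it */
} SearchAction;

SearchAction ActionForCatSearchResult(int err)
{
    switch (err) {
    case kNoErr:
        return kKeepSearching;
    case kEofErr:
        return kSearchComplete;
    case kCatChangedErr:
        return kCatalogChanged;
    default:
        return kSearchFailed;
    }
}
```

An application that must see every item once and only once would restart the search on kCatalogChanged, while one that can tolerate an occasional duplicate or missed entry could continue from the returned position.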

Things aren't so nice for search routines based on indexed File Manager calls. The File Manager won't notify you when a volume's catalog has changed. In fact, there are several ways the catalog can change that are very difficult to detect and correct for. Since methods that attempt to resynchronize an indexed search and find all catalog entries that might be missed or found multiple times when the catalog changes do not work for all cases, those methods are not discussed in this Technical Note. The following paragraphs describe why some changes are very difficult to detect.

There are three changes you can make to the contents of a directory that change the list of files and directories returned by an indexed search: creating, deleting, and renaming. Directories of an HFS volume are always sorted alphabetically, so when a file or subdirectory is deleted from a directory, all directory entries after it bubble up to fill the vacated entry position; when a file or subdirectory is created, it is inserted into the list and all entries after it bubble down one position. When a file or subdirectory is renamed, it is removed from its current position and moved into its alphabetically correct position. The first two changes, creating and deleting, can be detected only at the parent directory level. That's because a creation or deletion changes only the modification date of the parent directory but not the modification date of any of the parent directory's ancestors. Renaming a file or subdirectory does not change the modification date of the file or subdirectory renamed or the modification date of its parent directory, but it does change the order of files and subdirectories found by an indexed search.

With this in mind, here are a couple of examples that are very difficult to detect.

The first example shows a file, Dashboard, moved (by another user or process) with PBCatMove from the CDevs subdirectory to the Control Panels subdirectory. (See Figures 1 and 2.) At the time of the move, the search routine has just finished recursively looking through the Development directory and is ready to recursively search the Games directory. After the move, two directories, CDevs and Control Panels, have new modification dates, but no change is seen at the root directory of My Disk. There is nothing to immediately tell the search routine that something has changed (except for the volume modification date, which may or may not mean the directory structure has changed), so the search will see Dashboard twice. If the move were in the opposite direction, from Control Panels to CDevs, Dashboard would be missed by the search routine.

Figure 1. Before Dashboard Is Moved With PBCatMove

Figure 2. After Dashboard Is Moved With PBCatMove

The second example (see Figures 3 and 4) shows a directory, Toys, renamed (by another user or process) with PBHRename to Games. At the time of the rename, the search routine has seen the files Aardvark and Letter and is looking at the third object in the directory, the file Résumé. After the rename, the index pointer is still pointing at the third object, but now the third object is the file Letter, a file that has already been seen by the search. This change cannot be detected by looking at the parent directory's modification date because PBHRename does not change any modification dates. However, this change can be detected by checking to see if the index pointer still points to the same file or directory. The search routine could re-index through the directory to find the Résumé file again and start searching from there, but what about the directory that was renamed? The search routine either must miss it (and its contents) or it must repeat the search of the entire directory to ensure nothing is missed.

Figure 3. Before Toys Is Renamed With PBHRename

Figure 4. After Toys Is Renamed to Games With PBHRename
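The index shift in the second example can be reproduced with a small C simulation. This is only a sketch under simplifying assumptions: the directory is a sorted array of names, strcmp stands in for the HFS name ordering, the accented name is spelled "Resume" to keep the example in plain ASCII, and all type and function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define kMaxNames 8

typedef struct {
    const char *names[kMaxNames];
    int count;
} MockDir;

static int CompareNames(const void *a, const void *b)
{
    return strcmp(*(const char * const *)a, *(const char * const *)b);
}

/* Keep the directory in alphabetical order, as an HFS directory is. */
void SortDir(MockDir *d)
{
    qsort(d->names, (size_t)d->count, sizeof d->names[0], CompareNames);
}

/* 1-based lookup, like ioFDirIndex; returns NULL past the last entry. */
const char *NameAtIndex(const MockDir *d, int index)
{
    return (index >= 1 && index <= d->count) ? d->names[index - 1] : NULL;
}

/* Rename an entry and re-sort; returns 1 if oldName was found. */
int RenameEntry(MockDir *d, const char *oldName, const char *newName)
{
    int i;
    for (i = 0; i < d->count; i++) {
        if (strcmp(d->names[i], oldName) == 0) {
            d->names[i] = newName;
            SortDir(d);
            return 1;
        }
    }
    return 0;
}

/* Build the directory from the example: Aardvark, Letter, Resume, Toys. */
void MakeExampleDir(MockDir *d)
{
    d->names[0] = "Aardvark";
    d->names[1] = "Letter";
    d->names[2] = "Resume";
    d->names[3] = "Toys";
    d->count = 4;
    SortDir(d);
}
```

After MakeExampleDir, NameAtIndex(&d, 3) is "Resume". After RenameEntry(&d, "Toys", "Games"), the same index yields "Letter", which the search has already seen, and "Games" now sits at index 2, behind the search position. Comparing the name at the current index before and after each step is the detection method described above, and it reveals the shift but not what was skipped.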

As these examples show, a change during a search of a hierarchical directory structure with indexed File Manager calls involves the risk of missing catalog entries or finding duplicate catalog entries. If your application depends on seeing every item on a volume once and only once, you should make the users of your application aware of the problems associated with indexed searches and suggest ways to make sure the volume's catalog is not changed during the indexed search. Here's a good suggestion you could make to the user: do not use other programs during the search, because other programs may create, delete, or rename files while the search is running.

Summary

You should always use PBCatSearch to search a volume if it is available. If PBCatSearch isn't available and you must use an indexed search, be aware that it is difficult to ensure that you do not miss some catalog entries or see some catalog entries multiple times during your search.

References

Inside Macintosh, Volume IV, The File Manager

Inside Macintosh, Volume V, File Manager Extensions in a Shared Environment

Inside Macintosh, Volume VI, The Finder Interface

Inside Macintosh, Volume VI, The File Manager

Technical Note M.FL.ActiveFS -- Determining Which File System Is Active

Technical Note M.FL.PBSharePBUnshare -- PBShare, PBUnshare, and PBGetUGEntry
