OCLC Search Key Usage Patterns in a Large Research Library

Many libraries use the OCLC Online Union Catalog and Shared Cataloging Subsystem to perform various library functions, such as acquisitions and cataloging of library materials. As an initial part of these operations, users must search for and retrieve a bibliographic record for the desired item from the large OCLC database. Various types of derived search keys are available for retrieval. This study of actual search keys entered by users of the OCLC online system was conducted to determine the types of search keys users prefer for performing various library operations and to find out whether the preferred search keys are effective.


In the last decade, many information systems have been developed that use search keys to retrieve bibliographic records from large databases. The OCLC Online Union Catalog and Shared Cataloging Subsystem is one of the larger of these systems. 1-6 There are currently more than 7 million bibliographic records in the OCLC database. The OCLC online system uses search keys to access various index files that locate bibliographic records in the database. Index files are maintained for name/title, title, personal author, corporate author, CODEN, ISBN, and LCCN. The first four of these index files contain search keys that are derived from information (e.g., author, title) present in the piece or citation. Search keys in these four indexes are in general not unique, because the derived key could be the same for different bibliographic records. The last three indexes (CODEN, ISBN, and LCCN) contain search keys or identifiers that are in general unique.
A user enters a search key consisting of characters (letters, numbers, symbols, commas, hyphens) formatted according to specific rules that identify to the system which index file to search. For example, to search the name/title index, the user enters a search key consisting of the first four characters of the author's last name and the first four characters of the first nonarticle word of the title of the work, separated by a comma. To search the title index, the user enters a search key consisting of the first three characters of the first nonarticle word in the title, the first two characters of the second word, the first two characters of the third word, and the first character of the fourth word, each separated by a comma. 7
The system compares the user-entered search key with the search keys contained in that index file. This comparison results in one of three possible cases:
1. Only one index file search key matches the user-entered search key.
2. More than one index file search key matches the user-entered search key.
3. No index file search key matches the user-entered search key.
In the first case, the system retrieves the unique bibliographic record corresponding to the search key and displays it on the user's terminal screen. In the second case, the system retrieves all records that correspond to the search key, prepares truncated entries (consisting of author, title, imprint data, etc.) for those records, and displays the truncated entries on the user's terminal screen. The user then selects the truncated entry that corresponds to the desired record and requests the system to display the full record for that item. In the third case, the system replies that no record matching the user-entered search key is present in the index (a "not found" response).
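The key-derivation rules and the three match cases described above can be illustrated with a short Python sketch. It is only an approximation under stated assumptions: the function names, the three-word article list, and the dictionary used as an index are hypothetical, and the sketch ignores the punctuation and special-character rules of the actual OCLC system, which the text does not spell out.

```python
# Illustrative article list (an assumption; the real stop list is not given here).
ARTICLES = {"a", "an", "the"}


def _title_words(title):
    """Lowercased title words, with one leading article dropped."""
    words = title.lower().split()
    if words and words[0] in ARTICLES:
        words = words[1:]
    return words


def name_title_key(last_name, title):
    """4,4 key: first four characters of the author's last name and of the
    first nonarticle word of the title, separated by a comma."""
    words = _title_words(title)
    first_word = words[0] if words else ""
    return f"{last_name.lower()[:4]},{first_word[:4]}"


def title_key(title):
    """3,2,2,1 key from the first four title words; a missing word
    contributes an empty segment."""
    words = _title_words(title) + ["", "", "", ""]
    return ",".join(word[:n] for word, n in zip(words, (3, 2, 2, 1)))


def search(index, key):
    """Look up a key in a dict mapping keys to lists of record ids and
    report which of the three cases applies."""
    hits = index.get(key, [])
    if not hits:
        return "not found", []
    if len(hits) == 1:
        return "single record", hits
    return "truncated entries", hits
```

Under these assumptions, name_title_key("Hemingway", "The Old Man and the Sea") yields "hemi,old" and title_key("Gone with the Wind") yields "gon,wi,th,w". Because different works can share a derived key, the "truncated entries" case is expected for the derived indexes far more often than for identifier indexes such as LCCN.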
In the OCLC online system, 2,500 member libraries using 3,800 terminals search the OCLC database to perform various library functions such as acquisitions, monograph cataloging, and serials cataloging. Users can enter any of the types of search keys permitted by the system. A user's preference for a particular type of search key depends in part on the kind of information available about the item to be searched and on the library function to be performed. If users receive a "not found" response after entering a particular type of search key, they may then try a different type of search key that they consider next best.
The purpose of this study was to determine what types of search keys are preferred to perform various library functions and whether the preferred search keys are effective. The study also investigated what type of search key is used next when particular types of search keys are unable to retrieve the desired record to determine if there are any discernible search patterns.

MATERIALS AND METHODS
To conduct this study, data were needed on the pattern of search-key use in OCLC member libraries. Further, the data had to include the actual time of day when work was performed for a particular library function on a specific terminal. This requirement would permit identification, in the Online System Use Data collected by OCLC, of search keys entered to perform specific library functions. Ideally, a library with several OCLC terminals, each used exclusively for only one library function, was desired. The Ohio State University (OSU) Library met this requirement. The OSU Library has eleven terminals: two are used exclusively for acquisitions functions, seven are used for monograph cataloging, and one terminal each is used for serials cataloging and public use. The terminal assigned to serials cataloging is used for monograph cataloging after 5 p.m. All terminals except the public-use terminal are used exclusively by OSU library staff. The public-use terminal can be used by anyone, including faculty, students, and library staff.
Two full days' transactions for each of the OSU terminals were obtained from the OCLC Online System Use Statistics (OLSUS) file. During online operation, the system writes a record on the OLSUS file for each message entered by the user. This record includes the institution number, a number identifying the terminal from which the message came, the time of the transaction, and the first sixteen nonblank characters of the message. If the user-entered message is a search key, the system response is either a "not found" response or a "found" response.
With the "found" response, the system displays the bibliographic record (if unique) or displays a truncated entry screen. However, a "found" response does not necessarily mean that the truncated entry screen includes information about the bibliographic record the user was actually seeking.
For the study, a program was written to scan the records in the OLSUS file for two full days in October 1978. The program extracted all the records for messages that came from the eleven OSU terminals and wrote the records on two tapes--one for each day's activity. These tapes were sorted first by the terminal number and then within each terminal number by the time of transaction. Each sorted tape was fed to another program that printed, for each terminal, the actual messages in chronological order and the associated system response.
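The extract, sort, and print steps described above might look like the following Python sketch. The record layout is an assumption based only on the fields named in the text (institution number, terminal number, time, first sixteen nonblank characters of the message, and the associated response); the class and function names are hypothetical, and the original programs are not reproduced here.

```python
from dataclasses import dataclass
from itertools import groupby
from operator import attrgetter


@dataclass(frozen=True)
class UseRecord:
    """One hypothetical OLSUS-style record (field layout is assumed)."""
    institution: int   # institution number
    terminal: int      # number identifying the terminal
    time: str          # "HH:MM:SS", so string order is chronological
    message: str       # first sixteen nonblank characters of the message
    response: str      # "found" or "not found"


def extract_and_sort(records, institution):
    """Keep one institution's records, sorted by terminal and then by time."""
    selected = [r for r in records if r.institution == institution]
    return sorted(selected, key=attrgetter("terminal", "time"))


def listing_by_terminal(sorted_records):
    """Group already-sorted records into a per-terminal chronological listing."""
    return {
        terminal: [(r.time, r.message, r.response) for r in group]
        for terminal, group in groupby(sorted_records, key=attrgetter("terminal"))
    }
```

Sorting on the pair (terminal, time) in a single pass gives the same order as sorting by terminal and then by time within each terminal, and groupby relies on that order to produce one contiguous group per terminal.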
From this printout, it was possible to trace manually the complete sequence of messages entered to search for a single bibliographic item. The printout of an entire day's activity for each terminal was thus divided into sections, each section containing all transactions performed to search for a single item. For each section, the type of search key first entered and the system response were noted. In the case of a "not found" response, the type of search key next entered (if the search for the item was continued) was also noted. The results were combined for all the terminals used to perform a specific library function (e.g., acquisitions) and for the two days.
Table 1 and figure 1 show the different types of search keys used as the first choice to perform various library functions. Note that at the time of data collection for this study, the Interlibrary Loan Subsystem was not operational. During the two-day period, a total of 605 items were searched for monograph cataloging, 296 items were searched for acquisitions operations, and 94 items were searched for serials cataloging. A total of 158 items were searched on the public-use terminal. Most types of search keys were used to some extent. The use of ISBN and ISSN search keys was quite limited for all types of library functions. The CODEN search key was used only twice, both times on the public-use terminal. The corporate-author search key was not used at all. The use of the personal-author search key was much smaller than expected, probably because at the time of the study the system did not permit use of personal-author keys during the peak hours (9 a.m. to 5 p.m.) of online system operation.

RESULTS AND DISCUSSION
For the acquisitions function, the LCCN search key was used most often, followed by the name/title key. These two types of keys together were used for about 80 percent of the acquisitions items searched. For the monograph cataloging function, the most frequently used search key was the name/title key. This key was entered for about 52 percent of items searched. The next most frequently used key for monograph cataloging was the LCCN key, used for about 33 percent of the items searched. For the serials cataloging function, the title key was used most often, for more than 75 percent of the items searched. Searches performed through the public-use terminal included all types of search keys. The name/title key was used most frequently, followed by the title key.
Before performing an actual search, a user must choose, from among the various types of search keys available in the OCLC system, the particular search key to use. If the search key used for a first try (primary choice of search key) results in a "not found" response from the system, a second key may be entered (secondary choice of search key). This sequence may continue through many search-key choices until the user retrieves the desired record ("found" response) or decides to abandon the search at some point upon obtaining a "not found" response. For this study, the investigation was confined to only primary and secondary choices of search keys. The results of the "found" responses for the primary choice of key and for the secondary search key entered after receiving the first "not found" response are presented in tables 2 through 5.
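As an illustration of how the figures in tables 2 through 5 can be derived from per-item search sequences, the Python sketch below tallies the primary key, its "found" rate, and the distribution of secondary choices after a first "not found" response. The input format, the function name, and the label "abandoned" are hypothetical. Note also that the sketch treats a "found" response as a hit, whereas a "found" response in the system does not guarantee that the desired record was among the entries displayed.

```python
from collections import Counter, defaultdict


def summarize(items):
    """Summarize primary and secondary search-key choices.

    items: one list per searched item, holding (key_type, found) pairs
    in the order the keys were entered; found is True or False.
    """
    used = Counter()                  # primary key type -> items searched
    found = Counter()                 # primary key type -> "found" responses
    secondary = defaultdict(Counter)  # primary key type -> next choice counts

    for attempts in items:
        if not attempts:
            continue
        primary, hit = attempts[0]
        used[primary] += 1
        if hit:
            found[primary] += 1
        else:
            next_choice = attempts[1][0] if len(attempts) > 1 else "abandoned"
            secondary[primary][next_choice] += 1

    report = {}
    for key, n in used.items():
        not_found = n - found[key]
        report[key] = {
            "used": n,
            "hit_rate_pct": round(100 * found[key] / n, 1),
            # Each secondary count is divided by the number of not-found
            # responses for that primary key.
            "secondary_pct": {
                choice: round(100 * count / not_found, 1)
                for choice, count in secondary[key].items()
            },
        }
    return report
```

The secondary percentages are computed against the number of "not found" responses for each primary key rather than against all items searched, so they describe what searchers did after a failure, not how often a key type was used overall.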
For the acquisitions function (table 2), the most frequently used primary search key was the LCCN key, which retrieved the desired record about 89 percent of the time. When the LCCN key could not retrieve the record, users mostly chose the name/title key as their secondary choice or abandoned the search. The next most frequently used primary search key was the name/title key, which retrieved the desired record about 51 percent of the time. When the name/title key was unsuccessful, the users entered as their secondary search key a title key about 41 percent of the time, or a different name/title key about 31 percent of the time. Approximately 26 percent of the time they abandoned the search. It seems that acquisitions users mostly try the LCCN key first if it is available (the LCCN is not present in all the records) and the name/title key first if the LCCN is not available. Thus, users adopted the right approach, since the LCCN key has the highest hit rate. Furthermore, the LCCN key is more efficient than other keys because it results, on average, in fewer replies.
Note: To calculate the percentage given in parentheses, the number of "Types of Search Key Used after the First Not-found Response" was divided by the number of "Not-found Responses."
For the monograph cataloging function (table 3), the name/title key was used most often as the primary search key, resulting in retrieval of the desired record about 57 percent of the time. When the name/title key could not retrieve the record, the users next attempted a title key (52 percent of the time) or a different name/title key (21 percent of the time). About 23 percent of the time they discontinued the search. The LCCN key was the second most frequently used primary search key and successfully retrieved the record about 79 percent of the time. When the LCCN key was unsuccessful, the users tried the name/title key (58 percent of the time) as their secondary choice or abandoned the search. Unlike the search-key usage pattern for acquisitions, the use of the LCCN key for monograph cataloging was lower than use of the name/title key, although here also the hit rate was highest for the LCCN key. The reason the LCCN use was lower is that Ohio State University, being a research institution, processes a large number of items from various sources other than regular acquisitions channels, and many of these sources do not have LCCN information.
For the serials cataloging function (table 4), the title key was the most frequent primary choice and retrieved the desired records 44 percent of the time. If this key failed to retrieve the desired records, the users entered as their secondary key a different title key 55 percent of the time and a name/title key 17 percent of the time. Approximately 23 percent of the time, users decided to discontinue the search. Although the title key was used most frequently for serials cataloging, its hit rate was less than 45 percent. On the other hand, the ISSN key was used very little, but its hit rate was as high as 80 percent. The use of the ISSN key is likely to increase in the future, however, because the United States Postal Service now requires the ISSN to be present on serials. 8 Therefore, the ISSN will be more readily available to the user.
Among the searches performed through the public-use terminal (table 5), the most frequently used primary search key was the name/title key, which resulted in a successful search about 29 percent of the time. When patrons encountered a "not found" response, they tried as their secondary choice a different name/title key 29 percent of the time, or a title key 29 percent of the time. They abandoned the search 38 percent of the time. As mentioned earlier, the public-use terminal can be used by anyone, including faculty and students. The hit rate for the name/title key at this terminal was rather low. From this study, it is not possible to say whether this was due to patrons' lack of knowledge of key construction or to a lack of the information needed to construct the key.

SUMMARY AND CONCLUSIONS
Among the various types of search keys available to users, the name/title, LCCN, and title search keys were entered most frequently. The use of personal-author, ISBN, ISSN, and CODEN search keys was very limited for all library functions. Corporate-author search keys were not used at all.
For the acquisitions function, system users most frequently entered the LCCN key, followed by the name/title key. For monograph cataloging, the users entered the name/title key most frequently, followed by the LCCN key. For serials cataloging, the use of the title key was the most common. Persons using the public-use terminal entered mostly name/title and title search keys.
For acquisitions and monograph cataloging functions, the LCCN key was most successful in retrieving the desired records. The next most successful key was the name/title key. For both of these functions, when the name/title key failed to retrieve the record, users next tried the title key most of the time.
For serials cataloging, the title key was used most frequently but was not very successful in retrieving serial records. On the other hand, the ISSN key was the most successful but it was used very little.
Individual identifiers such as LCCN, ISSN, ISBN, and CODEN are very efficient search keys because they retrieve, on average, far fewer replies than other types of search keys. With the exception of LCCN, the individual identifiers were used only to a small extent. From this study, it is not possible to answer questions such as: Why weren't individual-identifier search keys used more often? Did a searcher use a name/title key even when the LCCN was available? To answer such questions, data will have to be collected on what kind of information is available to the searcher when constructing the search keys.