FRBRization of a Library Catalog: Better Collocation of Records, Leading to Enhanced Search, Retrieval, and Display

The hierarchical system of the Functional Requirements for Bibliographic Records (FRBR) defines families of bibliographic relationship between records and collocates them better than most extant bibliographic systems. Certain library materials (especially audio-visual formats) pose notable challenges to search and retrieval; the first benefits of a FRBRized system would be felt in music libraries, but research already has proven its advantages for fine arts, theology, and literature, the bulk of the non-science, technology, and mathematics collections. This report will summarize the benefits of FRBR to next-generation library catalogs and OPACs, and will review the handful of ILS and catalog systems currently operating with its theoretical structure.

Editor's note: This article is the winner of the LITA/Ex Libris Writing Award, 2007.
The following review addresses the challenges and benefits of a next-generation online public access catalog (OPAC) according to the Functional Requirements for Bibliographic Records (FRBR).1 After a brief recapitulation of the challenges posed by certain library materials (specifically, but not limited to, audiovisual materials), this report will present FRBR's benefits as a means of organizing the database and public search results from an OPAC.2 FRBR's hierarchical system of records defines families of bibliographic relationship between records and collocates them better than most extant bibliographic systems; it thus affords both library users and staff a more streamlined navigation between related items in different materials formats and among editions and adaptations of a work. In the eight years since the FRBR report's publication, a handful of working systems have been developed. The first benefits of such a system to an average academic library system would be felt in a branch music library, but research already has proven its advantages for fine arts, theology, and literature, the bulk of the non-science, technology, and mathematics collections.

■ Current search and retrieval challenges
The difficulties faced first, but not exclusively, by music users of most integrated library systems fall into two related categories: issues of materials formats, and issues of cataloging, indexing, and MARC record structure. Music libraries must collect, catalog, and support materials in more formats than anyone else; this makes their experience of the most common ILS modules (circulation, reserves, and acquisitions) by definition more complicated.
The study of music continues to rely on the interrelated use of three distinct information formats: scores (the notated manifestation of a composer's or improviser's thought), recordings (realizations in sound, and sometimes video, of such compositions and improvisations), and books and journals (intellectual thought regarding such compositions and improvisations); music libraries continue to require . . . collections that integrate [emphasis mine] these three information formats appropriately.3
Put a different way, "relatedness is a pervasive characteristic of music materials."4 This is why FRBR's model of bibliographic relationships offers benefits that will first impact the music collection.5
At present, however, musical formats pose search and retrieval challenges for most ILS users, and the problem is certainly replicated with microforms and video recordings. The MARC codes distinguish between material formats, but they support only one category for sound recordings, lumping together CD, DVD audio, cassette tape, reel-to-reel tape, and all other types.6 This single "sound recording" definition is easily reflected in OPACs (such as those powered by Innovative Interfaces' Millennium and Ex Libris' Aleph 500) and union catalogs (such as WorldCat.org).7 However, the distinction between sound recording formats is embedded in subfields of the 007 field, which presently cannot be indexed by many library automation systems because the subfields are not adjacent.
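To make the point about non-adjacent 007 data concrete, the following is a minimal sketch in Python of how a display label might be derived by combining two separate character positions of a sound-recording 007 string. The lookup tables are a small, partial subset of the MARC 21 codes as I understand them, and the function name and the labels it returns are assumptions made for illustration; no vendor's indexing logic is represented here.

```python
# Partial, illustrative subset of MARC 21 007 sound-recording codes.
# Position 01 = specific material designation; position 03 = speed.
MATERIAL = {"d": "disc", "s": "cassette", "t": "tape reel"}
SPEED = {"b": "33 1/3 rpm", "c": "45 rpm", "d": "78 rpm", "f": "1.4 m/s"}


def sound_format(field_007: str) -> str:
    """Return a hypothetical display label for a sound-recording 007 string.

    The label depends on two non-adjacent positions (01 and 03), which is
    why a simple index over one contiguous substring is not enough.
    """
    if len(field_007) < 4 or field_007[0] != "s":
        return "not a sound recording"
    material = MATERIAL.get(field_007[1], "other")
    speed = SPEED.get(field_007[3], "unknown speed")
    if material == "disc" and speed == "1.4 m/s":
        return "compact disc"
    if material == "disc":
        return f"analog disc ({speed})"
    return material
```

Under these assumptions, a string beginning "sd f" would be labeled a compact disc and one beginning "sd b" an analog disc, so the two could be offered to users as separate limits or facets.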
An even more central challenge derives from the fact that music sound recordings, like journals and essay collections, contain within each item more than one work. Thus, for one of the central material formats collected by a music library (as well as by a public library or other academic branches), users routinely find themselves searching for a distinct subset of the item record. Perversely, though music catalogers do tend to include analytic added entries for the subparts of a CD recording or printed score, and major ILS vendors are learning to index them, AACR2 guidelines set arbitrary cutoff points of about fifteen tracks on a sound recording, and three performable units within a score.8 Subsets of essay collections and journal runs are routinely exposed to users' searches by indexing and abstracting services and major databases, but subsets of libraries' music collections depend upon catalogers to exploit the MARC records for user access.9
In light of these pervasive bibliographic relationships, catalogers of music (again, with parallels in other subjects) have developed a distinctive approach to the MARC metadata schema. In particular, they (with their colleagues in literature, fine arts, and theology) rely upon the 700t field for uniform work titles, and upon careful authority control.10 However, once again, many major ILS portals have spotty records in affording access to library collections via these data. Innovative Interfaces' Millennium, though it clearly leads other major library products in this market, frequently frustrates music librarians (it is, of course, not alone in doing so).11 Its automatic authority control feature works poorly with (necessary) music authority records.12 And even though Innovative has been one of the first vendors to add a database index to the 700t field, partly in response to concerns expressed to the company by the Music Librarians' User Group, Millennium apparently does not allow for an appropriate level of follow-through on searching.13 An initial search by name of a major composer, for instance, yields a huge and cluttered result set containing all indexed 700t fields.14 The results do helpfully include the appropriate see also references, but those references disappear in a subsidiary (limited) search. In addition, the subsidiary display inexplicably changes to an unhelpful arrangement of generic 245 fields ("Mozart, Symphonies"; "Mozart, Operas, Excerpts"). Similar challenges will be faced by other parts of an academic or large public library collection, including the literature collections (for works such as Shakespeare's plays), fine arts (for images and artists' works), and theology (for works whose uniform title is in Latin).
The OPAC interfaces of other major ILS vendors fare little better. The same search (for "Mozart") on the Emory University Library catalog (with an ILS by SirsiDynix) similarly yields a rich results set of more than one thousand records, and poses similar problems in refining the search.15 In the case of this OPAC, an index of 700t fields also exists, but it may be searched only from inside a single record; as with Millennium, SirsiDynix's interface will then group the next set of results confusingly by 245 fields. The Library Corporation's Carl-X apparently does not contain a 700t index; the simple "Mozart" search returns a much-simplified set of only 97 results organized by 245a fields, and thus offers a more concise set of results but avoids the most incisive index for audio-visual materials.16 Ex Libris offers a somewhat more helpful display of its more restricted results; unfortunately for the present comparison, though the detailed results set does list the "format" of all Mozart-authored items, the same term, "Music," is used for sound recordings, musical scores, and score excerpts, with no attempt logically to group the results around individual works.17 No 700t index appears to be present.
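The difference between grouping results by 245 title and grouping them by the name/title added entries discussed above can be sketched briefly in Python. The records below are made-up samples in a simplified dictionary form (not real MARC), and the function names are hypothetical; the sketch only illustrates the kind of index the 700t field makes possible, not any vendor's implementation.

```python
from collections import defaultdict

# Made-up sample records; "700" holds simplified name/title added entries
# with subfield a (name) and, where present, subfield t (uniform title).
RECORDS = [
    {"id": "rec1", "245": "Opera highlights",
     "700": [{"a": "Mozart, Wolfgang Amadeus", "t": "Zauberflöte"},
             {"a": "Puccini, Giacomo", "t": "Bohème"}]},
    {"id": "rec2", "245": "Die Zauberflöte",
     "700": [{"a": "Mozart, Wolfgang Amadeus", "t": "Zauberflöte"}]},
    {"id": "rec3", "245": "Mozart: a life",
     "700": [{"a": "Mozart, Wolfgang Amadeus"}]},  # name only, no title
]


def build_name_title_index(records):
    """Map (name, uniform title) to the ids of records containing that work."""
    index = defaultdict(list)
    for record in records:
        for entry in record.get("700", []):
            if "t" in entry:  # skip added entries that carry no title
                index[(entry["a"], entry["t"])].append(record["id"])
    return index


def works_for_name(index, name):
    """Group one name's results by uniform title rather than by 245 title."""
    return {title: ids
            for (entry_name, title), ids in sorted(index.items())
            if entry_name == name}
```

With such an index, a follow-up search on a composer's name could keep its results grouped around individual works, so that an anthology containing one aria collocates with a complete recording of the same opera.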

■ The FRBR paradigm: review of literature and theory
From the earliest library catalogs in the modern age, the tools of bibliographic organization have sought to afford users both access to the collection and collocation of related materials. Anglo-American cataloging practice has traditionally served the first function by main entries and alternate access points and the second function by classification systems. However, as knowledge increases in scope and complexity, the systems of bibliographic control have needed to evolve. As early as the 1950s, theories were developing that sought to distinguish between the intellectual content of a work and its often manifold physical embodiments.18 The 1961 Paris International Conference on Cataloging Principles first reified within the cataloging community a work-item distinction, though even the 1988 publication of the Anglo-American Cataloging Rules, 2nd ed., "continued to demonstrate confusion about the nature . . . of works."19
Meanwhile, extensive research into the nature of bibliographic relationships groped toward a consensus definition of the entity-types that could encompass such relationships.20 Ed O'Neill and Diane Vizine-Goetz examined some one hundred editions of Smollett's The Expedition of Humphry Clinker over a two-hundred-year span of publication history to propose a hierarchical set of definitions to define entity levels.21 The theoretical entities include the intellectual content of a work (which, in the case of audio-visual works, may not even exist in any printed format), the various versions, editions, and printings in which that intellectual content manifests itself, and the specific copies of each manifestation which a library may hold.22 Research has discovered such clusters of bibliographically related entities for as much as 50 percent or more of all the intellectual works in any given library catalog, and as many as 85 percent of the works in a music catalog.23 This work laid the foundation for FRBR (and, once again, incidentally underscored the breadth of its applicability to, and beyond, music catalogs).
The theoretical framework of FRBR is most concisely set forth in the Final Report of the IFLA study group. The long-awaited publication traces its genesis to the 1990 Stockholm Seminar, and the resultant 1992 founding of the IFLA Study Group on Functional Requirements for Bibliographic Records. The study group set out to develop:
a framework that identifies and clearly defines the entities of interest to users of bibliographic records, the attributes of each entity, and the types of relationships that operate between entities . . . a conceptual model that would serve as the basis for relating specific attributes and relationships . . . to the various tasks that users perform when consulting bibliographic records.
The study makes no a priori assumptions about the bibliographic record itself, either in terms of content or structure.24
In other words, the intention of the group's deliberations and the Final Report is to present a model for understanding bibliographic entities and the relationships between them to support information organization tools. It specifically adopts an approach that defines classes of entities based upon how users, rather than catalogers, approach bibliographic records or, by natural extension, any system of metadata.
The FRBR hierarchical entities comprise a fourfold set of definitions: work, expression, manifestation, and item.
In fact, LIS research has tended to demonstrate what music librarians have always understood: that relatedness among items and complexity of families is most prevalent in audio-visual collections. Even before the IFLA Report had been penned, Sherry Vellucci had set out the task: "To create new catalog structures that better serve the needs of the music user community, it is important first to understand the exact nature and complexity of the materials to be described in the catalog."26 Even limiting herself to musical scores alone (that is, no recordings or monographs), Vellucci found that more than 94.8 percent of her sample exhibited at least one bibliographic relationship with another entity in the collection; she further related this finding to the very "inherent nature of music, which requires performance for its aural realization," as opposed to, for example, monographic book printing.27 Vellucci and others have frequently commented on how the relatedness of manifestations (in different formats, arrangements, and abridgements) of musical works continues to be a problem for information retrieval in the world of music bibliography.28
Musical works have been variously and industriously described by musicologists and music bibliographers. Yet, in the information retrieval domain [and, I might add, under both AACR and AACR2] . . . systems for bibliographic information retrieval . . . have been designed with the document as the key entity, and works have been dismissed as too abstract . . .29
The work is the access point many users will bring (in their minds, and thus in their queries) to a system. They intend, however, to discover, identify, and obtain specific manifestations of that work. Very recently, research has begun to demonstrate that the FRBR model can offer specific advantages to music retrieval in cases such as these: "the description of bibliographic data in a FRBR-based database leads to less redundancy and a clearer presentation of the relationships which are implicit in the traditional databases found in libraries today."30 Explorations of the theory in view of the benefits to other disciplines, such as audio-visual and other graphic materials, maps, oral literature, and rare books, have appeared in the literature as well.31 The admitted weakness of the FRBR theory, of course, is that it remains a theory at its inception, with still precious few working applications.
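The four-level hierarchy can be sketched as a simple nested data structure. The Python classes below are a minimal illustration populated with part of the article's own La Bohème example; the class and attribute names are assumptions chosen for readability and are not drawn from the IFLA report or from any catalog system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    holder: str  # the library holding this single exemplar


@dataclass
class Manifestation:
    description: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    description: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    expressions: List[Expression] = field(default_factory=list)

    def holders(self) -> List[str]:
        """Walk down the hierarchy and list every library holding an item."""
        return [item.holder
                for expression in self.expressions
                for manifestation in expression.manifestations
                for item in manifestation.items]


# Part of the La Bohème example, entered by hand for illustration.
boheme = Work("Puccini, La Bohème", [
    Expression("Composer's complete musical score (1896)", [
        Manifestation("Score printed by Ricordi, 1897"),
    ]),
    Expression("Performance, October 1972", [
        Manifestation("Sound discs, London Records, 1972"),
        Manifestation("Compact disc, London Records, 1987", [
            Item("Columbus Metropolitan Library"),
            Item("University of Cincinnati"),
        ]),
    ]),
])
```

A user who arrives with only the work in mind can thus be led downward to the specific manifestations and items available; here, boheme.holders() lists the two libraries holding the compact disc.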

■ FRBR applications
Working implementations of FRBR in catalogs, OPACs, and ILSs are still relatively few but promise much for the future. The FRBR theoretical framework has remained an area of intense research at OCLC, which has even led to some prototype applications and, very recently, deployment in the WorldCat Local interface.32 A scattered few other researchers have crafted FRBR catalogs and catalog displays for their own ends; the Library of Congress has a prototype as well. Innovative, the leading academic ILS vendor, announced a FRBR feature for 2005 release, yet shelved the project for lack of a beta-testing partner library.33 Ex Libris' Primo discovery tool, one complete ILS (by Visionary Technologies for Library Systems, or VTLS), and the National Library of Australia have each deployed operational FRBR applications.34 The number of projects testifies to the high level of interest among the cataloging and information science communities, while the relatively small number of successful applications testifies to the difficulties faced.
OCLC has engaged in a number of research projects and prototypes in order to explore ways that FRBRization of bibliographic records could enhance information access. OCLC Research frequently notes the potential streamlining of library cataloging by FRBRization; in addition, they have experienced "superior presentation" and "more intuitive clustering" of search results when the model is incorporated into systems.35 Work-level definitions stand behind such OCLC Research prototypes as Audience Level, Dewey Browser, FictionFinder, xISBN, and Live Search. In every case, researchers determined that, though it was very difficult to automate any identification of expressions, application of work-level categories both simplifies and improves search result sets.36 An algorithm common to several of these applications is freely available as an open source application, and now as a public interface option in OCLC's WorldCat Local.37 The algorithm creates an author/title key to cluster worksets (often at a higher level than the FRBR work, as in the case of the two distinct works that are the book and screenplay for Gone with the Wind). In the public search interface, the results sets may be grouped at the work level; users may then execute a more granular search for "all editions," an option that then displays the group of expressions linked to the work record. Unfortunately, as the software does not use 700t fields (its intention is to travel up the entity hierarchy, and it uses the 1xx, 24x, and 130 fields), its usefulness in solving the above challenges may not be immediate. A somewhat similar application (though Merrilee Proffitt declares it not to be a FRBR product) was RedLightGreen, a user interface for the ex-RLG union catalog based upon quasi-FRBR clustering.38
The reports from designers of other automated systems offer interesting commentaries on the process. The team building an automatically FRBRized database and user interface for AustLit (a new union collection of Australian literature among eight academic libraries and the National Library of Australia) acknowledged some difficulty with non-monographic works such as poems, though the majority of their database consisted of simpler work-manifestation pairs.39 Based on strongly positive user feedback ("The presentation of information about related works [is] both useful and comprehensible"), a similar application was attempted on the Australian national music gateway MusicAustralia; it is unclear whether the project was shelved due to difficulties in automating the FRBRization process.40
One recent application created for the Perseus Digital Library adopts a somewhat different approach.41 Rather than altering previously created MARC records to allow hierarchical relationships to surface, this team created new records using crosswalks between MARC and, for instance, MODS, for work-level records. They claim some moderate level of success, though once again their discussion of the process is more illuminating than their product. Mimno and Crane successfully allowed a single manifestation-level record to link upwards to many expressions, a necessary analytic feature especially for dealing with sound recordings. They did practically demonstrate the difficulty of searching elements from different levels of the hierarchy at the same time (such as work title and translator), a complication predicted by Yee.42
Three ILS vendors have released products that use the FRBR model: Portia (VisualCat), Ex Libris (Primo), and VTLS (Virtua).43 The first product, a cataloging utility from a smaller player in the vendor market, claims to incorporate FRBR into its metadata capture, yet the information available does not explain how, nor does the vendor offer an OPAC to exploit it. The 2007 release of Ex Libris' Primo offers what the company calls "FRBR groupings" of results.44 This discovery tool is not itself an ILS, but promises to interoperate with major existing ILS products to consolidate search results. It remains unclear at this time how Ex Libris' "standard FRBR algorithms" actually group records; the single deployment in the Danish Royal Library allows searching for more records with the same title, for instance, but does not distinguish between translations of the same work.45 VTLS, on the other hand, has since 2004 offered a complete product that has the potential to modify existing MARC records, via local linking tags in the 001 and 004 fields, to create FRBR relationships.46 Their own studies agreed with OCLC that a subset, roughly 18 percent, of existing catalog records (most heavily concentrated in music collections) would benefit from the process, and they thus allow for "mixed" catalogs, with only subsets (or even individually selected records) to be FRBRized. The company's own information suggests relatively simple implementation by library catalogers, coupled with robust functionality for users, and may be the leading edge of the next generation of catalog products.
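The idea of clustering records by an author/title key, as described above for the OCLC work-set algorithm, can be illustrated with a short Python sketch. This is not OCLC's code: the normalization rules, the key format, and the record layout are all assumptions made for the example, and a production algorithm would draw on authority data and many more fields.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text: str) -> str:
    """Strip accents, case, and punctuation so trivial variants match."""
    decomposed = unicodedata.normalize("NFKD", text)
    unaccented = "".join(ch for ch in decomposed
                         if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", unaccented.casefold()).strip()


def work_key(author: str, title: str) -> str:
    """Build a hypothetical author/title clustering key."""
    return f"{normalize(author)}/{normalize(title)}"


def cluster(records):
    """Group record ids under their shared author/title key."""
    clusters = defaultdict(list)
    for record in records:
        key = work_key(record["author"], record["title"])
        clusters[key].append(record["id"])
    return dict(clusters)


# Made-up sample records for illustration.
SAMPLE = [
    {"id": "a1", "author": "Mitchell, Margaret", "title": "Gone with the wind"},
    {"id": "a2", "author": "Mitchell, Margaret.", "title": "Gone With the Wind."},
    {"id": "b1", "author": "Puccini, Giacomo", "title": "La Bohème"},
]
```

Because the key looks only at author and title, any two records sharing both will fall into one set regardless of whether they are the same FRBR work, which is one way to see why such clusters can sit at a higher level than the work itself.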

■ FRBR solutions
The IFLA Study Group, following its user-centered approach, set out a list of specific tasks that users of a computer-aided catalog should be able to accomplish:
■ to find all manifestations embodying certain criteria, or to find a specific manifestation given identifying information about it;
■ to identify a work, and to identify expressions and manifestations of that work;
■ to select among works, among expressions, and among manifestations; and
■ to obtain a particular manifestation once selected.
It seems clear that the FRBR model offers a framework of relationships that can aid each task. Unfortunately, none of the currently available commercial solutions may be in themselves completely applicable for a single library. The OCLC Work-set Algorithm is open source, as well as easily available through WorldCat Local, but it only works to create super-work records; it also ignores the 700t field so crucial to many of the issues noted above. None of the other home-grown applications may have code available to an institution. The Virtua module from VTLS offers a very tempting solution, but may require a change of vendor.47
Either adapting one of these solutions or designing a local application, then, raises the question: What would the ideal system entail? Catalog FRBRization will transpire in two segments: enhancing the existing catalog with bibliographic relationships that can surface in the retrieval phase, and designing or adapting a new interface and display to reflect the relationships.48 The first task may prove the more formidable, due to the size of even a modest catalog database and the difficulties often observed in automating such a task; while the librarians constructing the AustLit system found a relatively high percentage of records could be transferred en masse, the OCLC Research team had difficulty automatically pinpointing expressions from current MARC records.49
Despite current technology trends toward users' application of tags, reviews, and other metadata, a task as specialized as adding bibliographic relationships to the catalog demands specialized cataloging professionals.50 The best approach within a current library structure may be to create a single new position to head the project and to act as liaison with cataloging staff in the various branches and with vendor staff, if applicable. Each library branch may judge on its own the proportions of records to FRBRize, beginning with high-traffic works and authors, those for whom search results tend to be the most overwhelming and confusing to users. Each branch can be responsible for allocation of cataloging staff effort to the process, and will thus have specialist oversight of subsets of the database.
Three technical solutions to actually changing the database structure have been attempted in the literature to date: incrementally improving the existing MARC records to better reflect bibliographic relationships, adding local linking tags, and simply creating new metadata schemas. The VTLS solution of adding local linking tags seems most appropriate; relationships between records are created and maintained via unique identifiers and linking statements in the 001 and 004 fields.51 OCLC's open source software could expedite the creation of work-level records, and the creation of expression-level records will be made easier by the large amount of bibliographic information already present in the current catalog. Wherever possible, cataloging staff also should take the opportunity to verify or create links to authority files so as to enhance retrieval.52
Creating a new catalog display option could be accomplished via additions to current OPAC coding, either by adopting WorldCat Local or by designing parts of a new local interface. It need not even require a complete revision; the single site (UCL) currently deploying VTLS' FRBRized interface maintains a mixed catalog and offers, once again, a highly intuitive model.53 When a searcher comes across a bibliographic record for which FRBR linking is available, they may click a link to open a new display screen. We should strive, however, to use simple interface statements such as "View all different kinds of holdings," "This work has x editions, in y languages," or "This version of the work has been published z times" (both the OCLC prototype and the AustLit Gateway offer such helpful and user-friendly statements). Though the foundational work of both Tillett and Smiraglia focused upon taxonomies of relationships, the hierarchical structure of the IFLA proposal should remain at the forefront of the display, with a secondary organization by type of relationship or type of entity. Rather than adopting a design which automatically refreshes at each click, a tree organization of the display should be more user-friendly, allowing users to maintain a visual sense of the organization that they are encountering (see Appendix for screenshots of this type of tree display).54 Format information should be included in the display, as an indication of a user's primary category, as well as a distinction among expressions of a work.
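The two halves of this proposal, linking records through identifiers and then displaying them as a tree, can be sketched together in Python. The sketch assumes, purely for illustration, that each record carries its own identifier under "001" and the identifier of its parent record under "004"; the record layout, the "level" and "label" keys, and the function names are hypothetical and do not represent the VTLS implementation.

```python
from collections import defaultdict

# Made-up linked records: "001" is the record's own identifier and
# "004" points to the parent record (None for a work-level record).
LINKED = [
    {"001": "w1", "004": None, "level": "work",
     "label": "Puccini, La Bohème"},
    {"001": "e1", "004": "w1", "level": "expression",
     "label": "Performance, October 1972"},
    {"001": "m1", "004": "e1", "level": "manifestation",
     "label": "Sound discs, London Records, 1972"},
    {"001": "m2", "004": "e1", "level": "manifestation",
     "label": "Compact disc, London Records, 1987"},
    {"001": "i1", "004": "m2", "level": "item",
     "label": "Copy held by the University of Cincinnati"},
]


def render_tree(records, root_id):
    """Return indented display lines for a record and all its descendants."""
    by_id = {record["001"]: record for record in records}
    children = defaultdict(list)
    for record in records:
        if record["004"] is not None:
            children[record["004"]].append(record["001"])

    lines = []

    def walk(record_id, depth):
        record = by_id[record_id]
        prefix = "  " * depth
        lines.append(f"{prefix}{record['level'].capitalize()}: {record['label']}")
        for child_id in children.get(record_id, []):
            walk(child_id, depth + 1)

    walk(root_id, 0)
    return lines
```

Printing the returned lines one per row gives an indented outline in which each level stays visible as the user moves down it; records with no links simply never appear in such a tree, which is compatible with a mixed catalog in which only some records are FRBRized.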
With these changes, the library catalog will begin to afford its users better access to many of its core collections. FRBRization of even part of the catalog (concentrating on high-incidence authors, as identified by subject specialists) will allow it better to reflect, and collocate, items within the families of bibliographic relationships that have been acknowledged as part of library collections for decades. This increased collocation will begin to counteract the pitfalls of mere keyword searching on the part of users, especially in conjunction with renewed authority work. Finally, FRBR offers a display option in a revamped OPAC that is at the same time simpler than current result lists, and more elegant in its reflection of relatedness among items. Each feature should better enable the users of our catalog to find, select, and obtain appropriate resources, and will bring our libraries into the next generation of cataloging practice.
■ Work: "a distinct intellectual or artistic creation";
■ Expression: "the intellectual or artistic realization of a work" in any combination of forms (including editions, arrangements, adaptations, translations, performances, etc.);
■ Manifestation: "the physical embodiment of an expression of a work"; and
■ Item: "a single exemplar of a manifestation."25
Examples of these hierarchical levels abound in the bibliographic universe, but frequently music offers the quickest examples:
■ Work: Mozart's Die Zauberflöte (The Magic Flute)
■ Work: Puccini's La Bohème
  ■ Expression: The composer's complete musical score (1896)
    ■ Manifestation: Edition of the score printed by Ricordi in 1897
  ■ Expression: An English language edition for piano and voices
  ■ Expression: A performance by Mirella Freni, Luciano Pavarotti, and the Berlin Philharmonic Orchestra (October 1972)
    ■ Manifestation: A recording of this performance released on 33⅓ RPM sound discs in 1972 by London Records
    ■ Manifestation: A re-release of the same performance on compact disc in 1987 by London Records
      ■ Item: The copy of the compact disc held by the Columbus Metropolitan Library
      ■ Item: The copy of the compact disc held by the University of Cincinnati

2. This paper began as a graduate research assignment for LIS 60640 (Library Automation), in the Kent State University MLIS program, March 19, 2007. My thanks to Jennifer Hambrick, Nancy Lensenmayer, and Joan Lippincott, for their helpful comments on earlier drafts. The curricular assignment asked for a library automation proposal in a specific library setting; the original review contained a set of recommendations concerning FRBR through the lens of a (fictional) medium-sized academic library system, that of St. Hildegard of Bingen Catholic University. As will be noted below, the branch Music Library typically serves a small population of music majors (graduate and undergraduate) within such an institution, but also a large portion of the student body that use the library's collection to support their music coursework and arts distribution requirements. Any music library's proportion of the overall system's holdings may be relatively small, but will include materials in a diverse set of formats: monographs, serials, musical scores, sound recordings in several formats (cassette tapes, LPs, CDs, and streaming audio files), and a growing collection of video recordings, likewise in several formats (VHS, laser discs, and DVD). It thus offers an early test case for difficulties with an automated library system.
5. The OPAC for the University of Huddersfield Library system famously first deployed a search option for related items ("Did you mean . . .?"); http://www.hud.ac.uk/cls (accessed July 10, 2007). FRBR not only offers the related item search, but also logically groups related works throughout the library catalog.
6. Allyson Carlyle demonstrated empirically that users value an object's format as one of the first distinguishing features: "User Categorization of Works: Toward Improved Organization of Online Catalog Displays," Journal of Documentation 55, no. 2 (Mar. 1999): 184-208 at 197.
7. Millennium will feature heavily in the following discussion, both because of its position leading the academic library automation market (being adopted wholesale by, for instance, the Ohio statewide academic library consortium), and because it was the subject of the original paper.
8. See Alastair Boyd, "The Worst of Both Worlds: How Old Rules and New Interfaces Hinder Access to Music," CAML Review 33, no. 3 (Nov. 2005), http://www.yorku.ca/caml/review/33-3/both_worlds.htm (accessed Mar. 12, 2007); Michael Gorman and Paul W. Winkler, eds., Anglo-American Cataloging Rules, 2nd ed. (Chicago: ALA, 1988).
9. In the past few years, a small subset of the search literature has described technical efforts to develop search engines that can query by musical example; see J. Stephen Downie, "The Scientific Evaluation of Music Information Retrieval Systems: Foundations and Future," Computer Music Journal 28, no. 2 (Summer 2004): 12-23. A company called Melodis Corporation has recently announced a successful launch of a query-by-humming search engine, though a verdict from the music community remains out; http://www.midomi.com (accessed Jan. 31, 2007).
10. See Vellucci, "Music Metadata and Authority Control in an International Context"; Richard P. Smiraglia, "Uniform Titles for Music: An Exercise in Collocating Works," Cataloging and Classification Quarterly 9, no. 3 (1989): 97-114; Steven H. Wright, "Music Librarianship at the Turn of the Century: Technology," Notes-Quarterly Journal of the Music Library Association 56, no. 3 (Mar. 2000): 591-97. Each author builds upon the foundational work of Barbara Tillett, "Bibliographic Relationships: Toward a Conceptual Structure of Bibliographic Information Used in Cataloging" (Ph.D. diss., University of California at Los Angeles, 1987).
11. "At conferences, [my colleagues] are always groaning if they are a Voyager client," interview with an academic music librarian by the author, Feb. 9, 2007.
12. Several prominent music librarians only discovered that Innovative's system had such a feature when instances of the automatic system's changing carefully crafted music authority records were discovered; Mark Sharff (Washington University in St. Louis) and Deborah Pierce (University of Washington), postings to Innovative Music Users' Group electronic discussion list, Oct. 6, 2006, archive accessed Feb. 1, 2007.
13. Music librarians are the only subset of the Millennium users to have formed their own Innovative Users' Group. SirsiDynix has a separate Users' Group for STM librarians, and Ex Libris hosts a Law Librarians' Users' Group, two other groups whose interaction with the ILS poses discipline-specific challenges.
16. Searches performed on the library of Oklahoma State University, http://www.library.okstate.edu (accessed June 27, 2007); TLC has considered making FRBRization a possible feature of their product. They offer some concatenation of "intellectually similar bibliographic records," and "TLC continues to monitor emerging FRBR standards"; Don Kaiser, personal communication to the author, July 8, 2007. I was unable to reach representatives of SirsiDynix on this issue.
17. Searches performed on the MIT Library catalog, powered by ALEPH 500, http://libraries.mit.edu (accessed June 27, 2007).