REGIONAL NUMERICAL UNION CATALOG ON COMPUTER OUTPUT MICROFICHE

A union catalog of 1,100,000 books on computer output microfiche (COM) in twenty-one Louisiana libraries is described. The catalog, called LNR for Louisiana Numerical Register, consists not of bibliographic information, but primarily of the LC card number and letter codes for the libraries holding the book. The computer programs, the data bank, and output are described. The programs provide the capability for listing over two million entries. Also described are the statistical tabulations which are a by-product of the system and which provide a rich source for analysis.

an older bibliographic Louisiana Union Catalog. All books listed in the Register are those having a Library of Congress (LC) card number; indeed the LC card number is the entry. The term "numerical" was chosen because we anticipate using other numbers besides the LC number, e.g., the Mansell number and the International Standard Book Number (ISBN).
The LC card number is the most widely used book number we now have. This fact is put to good use by the Library of Congress in its own NUC-Register of Additional Locations. There are other LC number indexes, but they are not union catalogs. (The Mansell number, of course, will be very useful when publication of the NUC-Pre-1956 Imprints is complete.) Many more titles can be represented on a page by number codes than by complete bibliographic data, at a ratio of perhaps 600 to 9. Unit costs are, therefore, much less. The first edition (1971), containing 550,000 volumes, was produced for an estimated total cost of $22,600 ($8,600 grant plus $14,000 absorbed). One hundred copies of the Register were printed in hard copy form, with approximate overall unit costs for keypunching, computer, travel, salaries, and printing as follows:

                      In terms of actual         In terms of total funds
                      expenditures               (grant funds expended
                      (grant funds)              plus absorbed)
Per title entry       2.5¢                       6.0¢
Per volume entry      1.6¢                       3.8¢

The second edition (November 1972) contains over 1,100,000 volumes and, in terms of the second grant, was produced on computer output microfiche for an estimated total cost of $31,200, i.e., $10,000 grant plus $21,200 absorbed. (Reproduction costs for the COM are negligible. For an original copy of 5 fiche, containing all 1,100,000 volumes, we were charged $25 by a commercial firm, and for extra copies, $3 each. Copies for distribution will be sold at a slightly higher price.) Unit costs for the COM edition are:

                      In terms of actual         In terms of total funds
                      expenditures               (second grant expenditures
                      (second grant funds)       plus absorbed)
Per title entry       1.8¢                       5.6¢
Per volume entry      .9¢                        2.8¢

Unit costs computed on the basis of total costs to date suggest that they remain relatively constant from cumulation to cumulation.

The concept of a numerical register is not new. The idea was discussed at length in a proposal by Harry Dewey (1) almost a generation ago, in which he espoused all the essential ideas, and again in 1965 by Louis Schreiber (2). Both argued that if the bibliographic data including the LC card number were already in hand, one could then merely look up the number in a numerical union catalog to determine a location. Goldstein and others (3) have also studied what they called the "Schreiber catalog" and have produced a sample computer printout of LC numbers. Computer output microfiche, on the other hand, was not anticipated in the original concept. It has made reproduction and distribution cheap, fast, and eminently feasible. The history of the Register and its rationale have been discussed more fully by McGrath (4).

PROGRAMS COMPRISING THE UNION CATALOG SYSTEM
The Union Catalog data record is shown in Table 1. The first three fields together make up the familiar LC card number, and the fourth is the library location.
(1) Alpha series prefix: this data field may contain from 1 to 4 alphabetic characters denoting a special series.
(2) Numeric series prefix: this data field may contain 1 or 2 digits.
(3) Serial number: this data field may contain up to 6 numeric digits.
(4) Alphabetic library designation code: this field contains a preassigned alphabetic code (up to 26) designating the participating library.
The three programs which use this data record and comprise the Union Catalog System are shown in Figure 1 and described below.

LNREDT PROGRAM
LNREDT is an editing program which examines all card input data to determine whether they are acceptable or not.
Each data field as shown above is examined as follows:
Field 1 for the presence and rejection of nonalphabetic characters, and also to determine whether the alphabetic code is a member of the accepted set of codes obtained from the Library of Congress;
Fields 2 and 3 for the presence and rejection of nonnumeric characters;
Field 4 to determine whether it is alphabetic.
After all fields have been checked, accepted records are transferred to a magnetic tape file for subsequent use; rejected data records are printed and visually scanned for the source of error.
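The field checks described for LNREDT can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the original program: the names `LnrRecord`, `edit_record`, and `ACCEPTED_ALPHA_PREFIXES` are hypothetical, the prefix set is a placeholder because the Library of Congress list is not reproduced in the text, Field 1 is assumed to be blank for ordinary numbers, and Field 4 is assumed to be a single letter.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Placeholder for the accepted set of alphabetic series prefixes obtained from
# the Library of Congress. The actual list is not given in the text, so these
# values are illustrative assumptions only. "" stands for "no alpha prefix".
ACCEPTED_ALPHA_PREFIXES = {"", "A", "AC", "MAP"}


class LnrRecord(NamedTuple):
    alpha_prefix: str    # Field 1: special series letters (assumed optional)
    numeric_prefix: int  # Field 2: 1 or 2 digits
    serial: int          # Field 3: up to 6 digits
    library: str         # Field 4: one-letter library code (assumed)


def edit_record(alpha: str, numeric: str, serial: str,
                library: str) -> Tuple[Optional[LnrRecord], Optional[str]]:
    """Return (record, None) if every field passes, else (None, reason)."""
    alpha, numeric, serial, library = (
        s.strip().upper() for s in (alpha, numeric, serial, library))
    if not re.fullmatch(r"[A-Z]{0,4}", alpha):
        return None, "field 1: nonalphabetic character or too long"
    if alpha not in ACCEPTED_ALPHA_PREFIXES:
        return None, "field 1: prefix not in accepted set"
    if not re.fullmatch(r"[0-9]{1,2}", numeric):
        return None, "field 2: must be 1 or 2 digits"
    if not re.fullmatch(r"[0-9]{1,6}", serial):
        return None, "field 3: must be 1 to 6 digits"
    if not re.fullmatch(r"[A-Z]", library):
        return None, "field 4: must be a single letter"
    return LnrRecord(alpha, int(numeric), int(serial), library), None


# Accepted records go to one list (the tape file in the original system);
# rejected cards are kept with a reason so they can be inspected by eye.
accepted, rejected = [], []
for card in [("", "72", "123456", "A"), ("", "7X", "12", "B"), ("QQ", "68", "9", "Z")]:
    record, reason = edit_record(*card)
    if record is not None:
        accepted.append(record)
    else:
        rejected.append((card, reason))
```

Returning a reason with each rejected card mirrors the printed reject listing that is scanned for the source of error.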

LNRSRT PROGRAM
LNRSRT sorts all records on the above-mentioned tape file. The major sort key is the numeric prefix, Field 2. The minor sort keys, in order of the sort sequence, are: Field 1, the alphabetic special series indicator; Field 3, the book serial number; Field 4, the library code designation.
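The sort sequence can be expressed as a composite key. The sketch below assumes records are already parsed into a small tuple type; `LnrRecord` and `lnr_sort_key` are hypothetical names, and the sample values (including the alpha prefix "A") are made up for illustration.

```python
from typing import NamedTuple


class LnrRecord(NamedTuple):
    alpha_prefix: str    # Field 1
    numeric_prefix: int  # Field 2
    serial: int          # Field 3
    library: str         # Field 4


def lnr_sort_key(rec: LnrRecord):
    # Major key: Field 2. Minor keys in order: Field 1, Field 3, Field 4.
    return (rec.numeric_prefix, rec.alpha_prefix, rec.serial, rec.library)


records = [
    LnrRecord("", 72, 500, "B"),
    LnrRecord("", 68, 9000, "Z"),
    LnrRecord("A", 68, 10, "A"),
    LnrRecord("", 72, 500, "A"),
    LnrRecord("", 68, 120, "C"),
]
sorted_records = sorted(records, key=lnr_sort_key)
```

After sorting, all holdings of one LC number are adjacent and their library codes are in alphabetical order, which is what a single sequential pass over the tape needs.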

LNRLST PROGRAM
LNRLST is the main program which uses the sorted data tape to:
a. create a single record for each unique LC number containing the library code designation of each library having this particular book;
b. produce a listing of the above records in LC card number order;
c. generate records of unique titles in combinations of libraries owning the titles;
d. enter into a memory matrix the combinations of libraries created in part (c) and count them: each time a combination is encountered, the matrix is searched for a match; if a match is found, the corresponding matrix position is incremented by one; if no match is found, a new matrix position is created with the new combination and the corresponding count initialized to one; this routine also provides for a total count of each library's contributions plus a grand total of all libraries' contributions;
e. tabulate, from the data compiled in (d) above, several elaborate tables of summary statistics; these statistics are described later in this paper.
The number of libraries the program LNRLST can accommodate is a variable and is entered as an execution-time parameter along with the library names and code designations. The main program occupies approximately 150,000 bytes of core memory.
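Steps (a) through (d) amount to one pass over the sorted file. The following sketch is an assumption-laden illustration rather than the original routine: it uses a Python `Counter` (a hash table) in place of the searched memory matrix, the function name `cumulate` is hypothetical, and the LC numbers in the sample are invented.

```python
from collections import Counter
from itertools import groupby


def cumulate(sorted_records):
    """sorted_records: (lc_number, library_code) pairs sorted by lc_number."""
    register = []            # steps (a)/(b): one entry per unique LC number
    combos = Counter()       # steps (c)/(d): titles per exact combination
    per_library = Counter()  # step (d): each library's contribution
    for lc_number, group in groupby(sorted_records, key=lambda r: r[0]):
        # Collect the distinct library codes for this number, alphabetically.
        libs = "".join(sorted({lib for _, lib in group}))
        register.append((lc_number, libs))
        combos[libs] += 1         # new combinations start at one automatically
        per_library.update(libs)  # one count per holding library
    return register, combos, per_library


sample = [
    ((68, "", 120), "A"), ((68, "", 120), "B"),
    ((68, "", 9000), "Z"),
    ((72, "", 500), "A"), ((72, "", 500), "B"),
    ((72, "", 777), "A"), ((72, "", 777), "B"), ((72, "", 777), "Z"),
]
register, combos, per_library = cumulate(sample)
grand_total = sum(per_library.values())
```

Because `groupby` only merges adjacent items, the input must already be in the sorted order produced by the preceding program.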

THE OUTPUT
A sample of the Register entries appears in Figure 2. A simple one-letter designation was used to identify each library, rather than the usual National Union Catalog (NUC) designation, in order to save space in the printout. These letters appear alphabetically to the right of each LC number. A typical page of the Register contains ten columns of up to six-digit LC numbers, with the two-digit series number appearing only once at the beginning of each series. Thus each page contains about 600 LC numbers. The latest cumulation of 1,100,000 volumes (560,000 LC numbers) consists of nearly 1,000 pages. The entire output was produced on five pieces of fiche directly from the cumulated tape. The COM program was written by the commercial firm which contracted to run it.
The computer output microfiche was issued on five 4x6 pieces in 42X. Each piece contains 208 frames and each frame contains an average of 1,126 volumes and 573 titles. The data can be produced on 24X fiche as well as roll film.

STATISTICAL SUMMARY
The large samples of holdings (from an initial 5,000 volumes, through successive cumulations to 90,000 and, the most recent, 1,100,000) provide an excellent data base for statistical analysis. We believe the samples may be the largest title by title comparison of monographs ever tabulated in this format. Very little analysis is presented in this paper, but the data base and its format will be explained. Even without analysis, many interesting observations can be made. Most of the tabulations are designed to throw light on the various aspects of the overlap problem, since a decisive factor in determining the utility of the Register is a knowledge of the number of titles held in common by all the libraries. Over the years there has been continuing interest in overlap.
Probably the first and most elaborate of the early studies was by Leroy Merritt (5), and one of the most recent by Leonard, Maier, and Dougherty (6). Continuing interest is expressed in such proclamations as that by Ellsworth Mason, where he claims that materials are "being acquired in duplications that are rather staggering across the country" (7).
The following statistics were tabulated from input for current acquisitions, the most recent being a total of 90,302 volumes, rather than the retrospective and current totals in the production runs. The 90,302 volumes were acquired for the most part during the two-year period, fall 1969 to fall 1971. The statistics show holdings for sixteen libraries.

THE BASIC TABULATION-TITLES HELD IN COMMON BY UNIQUE COMBINATIONS OF LIBRARIES
The basic tabulation, sections of which are shown in Table 2, actually fills seven pages of computer printout. The tabulation is designed so that each unique and actual combination of libraries is separately listed, and the books held by each combination are counted. Thus, in the table, although the total number of books held in common by Libraries A and B is 127, the number held in common by them and no other library is only 52. The number of books held by Libraries A, B, and Z, and no other library, is 18. None of these 18 is included in the count of 52, and none of the 52 in the 18. They are mutually exclusive. But the 18, plus the 52, plus the small counts in each of the other combinations in which A and B share holdings, is 127. The percentage of common holdings for each combination is also given except when the percentage is less than .01. Thus libraries A and B have .48 percent in common of their total combined holdings of 10,688 volumes.
It is interesting to note that of the 65,535 possible combinations, in only 444 combinations did the percentage of common holdings exceed .01 percent, and in only 8 did the percentage exceed 1 percent. Of these, the highest is 5.43 percent (A and Z). This 5.43 percent means that 678 of A and Z's common holdings were held by no other library. The total of A and Z's common holdings that were also held by other libraries is 1,315, or about 10.5 percent of 12,470. Again this is the highest percentage of any combination.
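The relation between the mutually exclusive counts and the overall count for a pair of libraries can be illustrated with a short sketch. The function name `held_in_common` is hypothetical and the counts below are made-up toy values, not figures from Table 2.

```python
def held_in_common(combo_counts, libraries):
    """Titles held by every library in `libraries`, whether or not other
    libraries also hold them: the sum of the exclusive counts of every
    combination that contains all of the named libraries."""
    wanted = set(libraries)
    return sum(n for combo, n in combo_counts.items() if wanted <= set(combo))


# Toy exclusive counts: key = exact set of holding libraries, value = titles.
toy_counts = {"A": 30, "B": 20, "AB": 5, "ABZ": 2, "ABC": 1, "AZ": 4}

only_a_and_b = toy_counts["AB"]               # held by A and B and no other
all_a_and_b = held_in_common(toy_counts, "AB")  # 5 + 2 + 1
```

In the toy data the exclusive count for A and B is 5, while the inclusive count is 8, in the same way that 52 and 127 are related in the text.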

Summary of Titles Held in Common
The basic tabulation of titles held in common is summarized in Table 3. Column 1 is the number of libraries, from 1 to 16, in each combination. Column 2 is the total number of titles counted in all combinations. For example, 59,907 titles exist in unique copy, thus there were only 59,907 copies (column 3), but there were only 8 titles which as many as 9 libraries held, for a total of 72 copies (column 3).
Column 4 shows that all 16 libraries contributed unique titles and that there were 117 different combinations of two libraries, out of a possible 120 (column 5). Thus there were 3 combinations of 2 libraries which had no titles in common. It is also most interesting that there were only 7 combinations of 9 libraries out of a possible 11,440, and no combinations of 10 or larger.
Summing the binomial coefficients for combinations of 1 to 16 libraries gives 2^16 - 1 = 65,535 theoretical ways that 16 libraries can combine (total, column 5), whereas, in this sample, only 1,198 combinations occurred (total, column 4).
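The possible-combination figures quoted for column 5 can be checked directly with binomial coefficients. This is a small verification sketch; the variable names are illustrative.

```python
from math import comb

n_libraries = 16

# Number of ways to choose k libraries out of 16, for k = 1 .. 16.
possible = {k: comb(n_libraries, k) for k in range(1, n_libraries + 1)}

pairs = possible[2]               # combinations of two libraries
nines = possible[9]               # combinations of nine libraries
total = sum(possible.values())    # every non-empty combination
```

The total equals 2**16 - 1 because each library is either in or out of a combination, and the empty combination is excluded.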
Column 6 is the result of column 2 divided by column 4. Thus 3,744.19 (59,907 divided by 16) is the average number of unique titles contributed by each library, 74.92 is the average number held by any combination of 2 libraries, and 6.89 is the average held by any combination of 3.
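The six columns of the summary can be derived from the exclusive combination counts alone. The sketch below shows one way to do so; `summarize` is a hypothetical name and the input is toy data for three libraries, not the counts behind Table 3.

```python
from collections import defaultdict
from math import comb


def summarize(combo_counts, n_libraries):
    """Rows of (libraries in combination, titles, copies, actual
    combinations, possible combinations, average titles per combination)."""
    titles = defaultdict(int)
    actual = defaultdict(int)
    for combo, n in combo_counts.items():
        k = len(combo)   # column 1: size of the combination
        titles[k] += n   # column 2: titles held by exactly k libraries
        actual[k] += 1   # column 4: distinct combinations that occurred
    return [
        (k, titles[k], k * titles[k], actual[k],
         comb(n_libraries, k), titles[k] / actual[k])
        for k in sorted(titles)
    ]


toy_counts = {"A": 30, "B": 20, "Z": 10, "AB": 5, "AZ": 4, "ABZ": 2}
rows = summarize(toy_counts, n_libraries=3)
```

Column 3 is column 2 multiplied by the combination size, since a title held by k libraries accounts for k copies.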

SUMMARY OF EACH LIBRARY'S MULTIPLICATED TITLES
The administrators of each library are especially interested to know how many of their own titles are also held by other libraries. This information for total input (i.e., for titles with LC prefixes from 1900 to the present) is given in Table 4. (Tables were also produced giving the same kind of information by decade and for the last two years, but are not reproduced here.) The column labels are self-explanatory, but it may be observed that the total in column 5, 30,395, equals the difference between the total copies, 90,302 (column 3, Table 3), and the number of titles held by one library only, 59,907 (columns 2 and 3, Table 3).
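The identity noted above (multiplicated copies equal total copies minus uniquely held titles) can be illustrated with a sketch that counts, for each library, its titles that at least one other library also holds. The function name is hypothetical and the counts are toy values, not those of Table 4.

```python
from collections import Counter


def multiplicated_by_library(combo_counts):
    """For each library, the number of its titles also held elsewhere."""
    out = Counter()
    for combo, n in combo_counts.items():
        if len(combo) > 1:       # skip titles held by one library only
            for lib in combo:
                out[lib] += n
    return out


toy_counts = {"A": 30, "B": 20, "Z": 10, "AB": 5, "AZ": 4, "ABZ": 2}
shared = multiplicated_by_library(toy_counts)

total_copies = sum(len(c) * n for c, n in toy_counts.items())
unique_titles = sum(n for c, n in toy_counts.items() if len(c) == 1)
```

In the toy data the per-library figures sum to 24, which is the 84 total copies less the 60 uniquely held titles, the same relation as 90,302 less 59,907 giving 30,395.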

DISTRIBUTION OF TITLES PUBLISHED AND MULTIPLICATED BY DECADE
Table 5 shows that the very largest overlap, in current acquisitions, occurs among books with recent imprints. This is to be expected, since these figures do not compare older books recently acquired by one library with those already in another library, and since older books are acquired from a much larger universe than current books.

OTHER SUMMARY STATISTICS
The foregoing tables illustrate the kind of tabulations that can be made with this type of data. More detailed tables can be compiled, and indeed were, e.g., tables giving the percentage of books acquired for each year and each decade for each library, with ten-year totals and averages. Other possibilities would be frequency distributions and summaries for clusters of similar libraries.
This material awaits analysis. We believe it contains many heretofore unsuspected insights.

FUTURE PLANS
Since the data can be updated so readily, plans are being made to provide funds for the extraction and keypunching of LC numbers in the remaining retrospective collections of the participating libraries. These libraries contain an estimated total of two million volumes. Succeeding cumulations will be readily produced on COM. Most of the cost has been for extracting retrospective numbers from card catalogs. Once the remaining retrospective collections are cumulated, costs for cumulating current input will be negligible.
Any final catalog, of course, can never list complete holdings, since each library has many titles without LC numbers. Those titles could be listed in more conventional form. Since they are in a minority, the expense would be far more reasonable than it would be to reproduce entire holdings in conventional form.
We have said nothing about other aspects of the project. In committee discussions, however, much has been said about the feasibility of using the LC card number to access the information in other major projects such as MARC, and possibly even the data bank in the Ohio College Library Center. Technically, it is feasible to print a conventional bibliographic catalog by matching up our LC numbers with titles listed in the current MARC tapes; pragmatically and economically, of course, it is another matter.
Other possibilities are the printing of a list of specialized holdings by accessing the subject headings on the MARC tapes, assignment of specialized acquisitions, and the gathering of information which might affect development of a joint processing center.

Fig. 1. Flow Chart of the Programs Comprising the Register System.

Table 1. The Data Record

Table 2. Titles Held in Common by Each Unique Combination of Libraries

Table 3. Summary of Titles Held in Common by Unique Combinations of Libraries (Spring 1971 tabulation)

Table 5. Distribution of Contributed Titles Published and Multiplicated by Decade (Titles acquired from 1969 to 1971)