SHAWNEE MISSION'S ON-LINE CATALOGING SYSTEM

An on-line cataloging pilot project for two elementary schools is discussed. The system components are 2740 terminals, upper-lower-case input, IBM's FASTER generalized software package, and usual cards/labels output. Reasons for choosing FASTER, software and hardware features, operating procedures, system performance and costs are detailed. Future expansion to cataloging 100,000 annual K-12 acquisitions, on-line circulation, retrospective conversion, and union book catalogs is set forth.


INTRODUCTION
The Shawnee Mission Public Schools, serving several affluent suburbs of the Kansas City metropolitan area, began automated library operations in 1968. As the school district's Computer Center was then equipped with a 1401 computer and tape/disk store, the first library system was designed for batch ordering and cataloging. Later, a batch circulation system for three of the school district's fourteen secondary libraries was started. Library automation in that period was similar to that described by Scott (1) and Auld (2).
Two years saw a profound change in the Shawnee Mission School District. By unification, it had added 50 elementary schools and a new high school, making a total of 65 schools, all of which had libraries. At the school district's Computer Center, the configuration had passed through the 360/30 stage to a 360/40; 2314 disk packs were on order; and 2741 terminals, using IBM's Remote Access Computational System (RAX), had been installed at all five high schools for computer science courses.
While the batch library system could handle the 28,000 items ordered and cataloged annually up to that point, it was impossible to justify using it for an estimated 100,000 annual acquisitions needed by 65 libraries.
Computer time to process AUTOCODER programs on a mod/40 would be excessive; the librarians desired many improvements (upper- and lowercase I/O; longer fields; shortened time to process items; and more accurate data on cards and labels).
The need for data processing and library improvements resulted in rethinking of the approach to ordering, cataloging, and circulation. Naturally, on-line processing came to mind. IBM 2741's for computer science courses had given management and data processing staff some experience with a dedicated on-line system; the 360/40 and 2314 disks would support large files, indexed sequential file organization, and multiprogramming (simultaneous use of the CPU for terminal and batch jobs).
The experiences of Stanford and University of Chicago (3) and IBM (4) pointed out that on-line systems could be built for larger and more complex organizations than Shawnee Mission, where the collections are 95% English language and the system covers only books and audio-visual items. Cataloging is based on title-page information; tools used are Children's Catalog, Sears, N.U.C., A. A. Rules, and other standard works.
Also very important was the fact that the Computer Center management wanted experience in multiprogramming prior to considering it for student scheduling, student records, payroll and business functions.
A proposal was made to library and data processing management by the Library Systems Analyst in mid-December 1969 that on-line cataloging in multiprogramming mode be begun by mid-March 1970 for two elementary schools on a test basis. An improved batch order system using COBOL programs was also proposed. Finally, it was suggested that a carefully designed cataloging system could include fields to be used later for circulation control.
The specific purposes of the on-line cataloging pilot project were 1) to test whether direct access to master disk files is an efficient, accurate, and economical method of creating and updating bibliographic and holdings data; and 2) to allow data processing management to ascertain if multiprogramming is feasible and practical at this time locally.
A search of library literature revealed no on-line systems for cataloging and circulation functions; rather, either circulation or order functions were real time. Moreover, truly on-line systems were rare; Columbia had designed a circulation system that could be used in that mode, but as of October 1968, was operating batch (5). Chicago's Book Processing System does input data on line, although ordering and cataloging functions are performed off-line (6). BELLREL is an on-line circulation system (7).
Comparing the circumstances of the above institutions with that of Shawnee Mission School District brought out one sterling difference: the latter had no grant money nor huge parent institution upon which to rely. Rather, it had a modest hardware-software configuration, a need to be operational within three months if the two test librarians were to see any output by the end of the school year, and a small team of data processors and librarians devoted to redesign and implementation.

METHODS AND MATERIALS
Having earlier seen demonstrations of the Kansas City, Missouri, Police Department's FASTER system, with its on-line access to constantly updated alphanumeric files, the senior systems analysts turned to IBM for further information. The police department's system was based on a software package developed in Alameda, California, for law enforcement. It was also available in a general form called FASTER (Filing and Source Data Entry Techniques for Easier Retrieval). The proven ability of this system to accept on-line data via 2740 terminals and to display it on 2260 CRT's, its ease of adaptation to user requirements, the quickness with which analysts and programmers had learned to use it at the police department, and a local, positive experience decided the issue. In mid-January 1970, FASTER was chosen as the software framework for on-line cataloging.

Software
FASTER has been programmed in modular form, with each module performing a particular task (8). Modules supporting functions that vary because of hardware must be coded by the user. This coding is done in macro form (brief program statements in higher level language which generate many machine instructions) and therefore is not a tedious task.
One of the hardest, most complicated portions of implementing a teleprocessing system is programming the support from the CPU to the terminal; with FASTER, this took about a day. The macros use Basic Teleprocessing Access Method (BTAM) support.
With line support taking little time, the user may spend more effort on his own processing needs. The user may have only a query or an update requirement; Shawnee Mission needed both. Because FASTER is a modular system, the user is permitted to describe each of his needs as a transaction. This transaction must be programmed as a TPD (Transaction Processing Description) using macros. Coding and listing time for a TPD will vary with the processing description.
Those familiar with detailed programming will note that the programmer does not have to concern himself with I/O. The TPD will prepare the data for output, and the FASTER interface module will handle I/O. Some of the major functions supported by the macro language include: 1) retrieval of records from indexed sequential (ISAM) files, that is, files accessed only through hierarchic indexes; 2) modifications and additions to ISAM files; 3) data manipulation; 4) formatting of responses to selected terminals; 5) message switching; and 6) recording audit data on a logging file. FASTER under DOS requires fixed-length records; this has been modified under the OS version.
Retrieval from the ISAM files required for processing a given transaction may be performed in one of three ways: 1) retrieval of a unique record, 2) sequential retrieval of a specified number of records from a logical grouping, or 3) retrieval of a specified number of records from a logical grouping, in which the retrieved records represent the best qualified from the group based upon the user's selection criteria.
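The three retrieval modes can be illustrated with a small in-memory sketch. This is not FASTER's macro language or ISAM itself: the class `IndexedFile`, its method names, the use of a key prefix to stand for a "logical grouping", and the caller-supplied `score` function are all assumptions made only for illustration.

```python
from bisect import bisect_left


class IndexedFile:
    """Toy stand-in for a keyed file: records are kept sorted by key."""

    def __init__(self, records):
        # records: iterable of (key, data) pairs
        self._records = sorted(records, key=lambda r: r[0])
        self._keys = [key for key, _ in self._records]

    def get_unique(self, key):
        """Mode 1: the single record with exactly this key, or None."""
        i = bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._records[i]
        return None

    def get_sequential(self, prefix, limit):
        """Mode 2: up to `limit` records, in key order, whose key starts with `prefix`."""
        i = bisect_left(self._keys, prefix)
        found = []
        while (i < len(self._keys)
               and self._keys[i].startswith(prefix)
               and len(found) < limit):
            found.append(self._records[i])
            i += 1
        return found

    def get_best(self, prefix, limit, score):
        """Mode 3: the `limit` highest-scoring records of the group."""
        group = self.get_sequential(prefix, len(self._keys))
        return sorted(group, key=score, reverse=True)[:limit]
```

In this sketch the third mode simply ranks the whole group with the caller's scoring function and keeps the top entries; how FASTER evaluates "best qualified" internally is not described in the text.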

Hardware
Hardware supported by the FASTER system is as follows: [table of supported machines missing in source]

File organization and access
FASTER supports ISAM files only (as data sets) with the exception of the logging file; the logging device must be sequential. FASTER's support of disk files is accomplished by using the same software modules that AL and COBOL use in maintaining ISAM files. Therefore, standard programming languages may be used for creating files and data retrieval. Shawnee Mission chose COBOL as its main language and found it to complement the FASTER system.

Files
The batch library system was based on a 400-character record, repeated in its entirety for every copy in each library. This space consumption for redundant information was undesirable in a system with 65 collections, and therefore two basic files were designed. The first is the disk title file, containing one record with bibliographic information for each unique title in the school district. Its fields include author, title, dates, subject headings, annotation, etc. (Table 1). Each record is 562 characters long. In distinction, the disk copy file contains a 56-character record for each copy of a title, comprised of fixed-length fields for building number, special funding, volume information, and circulation control. Copy and title records are linked through the title number.
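The space argument can be checked with simple arithmetic. The sketch below uses only the record lengths given above (400, 562, and 56 characters); the function names and the three-copy example are illustrative assumptions, and index and disk overhead are ignored.

```python
BATCH_RECORD = 400  # characters stored per copy in the batch system
TITLE_RECORD = 562  # characters stored once per unique title
COPY_RECORD = 56    # characters stored per copy


def batch_size(copies):
    """Characters used when the full record is repeated for every copy."""
    return copies * BATCH_RECORD


def split_size(titles, copies):
    """Characters used by the separate title file and copy file."""
    return titles * TITLE_RECORD + copies * COPY_RECORD


# One hypothetical title held in three libraries:
print(batch_size(3))     # 1200
print(split_size(1, 3))  # 730
```

The split design costs more for a title held in only one library (618 versus 400 characters), so the saving depends on how often titles are duplicated across collections.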
The third file is the ISAM title index, comprised of records with a phonetic code and key for each title record. This file is called up by a terminal transaction containing title; the incoming phonetic code for the title is matched with any equal ones on the index. For matches, bibliographic data is pulled from the title file and typed on the terminal. The main function of the title index is to determine duplicates.
Tests on 45,711 title records showed that a 16-character phonetic code resulted in a maximum of 36 different titles having the same phonetic code. The 16-character code chosen consists of the first character of title followed by numeric values for most consonants.
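A code of the kind described might be built as in the following sketch. The article does not give the actual consonant table, so the Soundex-style digit groups, the skipping of H, W, and Y, and the zero padding to 16 characters are all assumptions chosen only to make the idea concrete.

```python
# Hypothetical consonant-to-digit table (Soundex-style groups); the
# article does not publish the table actually used.
CONSONANT_DIGITS = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CONSONANT_DIGITS[ch] = digit

CODE_LENGTH = 16


def phonetic_code(title):
    """First letter of the title, then digits for the coded consonants."""
    letters = [c for c in title.upper() if c.isalpha()]
    if not letters:
        return "0" * CODE_LENGTH
    code = letters[0]
    for c in letters[1:]:
        digit = CONSONANT_DIGITS.get(c)  # vowels and H, W, Y have no digit
        if digit:
            code += digit
    # Truncate or zero-pad to the fixed key length (padding is an assumption).
    return code[:CODE_LENGTH].ljust(CODE_LENGTH, "0")
```

Because punctuation and vowels are ignored, small keying differences such as a missing apostrophe still produce the same key, which is what makes such a code useful for catching duplicate titles.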

The On-Line Cataloging System

Input
Recognizing that the pilot project might be expanded into a full-scale operation, librarians drew up procedures for entering data from either shelf list cards or new arrivals. Conversion from shelf lists required that cards be edited to eliminate confusing information and to add implicit data.
For new acquisitions, most information needed by the terminal operator is annotated on the title page and its verso. A grid sheet to be slipped into the book contains subject headings, added entry, annotation and some other fields. All of these practices were set forth in a user's manual (9), along with instructions on how to enter transactions into the disk files.
Limits on the input buffer permit a maximum of 120 characters to be transmitted by any transaction, which means that several transactions are required to add all cataloging and location data necessary to describe a new title.
There are two sets of transactions. The LT series adds to and updates records in the title file; the LC series does the same for the copy file. For instance, entering a new title requires an LT01 transaction to start the record and assign a title number, one or more LT02's to complete cataloging data, and an LC01 for building assignment. Operators find transactions easy to key and understand.
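The effect of the 120-character limit on a new title can be sketched as follows. The transaction layout used here (a four-character code, a space, then data) and the function name `plan_new_title` are hypothetical; the article does not describe the real message format, and the real system divides data by field rather than by raw character count.

```python
MAX_CHARS = 120  # input buffer limit per transaction, from the article


def plan_new_title(cataloging_data, building):
    """Return the transaction strings needed to enter one new title."""
    room = MAX_CHARS - len("LT01 ")  # data space left after the code
    chunks = [cataloging_data[i:i + room]
              for i in range(0, len(cataloging_data), room)] or [""]
    transactions = ["LT01 " + chunks[0]]                   # start the title record
    transactions += ["LT02 " + c for c in chunks[1:]]      # remaining cataloging data
    transactions.append("LC01 " + building)                # building assignment
    return transactions
```

With 300 characters of cataloging data this sketch yields one LT01, two LT02's, and one LC01, which matches the pattern described above of several transactions per new title.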
By category, LC and LT transactions set up new records, add on fields, update fields, delete or activate records, and query the contents of a specified record. These transactions are a simple, understandable, and powerful method of maintaining library files. Several transactions also add data to fields, thus saving the operator keying time. For instance, the Cutter number is automatically derived from the first three letters of the author's last name, unless specifically superseded by the operator. Also, "F" is assigned to Dewey for all items unless replaced by another classification.
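The two defaults just described can be expressed in a few lines. This is a sketch only: the "Last, First" author format, the upper-casing of the result, and the function names are assumptions, not details taken from the FASTER programs.

```python
def cutter_number(author, override=None):
    """First three letters of the author's last name unless the operator overrides."""
    if override:
        return override
    last_name = author.split(",")[0]  # assumes "Last, First" entry form
    letters = [c for c in last_name if c.isalpha()]
    return "".join(letters[:3]).upper()


def dewey_class(value=None):
    """Default classification is "F" unless another class is keyed."""
    return value if value else "F"


print(cutter_number("White, E. B."))  # WHI
print(dewey_class())                  # F
```

Supplying defaults in the program rather than at the keyboard means the operator keys these fields only for the exceptions.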
Finally, a standard set of output, consisting of 1) two author cards, overprinted cards, a copy card, and 2) one three-up label, is assumed when an LC01 transaction is input to show location. If other outputs are needed, the operator uses an LC05 transaction to specify them. There are also several instances of data being input in lower case (to save time and buffer space for a shift) and being edited on output to upper case. The result of all these program aids is that the operator knows she is keying only important data; highly invariable fields are input and edited by the FASTER programs, saving operator time.

Output
Two basic card formats were chosen. The unit card contains all cataloging information; the copy card shows a library's holdings of a given title (Figures 1 and 2). A unit card and copy card (giving all cataloging and holdings information) go into the school's shelf list; the usual set of main and added entry unit cards goes into the public catalog.

Three-up labels for book pocket, spine, and charge card (Figure 3) are printed nightly, along with cards. Upon delivery from the Computer Center, the cards and labels are matched with the previous day's books; labels are applied and the item made ready for delivery. The set of cards for card catalog and shelf list are inserted in the item; the school librarian or staff member does local filing. The two test librarians benefited by having their two catalogs sorted and mass printed in sequence by the computer. Naturally, the IBM 360 sort leaves some problems which require hand filing, but sort program specifications reduced those problems to a minimum. The same method of mass printing was employed to make three-up labels for each item in those collections. They were printed in shelf list sequence in order to expedite matching labels with books.

IMPLEMENTATION AND OPERATION OF SYSTEM
Operation began intermittently the last week of April. Delays in receiving terminals and minor difficulties in installing them accounted for most problems. The combination of three tutoring sessions by the senior systems analysts, instructions in the user's manual, and actual practice on the terminals resulted in skilled, confident terminal operators within two weeks.
The first shelf list, containing 5,000 titles and 5,200 copies, was completed June 4, 1970. Work began immediately on keying the 12,500 copies in the second test school; it was completed on July 28. Of the 17,700 copies in the two schools, 10,270 were unique titles.
From the end of April until June 8, two terminals were up from 12 a.m.-5 p.m. each weekday. From June 8, they operated from 8 a.m.-5 p.m. daily; as of July 13, three terminals were operating from 8 a.m.-5 p.m. each weekday. The strains on staffing and training were commensurate with this expansion rate.
Since reality is never as neat as paper plans, it follows that the decision to expand the on-line system to all K-12 cataloging was made in June, when only the first test school had been completed. That expansion began in August. Also, additional terminals for on-line circulation were ordered during the summer. The circulation system, which uses fields in the copy record, is scheduled to begin during the winter of 1970 in three high schools. Finally, two new elementary collections of 3,500 titles each were keyed in.
Therefore, the summer, which had been intended for rather leisurely completion and evaluation of the pilot project, turned into a hectic season. Fortunately, the third terminal arrived a month early, helping in the data entry race; and computer time was forthcoming for massively printing eight card catalogs and two full label sets.

DISCUSSION
The outcome of the on-line pilot project, as measured by tasks required to set up the system, time for those tasks, costs, rate of data entry, quality of input and output, deserves discussion. Perhaps most relevant to the reader is system performance, to which this section is devoted.

System Set-up Time
Overall lead time of four or so months (January through April 1970) can be broken down further; however, no close record was kept on it.
Setting up the system required the full time for four months of two senior systems analysts; the library systems analyst spent five full-time months. Two programmers were assigned for 2.5 full months each. The Head of Central Library Processing devoted one man month to design; the Head of Data Entry (terminal operators) was assigned for five months full-time to the project.
Since there were no FASTER schools available in January, an IBM engineer who had worked extensively with the Kansas City, Missouri, Police Department gave group classes for all Data Processing staff and 48 hours of individual tutorials to the senior systems analysts.
Of the two programmers, one had used COBOL and PL/1 for two years; the other was a trainee. There was no discernible difference in their ability to learn and use FASTER. The same applied to the systems analysts.
The team approach allowed the senior analysts to concentrate on programs and hardware while the library analyst was busy designing work flow, procedures, and a user manual with the librarians. After those jobs were done, the two supervising librarians were able to train terminal operators. These complementary tasks allowed each staff to concentrate on jobs efficiently rather than in a fragmented fashion. It was the ultimate responsibility of the authors to coordinate all work.

Costs
A study was made of the cost to process items in the batch and on-line systems. This included creating and keying data, printing cards and labels, and preparing items for delivery. Costs in Table 2 are based on 1) tasks performed at Central Library Processing from arrival to mailing out to destination library, 2) library and data processing staff, 3) computer time prorated at 1/5 of hourly non-district rate, and 4) supplies. A merger of two separate groups and a 5½% salary increase account for the cost growth for library staff.
Table 3 shows costs for the 28,000 items processed in fiscal year 1969-70 and for an estimated 100,000 acquisitions to be handled in fiscal year 1970-71. Duplicates cost less in each system because they bypass certain work stations.
Further cost study will be needed after the K-12 cataloging system has been operative for several months.

Rate of Data Entry
Statistics are kept for all types of transactions input daily, by terminal. Comparing these statistics with norms revealed where the system was exceeding or falling behind expectations. The norms were based on a known duplicate rate of 2:1, duplicates to new titles; and on the fact that it takes two transactions to enter duplicates, but five to enter new titles. Table 4 shows the average daily rate of 120-character transactions established for each level of daily terminal hours, and norms. Note that about one quarter of scheduled hours was lost in June and July because of unavoidable downtime.
Translated into new books and duplicates, two terminals operating 8 a.m.-5 p.m. daily could handle 775 and 1,500 items, respectively, during a week. Changes in the duplicate ratio would of course change those figures. It will be instructive to compare these data with later results.
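The relation between transaction counts and items processed follows directly from the norms above (five transactions per new title, two per duplicate, and a 2:1 duplicate ratio). The sketch below is only a worked illustration of that arithmetic; the function names are invented, and Table 4's actual figures are not reproduced here.

```python
TXNS_PER_NEW_TITLE = 5
TXNS_PER_DUPLICATE = 2


def transactions_needed(new_titles, duplicates):
    """Transactions required to enter a given mix of items."""
    return new_titles * TXNS_PER_NEW_TITLE + duplicates * TXNS_PER_DUPLICATE


def items_from_transactions(total, dup_ratio=2):
    """Split a transaction budget into (new titles, duplicates) at a fixed ratio."""
    # Each new title arrives with `dup_ratio` duplicates on average.
    per_group = TXNS_PER_NEW_TITLE + dup_ratio * TXNS_PER_DUPLICATE
    new_titles = total // per_group
    return new_titles, new_titles * dup_ratio


print(transactions_needed(775, 1500))  # 6875
```

On these norms the weekly figure of 775 new titles and 1,500 duplicates corresponds to 6,875 transactions; a lower duplicate ratio would raise the transaction cost per item, as the text notes.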

Output
Applying labels in the pilot project was unsatisfactory on two counts: the spine label was too narrow to fit well on easy and picture books, and the weak adhesive required two coats of liquid glass on spine labels to assure some permanency. Further, actually labelling and glassing the two pilot collections showed that the latter activity consumed about three-fifths of the man-hours expended. Therefore the three-up label was redesigned, with a broader spine label and a superstick adhesive. No more liquid glass was used, cutting out an expensive task. Labelling and glassing 5,200 items took 36½ man days. This included finding the correct book, replacing pockets when necessary, adding three labels, and applying two coats of glass.
Computer times to sort, set up, and print output for 5,200 items in the first school were:
15,400 cards for public catalog (dictionary sequence): 5'48"
10,400 cards for shelf list (Dewey-Cutter-author sequence): 1'30"
5,200 3-up labels (Dewey-Cutter-author sequence): 1'35"
Both catalogs had to be examined for cards beginning with "A", "An", and "The" as well as special characters (which don't file correctly for library purposes). About five man-days were spent handling the two card catalogs prior to delivery. This was considered excessive, and the sort was amended to ignore the three leading articles. Some hand filing is still needed, of course.
Once successfully installed, the terminals posed no problems. Very few program bugs developed, although several changes were made in the print format of the unit and copy cards at librarians' request. A few problems developed on print programs with obsolete extents when the title file ran to two disks, but that was easily remedied. Overall, the high quality of programs and terminals is unbelievable and for this high performance, librarians are indebted to the two senior systems analysts.
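The amended sort, which ignores the three leading articles, can be sketched as a filing key. This is an illustration in Python rather than the IBM 360 sort specification actually used; the function name `filing_key` and the lower-casing of titles are assumptions.

```python
LEADING_ARTICLES = ("a ", "an ", "the ")


def filing_key(title):
    """Sort key that drops one leading article, as library filing requires."""
    key = title.strip().lower()
    for article in LEADING_ARTICLES:
        if key.startswith(article):
            return key[len(article):]
    return key


titles = ["The Wind in the Willows", "A Wrinkle in Time",
          "An Otter's Story", "Bambi"]
for title in sorted(titles, key=filing_key):
    print(title)
# Bambi
# An Otter's Story
# The Wind in the Willows
# A Wrinkle in Time
```

A key of this kind handles the articles but not the special characters mentioned above, which is consistent with the remark that some hand filing is still needed.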

Quality of Data
After having lived for two years with a system that postponed proofreading until after cards and labels had been printed, the librarians insisted on quality control prior to printing any output. This means that the original keying is carefully examined for any content or typing errors. Operators examine one another's work and key either corrections or the verifying transaction; output can be printed only after records are verified as correct.
Currently, the head of Central Library Processing examines each day's cards and labels for errors; the rate is about three mistakes for every 100 card and label sets. Out of 5,200 label sets printed for the first test library's collection, only 76 sets contained errors.
Three major factors account for the low error rate: 1) ease of identifying and correcting errors, whether on hard copy, cards, or labels; 2) program responses that tell the operator of overflowed fields, some data errors, and missing data; and 3) trained, involved operators. The result has been high quality data that can be used for a variety of printing and query purposes.

Flexibility
While the first test library was being keyed, it was decided to design an on-line circulation system to be used by three high school libraries in January 1971. Fortunately, the copy record included fields for that purpose; it will pose no major programming problem to accept new transactions from 2740 terminals. Work on that module began October 1970.
One of those three schools still has many items not in the disk title and copy files; retrospective conversion for those 12,500 items must be completed before on-line circulation can begin (circulation requires that a new label with title and copy numbers be affixed to each item). Since the techniques for retrospective conversion were developed in February for the two test schools, this step will begin with only slight procedural adjustments and no programming changes. Massively printed labels will be prepared for all 20,000 items in this school library; a public catalog and shelf list will also be sorted and printed sequentially. This means that the burden of having shelf list cards thoroughly edited prior to keying is offset by Central Library Processing staff inputting the data, by computer programs to sort and print labels and catalogs, and by using tested and revamped procedures.
The pilot project system designed for cataloging data is flexible. Although specifically planned for two elementary schools, it was designed to handle all cataloging input for 65 K-12 libraries. Originally intended for cataloging data and output, it includes circulation fields that will soon be used. It was initially created for two terminals; four are now operating, and as many as 30 could be added with slight hardware or program modifications. Currently operating on a 360/40, it could also be run on a 360/50. The FASTER library system operates in a core partition, allowing batch and other jobs to be processed simultaneously. Finally, it is understood and operated at many levels by analysts, programmers, librarians, and clerks.
The ease with which FASTER was learned by the data processing staff, the speed with which the system was built, its growth capability, the additional functions that can be hung onto it, and the confidence that librarians have in its usefulness testify to its simplicity and power.

CONCLUSION
Were the specific purposes of the on-line cataloging pilot project met? The librarians are satisfied that direct access to master library disk files is an efficient, accurate, and economical method of creating and updating bibliographic and holdings information. Terminals provide an easy way to add to and change files instantly; the FASTER programs speedily manipulate and output desired data; the system has worked dependably since its first days; the ISAM title index provides undreamed-of independence from listings and card catalogs; and the work flow has been stripped of time-consuming steps.
The data processing management has found multiprogramming feasible and practical. Work is underway now to build on-line budget and payroll systems, using terminals and FASTER programs. APL, the student problem-solving system, was installed in the high schools in September and operates in a core partition. The 360/40, formerly dedicated to performing one job at a time, now supports three activities (FASTER, APL, and batch jobs) simultaneously. This economy is important to data processing managers.
Of equal interest to library administrators is Shawnee Mission's successful use of an existing software package. Using IBM's FASTER software framework freed programmers from tedious I/O control; instead, they devoted their efforts to writing instructions for adding to, updating, and deleting from library disk files.
As Fussler noted in October, 1968, "There is ... an absence of certain badly needed general data management software packages to provide file organization, update, and retrieval capabilities desirable in library processing operations. Existing systems are considered prohibitively expensive in cost and core dedication requirements, and may demand total dedication of a time-shared machine for the data management activities." (10) Most libraries cannot afford to write their own software for on-line systems; packages like FASTER offer opportunities for real-time file maintenance that heretofore were the preserve of large, well-financed institutions. Designed for teleprocessing and multiprogramming, FASTER is ideally suited to installations now doing batch processing with a medium-size computer (such as the 360/40, 128 K core) but desiring partitioned operations.
The amount of traffic that a communications system will bear is of prime concern to the user. Some of the factors that affect traffic rates within FASTER are 1) number and type of terminal (line speed), 2) number of I/O queue buffers, 3) amount of updating and retrieval per transaction, 4) other jobs being processed by the CPU, and 5) CPU speed capacity.
Shawnee Mission found that simulation to determine traffic loads would be very difficult because an existing teleprocessing system (APL) runs simultaneously. Consultation with IBM and examination of the Kansas City Police Department's experience led to the decision that FASTER would easily handle the estimated daily peak of 3000 transactions now needed for file maintenance and query associated with cataloging and circulation control.
Six months of operations have shown that traffic capacity is more limited by disk I/O than by any other factor. On the average, 1.25 seconds is required for the CPU to process a transaction. For four terminals, a total of 10 seconds elapses between hitting the bid key and receiving a reply; that encompasses terminal and line transmission, queuing delays, and transaction processing.
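These figures allow a rough load estimate. The sketch below multiplies the 1.25-second average by the estimated daily peak of 3000 transactions and derives a simple upper bound per terminal from the 10-second turnaround; it ignores operator keying time and overlap between terminals, so it is an illustration of the arithmetic and not a capacity model from the article.

```python
CPU_SECONDS_PER_TXN = 1.25  # average processing time, from the article
PEAK_TXNS_PER_DAY = 3000    # estimated daily peak, from the article
RESPONSE_SECONDS = 10       # bid key to reply with four terminals


def cpu_minutes(transactions, seconds_each=CPU_SECONDS_PER_TXN):
    """Total processing time in minutes for a number of transactions."""
    return transactions * seconds_each / 60


def max_txns_per_terminal_hour(response_seconds=RESPONSE_SECONDS):
    """Upper bound on transactions per terminal-hour if no time were spent keying."""
    return 3600 // response_seconds


print(cpu_minutes(PEAK_TXNS_PER_DAY))  # 62.5
print(max_txns_per_terminal_hour())    # 360
```

On these assumptions the daily peak amounts to about an hour of transaction processing spread over the working day, which is consistent with the partition being shared with APL and batch jobs.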

Table 1.
Main Fields Input by Operators
Surviving table notes: For name or title. 2 If other than general funds. 3 For volume or other sequence number. 16 Kept only until labels and cards printed.

Table 4.
Daily Rate of Data Entry vs. Norms

Table 2.
Overall Costs for Batch and Terminal Cataloging Systems

Table 3.
Unit Costs for Processing Items in Two Automated Systems