Application of the Variety-Generator Approach to Searches of Personal Names in Bibliographic Data Bases--Part 1. Microstructure of Personal Authors' Names

Dirk W. Fokker, Michael F. Lynch


Conventional approaches to processing records of linguistic origin for storage and retrieval tend to regard the data as immutable. The data generally exhibit great variety and disparate frequency distributions, which are largely ignored and which entail either the storage of extensive lists of items or the use of complex numerical algorithms such as hash coding. The results in each case are far from ideal.

The variety-generator approach seeks to reflect the microstructure of data elements in their description for storage and search, and takes advantage of the consistency of statistical characteristics of data elements in homogeneous data bases.

This paper details the application of the variety-generator approach to the description of personal author names from the INSPEC data base by means of small sets of keys. It is shown that high degrees of partitioning of names can be obtained by key-sets generated from the initial characters of surnames, from the terminal characters of surnames, and from the initials.
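As a rough illustration of describing a name by keys, the Python sketch below assigns each name a surname-prefix key, a surname-suffix key, and its initials, then groups names that share a description. The key-sets `PREFIX_KEYS` and `SUFFIX_KEYS` and the functions `longest_key`, `describe`, and `partition` are hypothetical, invented for this example. They are not the key-sets derived in the paper, which are generated from the statistical characteristics of the INSPEC names, and longest-match selection is assumed here for simplicity.

```python
from collections import defaultdict

# Hypothetical key-sets, invented for illustration only. Real key-sets would
# be derived from the frequency statistics of the name file itself.
PREFIX_KEYS = {"S", "SC", "SM", "M", "MA", "MC", "L", "F"}
SUFFIX_KEYS = {"N", "ON", "ER", "R", "H", "CH", "S"}


def longest_key(text, keys, from_end=False):
    """Return the longest key that starts (or ends) the text, or "" if none."""
    for size in range(len(text), 0, -1):
        candidate = text[-size:] if from_end else text[:size]
        if candidate in keys:
            return candidate
    return ""


def describe(surname, initials):
    """Describe a name by a prefix key, a suffix key, and its initials."""
    s = surname.upper()
    return (
        longest_key(s, PREFIX_KEYS),
        longest_key(s, SUFFIX_KEYS, from_end=True),
        initials.upper(),
    )


def partition(names):
    """Group (surname, initials) pairs that share the same description."""
    groups = defaultdict(list)
    for surname, initials in names:
        groups[describe(surname, initials)].append((surname, initials))
    return dict(groups)
```

With these assumed key-sets, `describe("Lynch", "MF")` gives `("L", "CH", "MF")`. Names such as Smith, J and Smyth, J receive the same description and fall into the same group, which is the kind of residual ambiguity that a well-chosen key-set aims to keep small.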

The implications of the findings for computer-based bibliographical information systems are discussed.


Copyright (c) 2015 Information Technology and Libraries
