A Simple Scheme for Book Classification Using Wikipedia
Because documents are being generated faster than librarians can catalog them, an accurate, automated scheme of subject classification is desirable. However, simplistic word-counting schemes miss many important concepts; librarians must enrich algorithms with background knowledge to escape basic problems such as polysemy and synonymy. I have developed a script that uses Wikipedia as context for analyzing the subjects of nonfiction books. Though the method is simple and was built quickly from freely available parts, it is partially successful, which suggests that such an approach holds promise for future research.
Copyright (c) Andromeda Yelton
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Authors who submit to Information Technology and Libraries agree to the Copyright Notice.