Evaluation of Semi-Automatic Metadata Generation Tools: A Survey of the Current State of the Art

  • Jung-ran Park, Drexel University
  • Andrew Brenza, Drexel University


Assessment of the current landscape of semi-automatic metadata generation tools is particularly important considering the rapid development of digital repositories and the recent explosion of big data. Utilization of (semi)automatic metadata generation is critical in addressing these environmental changes and may be unavoidable in the future considering the costly and complex operation of manual metadata creation. To address such needs, this study examines the range of semi-automatic metadata generation tools (n=39) while providing an analysis of their techniques, features, and functions. The study focuses on open-source tools that can be readily utilized in libraries and other memory institutions. It also identifies the challenges and current barriers to implementation of these tools. The greatest difficulty is that most semi-automatic generation tools have been developed piecemeal and address only part of the problem of semi-automatic metadata generation, providing solutions for one or a few metadata elements but not the full range of elements. This indicates that significant local efforts will be required to integrate the various tools into a coherent working whole. Suggestions toward such efforts are presented for future developments that may assist information professionals with incorporating semi-automatic tools into their daily workflows.

Author Biographies

Jung-ran Park, Drexel University

Dr. Jung-ran Park

Editor, Journal of Library Metadata

Associate Professor

The College of Computing and Informatics

Drexel University

Andrew Brenza, Drexel University

Project assistant

College of Computing and Informatics


How to Cite
Park, J.-R., & Brenza, A. (2015). Evaluation of Semi-Automatic Metadata Generation Tools: A Survey of the Current State of the Art. Information Technology and Libraries, 34(3), 22-42. https://doi.org/10.6017/ital.v34i3.5889