Library Cataloging
Automatic Classification
Managing Content with Automatic Document Classification by Rafael A. Calvo, Jae-Moon Lee and Xiaobo Li appears in the latest
Journal of Digital Information, vol. 5, no. 2.
The paper describes how machine learning and automatic document classification techniques can be used for managing large numbers of news articles or Web page descriptions, lightening the load on domain experts. The paper uses two datasets, one with more than 800,000 Reuters news stories and another with over 41,000 Web sites, and classifies them into predefined categories using a Naïve Bayes algorithm. We discuss the different parameters and design decisions that normally appear when building automatic classifiers, including stemming, stop-words, thresholding, amount of data, and approaches for improving performance using the structure in XML documents. The methodology developed would enable Web-based applications or workflow systems to manage information more efficiently, i.e. by assigning documents to topics automatically or assisting humans in the process of doing so.
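To make the design decisions named above more concrete, here is a minimal sketch of a multinomial Naïve Bayes text classifier with stop-word removal and a posterior threshold. It is not the authors' implementation: the class name `NaiveBayes`, the tiny `STOP_WORDS` set, the add-one smoothing, the default threshold of 0.5 and the toy training sentences are all assumptions chosen for illustration, and stemming is left out.

```python
import math
from collections import Counter, defaultdict

# Illustrative stop-word list (an assumption, far smaller than a real one).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "on"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop-words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


class NaiveBayes:
    """Hypothetical multinomial Naive Bayes with add-one smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # documents seen per category
        self.word_counts = defaultdict(Counter)  # word frequencies per category
        self.vocab = set()                       # all words seen in training

    def train(self, text, category):
        self.doc_counts[category] += 1
        for word in tokenize(text):
            self.word_counts[category][word] += 1
            self.vocab.add(word)

    def posteriors(self, text):
        """Return a normalized posterior probability for every category."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for category, n_docs in self.doc_counts.items():
            total_words = sum(self.word_counts[category].values())
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokens:
                count = self.word_counts[category][word]
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            log_scores[category] = score
        # Subtract the maximum before exponentiating for numerical stability.
        top = max(log_scores.values())
        weights = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(weights.values())
        return {c: w / norm for c, w in weights.items()}

    def classify(self, text, threshold=0.5):
        """Thresholding: keep only categories whose posterior is high enough."""
        scores = self.posteriors(text)
        return sorted(c for c, p in scores.items() if p >= threshold)


clf = NaiveBayes()
clf.train("stocks fall in markets", "finance")
clf.train("markets rally stocks", "finance")
clf.train("team wins match", "sports")
clf.train("match player goal", "sports")
print(clf.classify("stocks and markets"))
```

Raising the threshold trades coverage for precision: a document whose best category falls below it is left unassigned, which is one way such a system could hand uncertain cases to a human rather than classify them automatically.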
-
Information Design Models And Processes
Journal of Digital Information announces a special issue on Information Design Models and Processes (Volume 5, issue 2, August 2004). Articles include: R. Calvo, J. Lee, X. Li, Managing Content with Automatic Document Classification; X. Kong, L. Liu, D. Lowe...
-
Classification
Dynamic and hierarchical classification of Web pages by Ben Choi and Xiaogang Peng appears in Online Information Review (2004) v. 28, no. 2, pp. 139-147. Automatic classification of Web pages is an effective way to organise the vast amount of information...
-
Classification
Automatic Classification: Moving to the Mainstream by Robert Blumberg and Shaku Atre in DM Review Magazine. Together, enterprise search and classification provide an initial response to the information explosion. Classification complements search by enabling...
-
Subject Access
Auto-Categorization: Coming to a Library or Intranet Near You! by Tom Reamy. Simply put, automatic categorization is a new type of software that assigns documents to subject matter categories based on a wide variety of techniques. These techniques include...
-
Classification
Open Source classification software from OCLC. The Scorpion Open Source project offers software that implements a system for automatically classifying Web-accessible text documents. Scorpion is intended for use by investigators who have a machine-readable...