A Text Categorization Technique based on a Numerical Conversion of a Symbolic Expression and an Onion Layers Algorithm

Authors

  • Marios Poulos, Department of Archives and Libraries Science, Ionian University
  • Sozon Papavlasopoulos, Department of Archives and Libraries Science, Ionian University
  • Vasilios Chrissikopoulos, Department of Archives and Libraries Science, Ionian University

Abstract

The dramatic increase in the amount of content available in digital form has given rise to large-scale digital libraries targeted at millions of users. As a result, it has become necessary to categorize large texts (documents). This paper develops a novel method in which text categorization is achieved by reducing the original data through a numerical conversion of a symbolic expression and an onion layers algorithm. Three different semantic categories were considered, and five texts were selected from each category and submitted to a text categorization procedure using the proposed method. The results and the statistical evaluation of this procedure showed that the proposed method may be characterized as highly accurate for text categorization purposes.
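
The abstract names two steps, a numerical conversion of text and an onion layers algorithm, without giving details. As a rough illustration only, the following Python sketch shows one way such a pipeline could look. The encoding used here (each character mapped to the point formed by its position and its Unicode code point) and the function names text_to_points, convex_hull and onion_layers are assumptions made for this example, not the authors' published procedure. Onion layers are computed in the usual way, by repeatedly removing the convex hull of the remaining points.

```python
def text_to_points(text):
    """Assumed encoding: character i becomes the point (i, code point)."""
    return [(i, ord(ch)) for i, ch in enumerate(text)]


def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def onion_layers(points):
    """Peel successive convex hulls until no points remain."""
    remaining = set(points)
    layers = []
    while remaining:
        hull = convex_hull(list(remaining))
        layers.append(hull)
        remaining -= set(hull)
    return layers


if __name__ == "__main__":
    layers = onion_layers(text_to_points("digital libraries"))
    for depth, layer in enumerate(layers, start=1):
        print(depth, layer)
```

In a sketch like this, the inner layers form a much smaller point set than the full text, which is one way to read the "reduction in the original data information" mentioned above. How the layers are actually compared across documents is not stated in the abstract.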

Published

2006-02-01