lucene-java-user mailing list archives

From Maria Vazquez <mvazq...@ova.st>
Subject Text categorization / classification
Date Wed, 27 Oct 2010 20:12:12 GMT
I need to auto-categorize a large number of documents, mostly news articles from
major news sources (nytimes, npr, abcnews, etc.). Any suggestions?
Lucene in Action suggests using a set of documents to build a vector for each category, then
comparing each document against those vectors and assigning it to the closest one.
The approach seems pretty simple (judging from other papers I've read on text categorization), but maybe
you guys know of something out there that already does this using Lucene/Solr.
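
For reference, the category-vector idea can be sketched in plain Java without any Lucene calls. This is only a rough illustration under my own assumptions, not the code from Lucene in Action: the class name CentroidCategorizer and all of its methods are hypothetical, the vectors are raw term counts with no stemming, stop words, or TF-IDF weighting, and "closest" is taken to mean highest cosine similarity.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: builds one term-count vector per category and
// assigns a document to the category with the highest cosine similarity.
public class CentroidCategorizer {

    // Lowercase the text, split on non-letters, and count each term.
    public static Map<String, Double> termVector(String text) {
        Map<String, Double> vector = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z]+")) {
            if (!token.isEmpty()) {
                vector.merge(token, 1.0, Double::sum);
            }
        }
        return vector;
    }

    // Sum the term vectors of all sample documents for one category.
    // Cosine similarity ignores vector length, so a sum works as well as a mean.
    public static Map<String, Double> categoryVector(List<String> sampleDocs) {
        Map<String, Double> sum = new HashMap<>();
        for (String doc : sampleDocs) {
            termVector(doc).forEach((term, count) -> sum.merge(term, count, Double::sum));
        }
        return sum;
    }

    // Euclidean length of a sparse vector.
    static double norm(Map<String, Double> v) {
        double squares = 0.0;
        for (double x : v.values()) {
            squares += x * x;
        }
        return Math.sqrt(squares);
    }

    // Cosine of the angle between two sparse vectors; 0.0 if either is empty.
    public static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / (normA * normB);
    }

    // Return the name of the category whose vector is closest to the document,
    // or null if no categories were supplied.
    public static String classify(String text, Map<String, Map<String, Double>> categories) {
        Map<String, Double> doc = termVector(text);
        String best = null;
        double bestScore = -1.0;
        for (Map.Entry<String, Map<String, Double>> e : categories.entrySet()) {
            double score = cosine(doc, e.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

To use it, build a map from category name to categoryVector(samples) once, then call classify(articleText, categories) for each incoming article. In a Lucene-based version the term counts would presumably come from the index rather than from a regex split.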
Thanks!
Maria
