nutch-user mailing list archives

From chad savage <csav...@activeathletemedia.com>
Subject classifying content
Date Tue, 05 Dec 2006 06:01:42 GMT
Hello All,

I'm doing some research on how to classify documents into pre-defined 
categories.
The methods I have come across are ontologies, topic maps, URL/site-based
classification, and simple keyword analysis.
I'm leaning towards topic maps and ontologies, since they seem to be the
strongest and best documented, both in theory and in practice.
Does the group have any recommendations on where to start?
Are there software packages that help develop the OWL/RDF files, such as Protégé?
Are there any consultancies that handle this process?
What are the downsides of using these approaches?
And finally, how would I integrate them into Nutch/Lucene?

Thanks in advance,
Chad

