lucene-java-user mailing list archives

From Marcel Stor <mar...@frightanic.com>
Subject RE: Document Clustering
Date Tue, 11 Nov 2003 19:05:30 GMT
Stefan Groschupf wrote:
> Hi,
> > How is document clustering different/related to text categorization?
> 
> Clustering: the algorithm finds its own categories and puts matching
> documents into them. Documents with minimal distance to each other
> are grouped together.

Would I be correct to say that you have to define a "distance threshold"
parameter that decides when a new category is created for a group of
documents?
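
To make the question concrete, here is a minimal sketch of what I have in
mind, assuming documents are already represented as equal-length
term-weight vectors and that Euclidean distance is the measure. The class
name, method names and the threshold value are my own illustration, not
anything from Lucene. Each document joins the closest existing group if
it lies within the threshold; otherwise it starts a new group. Not every
clustering algorithm works this way (k-means, for example, fixes the
number of groups instead of a threshold).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-pass ("leader") clustering sketch; not a Lucene API.
public class ThresholdClustering {

    // Euclidean distance between two equal-length term-weight vectors.
    public static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Returns the clusters as lists of document indices.
    // Assumption: the first document of a cluster acts as its "leader",
    // and every later document is compared against the leaders only.
    public static List<List<Integer>> cluster(double[][] docs, double threshold) {
        List<List<Integer>> clusters = new ArrayList<>();
        List<double[]> leaders = new ArrayList<>();
        for (int i = 0; i < docs.length; i++) {
            int best = -1;
            double bestDist = Double.MAX_VALUE;
            // Find the closest existing leader.
            for (int c = 0; c < leaders.size(); c++) {
                double d = distance(docs[i], leaders.get(c));
                if (d < bestDist) {
                    bestDist = d;
                    best = c;
                }
            }
            if (best >= 0 && bestDist <= threshold) {
                // Close enough: join the existing cluster.
                clusters.get(best).add(i);
            } else {
                // Too far from every leader: start a new cluster.
                leaders.add(docs[i]);
                List<Integer> members = new ArrayList<>();
                members.add(i);
                clusters.add(members);
            }
        }
        return clusters;
    }
}
```

With this scheme a larger threshold yields fewer, broader groups and a
smaller one yields more, tighter groups, and the result depends on the
order in which documents arrive.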

> Classification: you already have categories and sample documents
> for each, which help you match other documents.
> You calculate a document's distance to each existing category
> and put it in the category with the smallest distance.
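
If I read the classification description correctly, it could be sketched
roughly as below. This is only my interpretation: I assume each category
is summarized by the centroid (mean vector) of its sample documents and
that Euclidean distance is used. The class and method names are
hypothetical and not part of Lucene.

```java
import java.util.Map;

// Hypothetical nearest-centroid classifier sketch; not a Lucene API.
public class NearestCentroidClassifier {

    // Mean vector of a category's sample documents.
    // Assumption: samples is non-empty and all vectors share one length.
    public static double[] centroid(double[][] samples) {
        double[] mean = new double[samples[0].length];
        for (double[] sample : samples) {
            for (int i = 0; i < mean.length; i++) {
                mean[i] += sample[i];
            }
        }
        for (int i = 0; i < mean.length; i++) {
            mean[i] /= samples.length;
        }
        return mean;
    }

    // Euclidean distance between two equal-length vectors.
    public static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Returns the name of the category whose centroid is closest to doc,
    // or null if no categories were supplied.
    public static String classify(double[] doc, Map<String, double[]> centroids) {
        String best = null;
        double bestDist = Double.MAX_VALUE;
        for (Map.Entry<String, double[]> e : centroids.entrySet()) {
            double d = distance(doc, e.getValue());
            if (d < bestDist) {
                bestDist = d;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

Unlike the clustering case, no threshold is needed here: every document
ends up in whichever existing category is nearest.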

Regards,
Marcel

