lucene-java-user mailing list archives

From "Tate Avery" <tate.av...@nstein.com>
Subject RE: Document Clustering
Date Tue, 11 Nov 2003 15:58:49 GMT
Categorization typically assigns documents to a node in a pre-defined taxonomy.

In clustering, however, the 'structure' is emergent: the clusters (which
are analogous to taxonomy nodes) are created dynamically from the content of the documents
at hand, rather than being defined in advance.
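
To make the distinction concrete, here is a rough sketch in Python. It is not Lucene code and not how any particular product works: the function names, the toy taxonomy, the sample documents, the Jaccard term-overlap measure, and the 0.2 threshold are all assumptions chosen for illustration. The clustering half uses simple single-pass "leader" clustering, which is only one of many possible techniques.

```python
from typing import Dict, List, Set


def tokens(text: str) -> Set[str]:
    """Lowercase whitespace tokenization; a stand-in for a real analyzer."""
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Fraction of terms two term sets share (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def categorize(doc: str, taxonomy: Dict[str, Set[str]]) -> str:
    """Categorization: choose the best-matching node of a FIXED taxonomy.

    Ties (including "no overlap at all") fall to the first node listed.
    """
    terms = tokens(doc)
    return max(taxonomy, key=lambda node: len(terms & taxonomy[node]))


def cluster(docs: List[str], threshold: float = 0.2) -> List[List[str]]:
    """Clustering: groups are created on the fly from the documents themselves.

    Single-pass "leader" clustering: a document joins the first cluster whose
    first member is similar enough, otherwise it starts a new cluster.
    """
    clusters: List[List[str]] = []
    for doc in docs:
        for members in clusters:
            if jaccard(tokens(doc), tokens(members[0])) >= threshold:
                members.append(doc)
                break
        else:
            # No existing cluster was close enough, so a new one emerges.
            clusters.append([doc])
    return clusters


# Hypothetical sample data, invented for this sketch.
taxonomy = {
    "search": {"lucene", "index", "query", "search"},
    "cooking": {"recipe", "baking", "oven"},
}
docs = [
    "lucene index search query",
    "search query parser lucene",
    "apple pie recipe baking",
    "baking bread recipe oven",
]

labels = [categorize(d, taxonomy) for d in docs]  # nodes known up front
groups = cluster(docs)                            # groups discovered from content
```

Note that categorize() can only ever return "search" or "cooking", because those nodes were defined before any document was seen, whereas cluster() is given no labels at all and the number of groups depends entirely on the input documents and the threshold.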


-----Original Message-----
From: petite_abeille [mailto:petite_abeille@mac.com]
Sent: Tuesday, November 11, 2003 10:50 AM
To: Lucene Users List
Subject: Re: Document Clustering


Hi Otis,

On Nov 11, 2003, at 16:41, Otis Gospodnetic wrote:

> How is document clustering different/related to text categorization?

Not that I'm an expert in any of this, but clustering is a much more 
"holistic" approach than categorization. Usually, categorization is 
understood as a more precise endeavor (e.g. dmoz.org), while clustering 
is much more "fuzzy" and non-deterministic. Both try to achieve the 
same goal though. So perhaps this is just a question of jargon.

I'm confident that the owner of this site could help shed some light 
on the finer points of clustering vs categorization:

http://www.lissus.com/resources/index.htm

Cheers,

PA.


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org



