mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Canopy estimator
Date Sat, 12 May 2012 19:11:42 GMT
Yes.  It may help with variable scale.

The classic technique for dealing with that is to cluster with a small number
of clusters at a gross level and then separately cluster the documents that
belong to each of those large clusters.  This automatically adapts to
different scales.
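
A minimal sketch of that two-level idea, in plain Python rather than Mahout,
might look like the following. Everything in it is an assumption made for
illustration: the names kmeans and two_level_cluster are hypothetical, the
inner algorithm is ordinary Lloyd's k-means with a deterministic
farthest-first initialization, and the documents are toy 2-D points rather
than real term vectors.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; returns one cluster index per point.

    Initialization is farthest-first (an assumption chosen to keep the
    sketch deterministic): start from points[0], then repeatedly add the
    point farthest from all centers chosen so far.
    """
    k = min(k, len(points))
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))

    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def two_level_cluster(points, coarse_k, fine_k):
    """Cluster coarsely, then re-cluster the members of each coarse cluster.

    Returns a (coarse_label, fine_label) pair per point. Because every
    coarse cluster is re-clustered on its own, the fine clusters follow
    the local scale of that region.
    """
    coarse = kmeans(points, coarse_k)
    result = [None] * len(points)
    for c in sorted(set(coarse)):
        idx = [i for i, lab in enumerate(coarse) if lab == c]
        fine = kmeans([points[i] for i in idx], fine_k)
        for i, f in zip(idx, fine):
            result[i] = (c, f)
    return result


# Toy data: one dense region (spacing around 1) and one sparse region
# (spacing around 50), each containing two sub-groups.
docs = [
    [0.0, 0.0], [0.0, 0.1], [1.0, 0.0], [1.0, 0.1],
    [100.0, 100.0], [100.0, 101.0], [150.0, 100.0], [150.0, 101.0],
]
labels = two_level_cluster(docs, coarse_k=2, fine_k=2)
```

A single flat pass with one distance threshold would have to pick between a
value suited to the dense region and one suited to the sparse region. Here
the second pass only ever sees one region at a time, so no global scale has
to be chosen.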

The new post-0.7 clustering code would greatly facilitate your experimentation.

On Sat, May 12, 2012 at 11:19 AM, Pat Ferrel <pat@occamsmachete.com> wrote:

> If you are asking about using your post-0.7 clustering, no, I haven't yet.
> Will it help with varying scale? I assume by scale you mean the density of
> docs in certain areas of the vector space? One thing I am trying now is
> limiting the subject matter crawled and getting a much larger sample, which
> should get me a denser distribution.
