mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Large scale social network clustering w/ LDA
Date Sun, 18 Sep 2011 19:10:15 GMT
Mahout's LDA is not widely used.  I doubt that this scale is a problem for
LDA in general.

For performance and production use right now, I would recommend Yahoo's
LDA.  See
http://blog.smola.org/post/6359713161/speeding-up-latent-dirichlet-allocation

Mahout's SVD code can handle a problem of this size easily, but SVD isn't at
all the same as LDA ... the input is just kind of the same shape.
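
To put rough numbers on "this size" and "the same shape": both LDA and SVD
would take a sparse user-by-relationship matrix as input. The sketch below is
a back-of-envelope estimate only. The function names are hypothetical, the
three-edge sample is made up for illustration, and the 12 bytes per stored
entry (a 4-byte column index plus an 8-byte value, ignoring row pointers and
any framework overhead) is an assumption, not a measurement of Mahout or
Yahoo's LDA.

```python
from collections import Counter, defaultdict


def edges_to_documents(edges):
    """Group (user, followed) pairs into one bag of terms per user.

    Each user becomes a "document" and each relationship a "term",
    as in the quoted question.
    """
    docs = defaultdict(Counter)
    for user, followed in edges:
        docs[user][followed] += 1
    return docs


def estimate_nonzeros(num_docs, avg_terms_per_doc):
    """Number of stored entries in the sparse document-term matrix."""
    return num_docs * avg_terms_per_doc


def estimate_bytes(nonzeros, bytes_per_entry=12):
    """Raw storage estimate; bytes_per_entry is an assumed figure."""
    return nonzeros * bytes_per_entry


# Tiny made-up edge list, only to show the representation.
edges = [("alice", "bob"), ("alice", "carol"), ("bob", "carol")]
docs = edges_to_documents(edges)

# Scale from the question: 25M documents at about 200 terms each.
nnz = estimate_nonzeros(25_000_000, 200)
size_gb = estimate_bytes(nnz) / 1e9

print(f"nonzero entries: {nnz:,}")
print(f"approx. raw size: {size_gb:.0f} GB")
```

Under those assumptions the matrix has about 5 billion nonzero entries, so
the raw data is too large for a typical single machine's memory but is an
ordinary size for a distributed job.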


On Sun, Sep 18, 2011 at 4:53 AM, Timmy Wilson <timmyt@smarttypes.org> wrote:

> Hi,
>
> I'm considering using LDA to cluster a large social graph.
>
> Users are documents, relationships are terms --
> http://www.machinedlearnings.com/2011/03/lda-on-social-graph.html
>
> I want to scale to 25M documents at 200 terms/doc (on average).
>
> Is this reasonable? Are there examples of LDA usage at this scale?
>
> Thanks,
> Timmy Wilson
>
