mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: newbie question: LSA analysis + others
Date Wed, 17 Jun 2009 02:21:00 GMT
As yet, no.

There is, however, an active project under way to implement LDA (Latent
Dirichlet Allocation).  This will give you "semantic" representations for
words, which could then be clustered.  We do have several clustering
algorithms that would be entirely sufficient for that step.
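
As a rough illustration of those two steps (semantic vectors first, then
clustering), here is a minimal Python sketch.  It does not use Mahout or any
Mahout API.  The topic weights are invented numbers standing in for what a
topic model such as LDA might produce, and the names word_topics, kmeans and
expand_query are hypothetical helpers written only for this example.  The
clustering is a simple k-means-style loop using cosine similarity.

```python
import math

# Hypothetical per-word topic weights, standing in for the output of a
# topic model such as LDA. The numbers are invented for illustration.
word_topics = {
    "red":     [0.90, 0.05, 0.05],
    "crimson": [0.85, 0.10, 0.05],
    "scarlet": [0.80, 0.10, 0.10],
    "dog":     [0.05, 0.90, 0.05],
    "puppy":   [0.10, 0.85, 0.05],
    "stock":   [0.05, 0.05, 0.90],
    "bond":    [0.10, 0.05, 0.85],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def kmeans(vectors, seeds, iterations=10):
    """k-means-style clustering with cosine similarity.

    `seeds` names the words whose vectors serve as initial centroids.
    Returns a dict mapping each word to a cluster index.
    """
    centroids = [list(vectors[w]) for w in seeds]
    assignment = {}
    for _ in range(iterations):
        # Assign every word to its most similar centroid.
        assignment = {
            w: max(range(len(centroids)), key=lambda i: cosine(v, centroids[i]))
            for w, v in vectors.items()
        }
        # Move each centroid to the mean of its current members.
        for i in range(len(centroids)):
            members = [vectors[w] for w, c in assignment.items() if c == i]
            if members:
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def expand_query(term, assignment):
    """Return, in sorted order, the term and all words sharing its cluster."""
    if term not in assignment:
        return [term]
    cluster = assignment[term]
    return sorted(w for w, c in assignment.items() if c == cluster)


assignment = kmeans(word_topics, seeds=["red", "dog", "stock"])
expanded = expand_query("red", assignment)
print(expanded)
```

With vectors like these, a search on "red" can be expanded to the other words
in its cluster before it is sent to the search engine.  How good the clusters
are in practice depends entirely on the quality of the word representations.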

It would be a very interesting addition to have some other language modeling
implementation as well.  The one that I would find interesting would be
something like a center embedded neural model such as was used in this
article by Ronan Collobert
<http://ronan.collobert.com/pub/matos/2008_nlp_icml.pdf>.
I would be very willing to advise on the implementation of such a beast, but
due to my normal level of over-commitment could only provide minimal direct
code contributions.

On Tue, Jun 16, 2009 at 7:04 PM, Paul Jones <paul_jonez99@yahoo.co.uk> wrote:

> 1. Take a set of words
> 2. Build clusters of these words, i.e., work out the semantic
> relationships between these words (I guess I could use WordNet as a
> starter), i.e., their inter-relationships
> 3. Once clusters of words have been formed, also work out the
> relationships between the clusters themselves.
>
> so in essence I could work out that red was similar to crimson, and hence
> a search on red would produce docs with crimson in them even though red was
> not mentioned.
>
> Would Mahout work here?
>
