One of the clustering algorithms has a patch that should have some
at-least-ok key phrase extraction. Shashi was digging into that.

On Tue, Sep 22, 2009 at 11:54 PM, Isabel Drost wrote:

> On Sun, 20 Sep 2009 15:58:05 -0700 (PDT)
> jakobitsch juergen wrote:
>
> > - could someone point me into the right direction for basic intro
> > into keyphrase extraction using mahout
>
> So far I am not aware of people using Mahout for key-phrase extraction.
> I am not myself familiar with kea but know of people on-list having
> experience with kea. Would be interesting to read their comments as
> well.
>
> There are a few publications available online that deal with training a
> classifier to identify key-phrases. Basically the general approach
> would be to first manually label a set of "good" and "bad" phrases.
> Next one would have to come up with features describing these phrases
> and training a classifier on this example set.
>
> Mahout does come with a set of classification algorithms, but has no
> full-featured end-to-end solution for the problem of key-phrase
> identification. You would still need to manually label example phrases,
> come up with a good feature set and integrate the Mahout training
> algorithms into your software.
>
> Isabel
>

--
Ted Dunning, CTO
DeepDyve
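
The general approach described in the quoted reply (label "good" and
"bad" phrases by hand, derive features for each phrase, train a
classifier on those examples) can be sketched roughly as below. This is
only an illustration in plain Python, not Mahout code: the toy document,
the two features (occurrence count and relative position of the first
occurrence) and the nearest-centroid classifier are all assumptions made
for the example. In a real setup one of Mahout's classification
algorithms would take the place of the centroid step.

```python
from statistics import mean

# Toy document and hand-labeled phrases; both are made up for illustration.
DOCUMENT = (
    "key phrase extraction finds the key phrase set of a document . "
    "a classifier can rank each candidate phrase . "
    "key phrase extraction is then a classification problem of some kind"
)
LABELED = [
    ("key phrase", "good"),
    ("key phrase extraction", "good"),
    ("of some", "bad"),
    ("is then", "bad"),
    ("can rank", "bad"),
]


def tokenize(text):
    """Lowercase and split on whitespace (deliberately naive)."""
    return text.lower().split()


def phrase_features(phrase, tokens):
    """Return (occurrence count, relative position of first occurrence)."""
    words = tokenize(phrase)
    n = len(words)
    hits = [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words]
    if not hits:
        # Phrase absent: zero occurrences, "first seen" at the very end.
        return (0.0, 1.0)
    return (float(len(hits)), hits[0] / len(tokens))


def train_centroids(labeled, tokens):
    """Average the feature vectors of each label (nearest-centroid training)."""
    centroids = {}
    for label in {lab for _, lab in labeled}:
        vectors = [phrase_features(p, tokens) for p, lab in labeled if lab == label]
        centroids[label] = tuple(mean(column) for column in zip(*vectors))
    return centroids


def classify(phrase, tokens, centroids):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    feats = phrase_features(phrase, tokens)

    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(feats, centroid))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))


tokens = tokenize(DOCUMENT)
centroids = train_centroids(LABELED, tokens)
for candidate in ("phrase extraction", "some kind"):
    print(candidate, "->", classify(candidate, tokens, centroids))
```

The labeling and the feature design are still the manual part, as the
quoted reply points out; swapping in a different classifier only changes
train_centroids and classify.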