lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "José Ramón Pérez Agüera" <jose.agu...@gmail.com>
Subject Re: Keyphrase Extraction
Date Tue, 08 May 2007 11:14:45 GMT
here you have a very good tool for Keyphrase Extraction. It is GNU and
easy to integrate in Lucene.

http://www.paynter.info/academia/Kea.php

best

jose

On 5/8/07, Bill Janssen <janssen@parc.com> wrote:
> Dawid Weiss wrote:
> > You could also try splitting the document into paragraphs and use Carrot2's
> > Lingo algorithm (www.carrot2.org) on a paragraph-level to extract clusters.
> > Labelling routine in Lingo should extract 'key' phrases; this analysis is
> > heavily frequency-based, but... you know, you may want to try it.
>
> Just to make sure I'm following...
>
> So you're suggesting splitting the document into paragraphs, then
> treating each paragraph as if it were a Carrot2 search result,
> performing the clustering, then looking at the label Lingo chooses for
> each cluster, and treating that label as the "key phrase"?
>
> Would DirectDocumentFeedExample be a good starting point?
>
> Bill
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


-- 
José Ramón Pérez Agüera

Dept. de Ingeniería del Software e Inteligencia Artificial
Despacho 411 tlf. 913947599
Facultad de Informática
Universidad Complutense de Madrid

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message