lucene-java-user mailing list archives

From: Bill Janssen <jans...@parc.com>
Subject: Re: Keyphrase Extraction
Date: Tue, 08 May 2007 04:02:14 GMT

Dawid Weiss wrote:
> You could also try splitting the document into paragraphs and using Carrot2's
> Lingo algorithm (www.carrot2.org) at the paragraph level to extract clusters.
> The labelling routine in Lingo should extract 'key' phrases; this analysis is
> heavily frequency-based, but... you know, you may want to try it.

Just to make sure I'm following...

So you're suggesting splitting the document into paragraphs, then
treating each paragraph as if it were a Carrot2 search result,
performing the clustering, then looking at the label Lingo chooses for
each cluster, and treating that label as the "key phrase"?
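
To make the question concrete, here is a minimal sketch of the pipeline as I understand it. It does not call Carrot2 at all: the class and method names are hypothetical, and the "labelling" step is a crude stand-in that counts two-word phrases recurring across paragraphs, not Lingo's actual algorithm. In the real thing, each paragraph would presumably be fed to the clusterer as one document and the cluster labels read back.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: paragraphs play the role of "search results".
// The phrase counting below is a simple frequency stand-in, NOT Lingo.
public class ParagraphKeyPhraseSketch {

    /** Splits text into paragraphs separated by one or more blank lines. */
    public static List<String> splitParagraphs(String text) {
        List<String> paragraphs = new ArrayList<>();
        for (String p : text.split("\\R\\s*\\R")) {
            String trimmed = p.trim();
            if (!trimmed.isEmpty()) {
                paragraphs.add(trimmed);
            }
        }
        return paragraphs;
    }

    /** Counts, for each two-word phrase, the number of paragraphs containing it. */
    public static Map<String, Integer> bigramParagraphCounts(List<String> paragraphs) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String paragraph : paragraphs) {
            String[] words = paragraph.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+");
            Set<String> seen = new HashSet<>();
            for (int i = 0; i + 1 < words.length; i++) {
                if (words[i].isEmpty()) {
                    continue; // split() yields a leading empty token if text starts with punctuation
                }
                String bigram = words[i] + " " + words[i + 1];
                if (seen.add(bigram)) { // count each phrase once per paragraph
                    counts.merge(bigram, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    /** Returns phrases that occur in at least minParagraphs paragraphs. */
    public static List<String> candidatePhrases(String text, int minParagraphs) {
        List<String> result = new ArrayList<>();
        Map<String, Integer> counts = bigramParagraphCounts(splitParagraphs(text));
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= minParagraphs) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Lucene builds an inverted index.\n\n"
                + "The inverted index maps terms to documents.\n\n"
                + "Queries run fast.";
        System.out.println(candidatePhrases(text, 2));
    }
}
```

The paragraph splitting is the only part I would expect to keep; the phrase counting is where the clustering and label selection would go instead.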

Would DirectDocumentFeedExample be a good starting point?

Bill


