lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From manjula wijewickrema <manjul...@gmail.com>
Subject Re: Term/Phrase frequencies
Date Fri, 07 May 2010 03:48:02 GMT
Hi Erik,

Thanks for the reply. What I want to do is, to identify key terms and key
phrases of a document according to their number of occurences in the
document. Output should be the highest freequency words and (two or three
word) phrases. For this purpose can I use Lucene?

Thanks
Manjula

On Thu, May 6, 2010 at 6:09 PM, Erick Erickson <erickerickson@gmail.com>wrote:

> Terms are relatively easy, see TermFreqVector in the JavaDocs.
>
> Phrases aren't as easy, before you go there, though, what is the
> high-level problem you're trying to solve? Possibly this is an XY problem
> (see http://people.apache.org/~hossman/#xyproblem).
>
> Best
> Erick
>
> On Thu, May 6, 2010 at 6:39 AM, manjula wijewickrema <manjula53@gmail.com
> >wrote:
>
> > Hi,
> >
> > I am new to Lucene. If I want to know the term or phrase frequency of an
> > input document, will it be possible through Lucene?
> >
> > Thanks,
> > Manjula
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message