lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Julien Nioche" <Julien.Nio...@lingway.com>
Subject Re: lucene for statistical analysis
Date Fri, 02 May 2003 10:03:35 GMT
Creating an index for Lucene is indeed a good idea ;-)

It's very easy to retrieve informations about the most frequent Terms in the
index and the frequency of a given Term.
(e.g. using  IndexReader.termDocs(Term term))

But there's currently no method in the API to get the frequency of a
PhraseQuery. There was a discussion about that particular point a long time
ago (see
http://www.mail-archive.com/lucene-dev@jakarta.apache.org/msg00101.html).
This is also in the list of future improvments
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=18932.

I implemented it, but in a old version of Lucene. Because of the
modifications made in the Scoring recently it has  to be redone. The problem
is that computing the frequency of a PhraseQuery takes a lot of time (in a
regular search as well).

If you don't need frequencies for PhraseQueries - Lucene is a good
solution.Otherwise changes must be done in Lucene.

I'll try to take a look at it soon and propose a patch to the core sources.


----- Original Message -----
From: "Andy Nauli" <andy.nauli@utoronto.ca>
To: <lucene-user@jakarta.apache.org>
Sent: Friday, May 02, 2003 11:33 AM
Subject: lucene for statistical analysis


> hello,
>
> I am just starting looking at lucene for my project.
>
> Before I proceed, I would like to know if it's a good idea to use lucene
for
> creating index and also performing statistical analysis on the index (e.g.
> most frequent words, number of appearance of certain index token, etc.)
>
> if lucene is not a good candidate, can anyone suggest an alternatives ?
>
> thanks in advance
> andy
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message