lucene-java-user mailing list archives

From "Andy Nauli" <andy.na...@utoronto.ca>
Subject Re: lucene for statistical analysis
Date Fri, 02 May 2003 17:29:24 GMT
Thanks Julien and Leo,

Based on your responses, I think Lucene should
provide what I need.

I don't think I will need to store the statistical
data with the index. As long as I can compute it,
that will be fine.

Thanks again
Andy
----- Original Message ----- 
From: "Julien Nioche" <Julien.Nioche@lingway.com>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Sent: Friday, May 02, 2003 5:03 AM
Subject: Re: lucene for statistical analysis


> Creating an index for Lucene is indeed a good idea ;-)
>
> It's very easy to retrieve information about the most frequent Terms in
> the index and the frequency of a given Term
> (e.g. using IndexReader.termDocs(Term term)).
>
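
To make the two statistics above concrete, here is a minimal sketch in plain Java. It deliberately does not use the Lucene API: TermStatsSketch and its methods are hypothetical names invented for illustration, and the "index" is just a list of strings held in memory. It shows what "frequency of a given term" (total occurrences, and number of documents containing the term) and "most frequent terms" mean. In Lucene itself the same numbers would come from the index, for example via the IndexReader.termDocs(Term) call mentioned above.

```java
import java.util.*;

// Hypothetical helper for illustration only; this is NOT Lucene code.
class TermStatsSketch {

    // Split text into lowercase alphanumeric tokens (a stand-in for an analyzer).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Total number of occurrences of each term across all documents.
    static Map<String, Integer> totalFreq(List<String> docs) {
        Map<String, Integer> tf = new HashMap<>();
        for (String doc : docs) {
            for (String t : tokenize(doc)) tf.merge(t, 1, Integer::sum);
        }
        return tf;
    }

    // Number of documents that contain each term at least once.
    static Map<String, Integer> docFreq(List<String> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (String doc : docs) {
            for (String t : new HashSet<>(tokenize(doc))) df.merge(t, 1, Integer::sum);
        }
        return df;
    }

    // The n terms with the highest counts; ties are broken alphabetically.
    static List<String> mostFrequent(Map<String, Integer> freq, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(freq.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) top.add(entries.get(i).getKey());
        return top;
    }
}
```

Calling TermStatsSketch.mostFrequent(TermStatsSketch.totalFreq(docs), 10) on a list of document strings would give the ten most frequent tokens. A real index avoids rescanning the text because these counts are stored per term.
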
> But there's currently no method in the API to get the frequency of a
> PhraseQuery. There was a discussion about that particular point a long
> time ago (see
> http://www.mail-archive.com/lucene-dev@jakarta.apache.org/msg00101.html).
> This is also in the list of future improvements:
> http://nagoya.apache.org/bugzilla/show_bug.cgi?id=18932.
>
> I implemented it, but in an old version of Lucene. Because of the recent
> modifications to the scoring code, it has to be redone. The problem
> is that computing the frequency of a PhraseQuery takes a lot of time (in a
> regular search as well).
>
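
As a rough illustration of why phrase frequency costs more than single-term frequency, here is a small plain-Java sketch. It is not Lucene code and not the implementation described above: PhraseFreqSketch, buildIndex and phraseFreq are hypothetical names, and the positional index is a simple in-memory map. The point is that a term count can be read from one posting list, whereas a phrase count has to compare token positions across the posting lists of every term in the phrase.

```java
import java.util.*;

// Hypothetical positional index for illustration only; this is NOT Lucene code.
class PhraseFreqSketch {

    // Build term -> (document id -> token positions of that term in the document).
    static Map<String, Map<Integer, List<Integer>>> buildIndex(List<String> docs) {
        Map<String, Map<Integer, List<Integer>>> index = new HashMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            int pos = 0;
            for (String t : docs.get(docId).toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
                if (t.isEmpty()) continue;
                index.computeIfAbsent(t, k -> new HashMap<>())
                     .computeIfAbsent(docId, k -> new ArrayList<>())
                     .add(pos++);
            }
        }
        return index;
    }

    // Count exact occurrences of the phrase (lowercase terms, in order, adjacent).
    static int phraseFreq(Map<String, Map<Integer, List<Integer>>> index, String... phrase) {
        if (phrase.length == 0) return 0;
        Map<Integer, List<Integer>> first = index.get(phrase[0]);
        if (first == null) return 0;
        int count = 0;
        for (Map.Entry<Integer, List<Integer>> e : first.entrySet()) {
            for (int start : e.getValue()) {
                boolean match = true;
                // Every later term must appear at the expected offset in the same document.
                for (int i = 1; i < phrase.length && match; i++) {
                    Map<Integer, List<Integer>> postings = index.get(phrase[i]);
                    List<Integer> positions = postings == null ? null : postings.get(e.getKey());
                    match = positions != null && positions.contains(start + i);
                }
                if (match) count++;
            }
        }
        return count;
    }
}
```

For example, PhraseFreqSketch.phraseFreq(index, "new", "york") walks every position of "new" and checks whether "york" follows it, which is more work than looking up a stored count for a single term.
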
> If you don't need frequencies for PhraseQueries, Lucene is a good
> solution. Otherwise, changes must be made to Lucene.
>
> I'll try to take a look at it soon and propose a patch to the core
> sources.
>
>
> ----- Original Message -----
> From: "Andy Nauli" <andy.nauli@utoronto.ca>
> To: <lucene-user@jakarta.apache.org>
> Sent: Friday, May 02, 2003 11:33 AM
> Subject: lucene for statistical analysis
>
>
> > hello,
> >
> > I am just starting to look at Lucene for my project.
> >
> > Before I proceed, I would like to know if it's a good idea to use Lucene
> > for creating an index and also performing statistical analysis on the
> > index (e.g. most frequent words, number of appearances of a certain
> > index token, etc.)
> >
> > If Lucene is not a good candidate, can anyone suggest an alternative?
> >
> > thanks in advance
> > andy
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> >
> >
>
>

