lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ian Lea <ian....@gmail.com>
Subject Re: number of hits of pages containing two terms
Date Tue, 17 Mar 2009 11:10:03 GMT
This is all getting very complicated!

Adrian - have you looked any further into why your original two term
query was too slow?  My experience is that simple queries are usually
extremely fast.  Standard questions: have you warmed up the searcher?
How large is the index?  How many occurrences of your first or second
terms?  Anything odd about them?  See also
http://wiki.apache.org/lucene-java/ImproveSearchingSpeed


--
Ian.


On Tue, Mar 17, 2009 at 11:00 AM, Michael McCandless
<lucene@mikemccandless.com> wrote:
>
> Adrian Dimulescu wrote:
>
>> Thank you.
>>
>> I suppose the solution for this is to not create an index but to store
>> co-occurence frequencies at Analyzer level.
>
> I don't understand how this would address the "docFreq does
> not reflect deletions".
>
> You can use the shingles analyzer (under contrib/analyzers)
> to create and index bigrams.  (But the docFreq would still not
> reflect deletions).
>
>> Adrian.
>>
>> On Mon, Mar 16, 2009 at 11:37 AM, Michael McCandless <
>> lucene@mikemccandless.com> wrote:
>>
>>>
>>> Be careful: docFreq does not take deletions into account.
>>>
>
> Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message