lucene-java-user mailing list archives

From "Andre Rubin" <andre.ru...@gmail.com>
Subject Re: Performance, yet again
Date Tue, 02 Sep 2008 18:18:53 GMT
I've tested ConstantScorePrefixQuery and it hit the nail on the head. It's now
mind-bogglingly fast! Even a query with 200,000 matches took under 0.5
seconds!

Thanks! :))


Andre


On Tue, Sep 2, 2008 at 10:44 AM, Mark Miller <markrmiller@gmail.com> wrote:

> Andre Rubin wrote:
>
>> On Tue, Sep 2, 2008 at 10:16 AM, Mark Miller <markrmiller@gmail.com>
>> wrote:
>>
>>
>>
>>> Andre Rubin wrote:
>>>
>>>
>>>
>>>> Hi all,
>>>>
>>>> Most of our queries are very simple, of the type:
>>>>
>>>> Query query = new PrefixQuery(new Term(LABEL_FIELD, prefix));
>>>> Hits hits = searcher.search(query, new Sort(new SortField(LABEL_FIELD)));
>>>>
>>>>
>>>>
>>>>
>>> You might want to check out Solr's ConstantScorePrefixQuery and compare
>>> performance.
>>>
>>>
>>
>>
>> I'm not familiar with Solr. It is not part of standard Lucene, is it?
>>
>>
> Sorry about that. Solr is a search server that is a subproject of the
> Apache Lucene project. You can just copy the Query from Solr's source code
> and use it with Lucene. ConstantScorePrefixQuery may be faster for you than
> PrefixQuery, and it doesn't run into TooManyClauses exceptions when your
> prefix matches too many terms in the index. Please report back the speed
> difference if you can.
>
> http://lucene.apache.org/solr/
>
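For the archives, a rough sketch of why the constant-score approach can help. As I understand it, PrefixQuery expands into one scored clause per matching term, which is where the clause limit comes from, while the constant-score variant just walks the matching terms and marks their documents in a bit set. The code below is a toy model in plain Java, not Lucene or Solr code: the class name, the method names, and the TreeMap standing in for the term dictionary are all made up for illustration.

```java
import java.util.BitSet;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model only: none of these names are Lucene or Solr classes.
class PrefixMatchSketch {

    // Slice of a sorted term dictionary holding every term that starts with prefix.
    // Assumes no term has the character U+FFFF immediately after the prefix.
    static SortedMap<String, int[]> termsWithPrefix(TreeMap<String, int[]> dict, String prefix) {
        return dict.subMap(prefix, prefix + Character.MAX_VALUE);
    }

    // Constant-score style: OR the postings of all matching terms into one bit set.
    // No per-term scoring, and no limit on how many terms may match.
    static BitSet matchIntoBitSet(TreeMap<String, int[]> dict, String prefix) {
        BitSet docs = new BitSet();
        for (int[] postings : termsWithPrefix(dict, prefix).values()) {
            for (int doc : postings) {
                docs.set(doc);
            }
        }
        return docs;
    }

    // Rewrite style: one clause per matching term, failing past a clause limit.
    static int clausesNeeded(TreeMap<String, int[]> dict, String prefix, int maxClauses) {
        int clauses = termsWithPrefix(dict, prefix).size();
        if (clauses > maxClauses) {
            throw new IllegalStateException("too many clauses: " + clauses);
        }
        return clauses;
    }

    public static void main(String[] args) {
        // Hypothetical data: term -> ids of the documents containing it.
        TreeMap<String, int[]> dict = new TreeMap<String, int[]>();
        dict.put("apple", new int[] {0, 3});
        dict.put("application", new int[] {3, 5});
        dict.put("banana", new int[] {1});
        System.out.println(matchIntoBitSet(dict, "app"));     // docs 0, 3 and 5
        System.out.println(clausesNeeded(dict, "app", 1024)); // 2 matching terms
    }
}
```

With a prefix that matches thousands of terms, the first method still does one pass over the postings, while the second would trip the limit.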
>>
>>
>>
>>>> Which sometimes results in 10, 20, sometimes 40 thousand hits.
>>>>
>>>> I get good performance if hits.length() is 20,000 or less (less than 0.5
>>>> seconds). However, if it is 40,000 or more, querying takes over a
>>>> second, up to 2.5 seconds. The point is that this solution does not
>>>> scale. Any ideas I can try?
>>>>
>>>> I already exhausted the ideas from
>>>> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
>>>>
>>>> I was reading about TopDocs and TopFieldDocs. Is this search method
>>>> (using TopDocs) preferred over Hits? Also, there's no constructor for
>>>> them without a Filter; can I just pass null?
>>>>
>>>>
>>>>
>>>>
>>> It is preferred over Hits. Hits has been deprecated and you should really
>>> migrate away from it.
>>>
>>>
>>
>>
>> I tried to use it before, but it doesn't seem as straightforward as
>> Hits. Is there example code somewhere?
>>
>>
> I think work was done on this when Hits was deprecated. Anyone know?
>
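Not real Lucene example code, but a rough sketch of the idea behind a top-N search, in case it helps anyone reading later: keep only the n best documents in a bounded heap while iterating over the matches, and read each sort value from an array that was loaded once up front (a stand-in for the field cache). Everything here is hypothetical plain Java; TopNSketch, topN, and the labels array are made-up names, not the TopDocs or TopFieldDocs API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Toy model only: not the Lucene TopDocs/TopFieldDocs API.
class TopNSketch {

    // Returns the ids of the n matching docs whose labels sort first, in label order.
    // labels[doc] plays the role of a per-document sort value cached once up front.
    static List<Integer> topN(int[] matchingDocs, final String[] labels, int n) {
        // Order documents by cached label, breaking ties by doc id.
        final Comparator<Integer> byLabel = new Comparator<Integer>() {
            public int compare(Integer a, Integer b) {
                int c = labels[a].compareTo(labels[b]);
                return c != 0 ? c : a.compareTo(b);
            }
        };
        // Reversed order puts the worst doc kept so far at the head of the heap.
        PriorityQueue<Integer> heap =
                new PriorityQueue<Integer>(n + 1, Collections.reverseOrder(byLabel));
        for (int doc : matchingDocs) {
            heap.add(doc);
            if (heap.size() > n) {
                heap.poll(); // evict the doc that sorts last
            }
        }
        List<Integer> result = new ArrayList<Integer>(heap);
        Collections.sort(result, byLabel);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical data: label of doc 0, doc 1, and so on.
        String[] labels = {"pear", "apple", "fig", "banana", "cherry"};
        int[] matches = {0, 1, 2, 3, 4};
        System.out.println(topN(matches, labels, 3)); // docs for apple, banana, cherry
    }
}
```

The heap never holds more than n + 1 entries, so memory and sorting cost depend on how many results you ask for rather than on how many documents match.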
>>
>>
>>
>>>> Is it possible to pre-sort the index, so I don't have to sort every time
>>>> I perform a query?
>>>>
>>>> Any other ideas?
>>>>
>>>>
>>>>
>>>>
>>> I think in general, sorting and prefix queries can be slower operations in
>>> Lucene (though sorting is generally pretty fast after the field caches are
>>> loaded). You might try the first couple of suggestions there though, and
>>> others may fill in other steps you can take as well.
>>>
>>> - Mark
>>>
>>>
>>>
>>
>>
>> Thanks, Mark.
>>
>>
>> Andre
>>
>>
>>
>
>
>
