lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mark Miller <markrmil...@gmail.com>
Subject Re: Performance, yet again
Date Tue, 02 Sep 2008 17:44:44 GMT
Andre Rubin wrote:
> On Tue, Sep 2, 2008 at 10:16 AM, Mark Miller <markrmiller@gmail.com> wrote:
>
>   
>> Andre Rubin wrote:
>>
>>     
>>> Hi all,
>>>
>>> Most of our queries are very simple, of the type:
>>>
>>> Query query = new PrefixQuery(new Term(LABEL_FIELD, prefix));
>>> Hits hits = searcher.search(query, new Sort(new SortField(LABEL_FIELD)))
>>>
>>>
>>>       
>> You might want to check out solrs ConstantScorePrefixQuery and compare
>> performance.
>>     
>
>
> I'm not familiar with Solrs. It is not standard Lucene, is it?
>   
Sorry about that. Solr is a search server that is a sub project of the 
Lucene Apache project. You can just copy the Query from solrs source 
code and use it with Lucene.  ConstantScorePrefixQuery may be faster for 
you than PrefixQuery and it doesn't have MaxClause exceptions issues 
when your prefix matches too many terms in the index. Please report back 
the speed difference if you can.

http://lucene.apache.org/solr/
>
>   
>>  Which sometimes result in 10, 20, sometimes 40 thousand hits.
>>     
>>> I get good performance if hits.length is 20.000 or less (less than 0.5
>>> seconds). I However, if it is 40.000 or more, querying takes over a
>>> second,
>>> up to 2.5 seconds. Point in check here is that this solution is not
>>> scaling.
>>> Any ideas I can try?
>>>
>>> I already exhausted the ideas from http://wiki.apache.org/lucene
>>> -java/ImproveSearchingSpeed
>>>
>>> I was reading about TopDocs and TopFieldDocs. Is this search method (using
>>> TopDocs) preferred over Hits? Also, there's no constructor for them
>>> without
>>> a Filter, can I just pass null?
>>>
>>>
>>>       
>> It is preferred over Hits. Hits has been deprecated and you should really
>> migrate away from it.
>>     
>
>
> I was trying, before, to use it, but it doesn't seem as straightfoward as
> Hits. Is there an example code, somewhere?
>   
I think work was done on this when Hits was deprecated. Anyone know?
>
>   
>>  Is it possible to pre-sort the index, so I don't have to every time I
>>     
>>> perform a query?
>>>
>>> Any other ideas?
>>>
>>>
>>>       
>> I think in general, sorting and prefix query can be slower operations in
>> Lucene (though sorting is generally pretty fast after the field caches are
>> loaded). You might try the first couple suggestions there though, and others
>> may fill on other steps you can take as well.
>>
>> - Mark
>>
>>     
>
>
> Thanks, Mark.
>
>
> Andre
>
>   


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message