lucene-java-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: Performance measurements
Date Wed, 24 Jul 2013 22:05:52 GMT
I think I've exhausted my expertise in Lucene filters, but I think you can 
wrap a query with a filter and also wrap a filter with a query. So, for 
IndexSearcher.search, you could take a filter and wrap it with 
ConstantScoreQuery. So, if a BooleanQuery got wrapped as a filter, it could 
be wrapped as a CSQ for search so that no scoring would be done.
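
The idea discussed in this thread, evaluating the OR clause as an unscored
filter and intersecting it with the required term, can be sketched in plain
Java. This is a conceptual illustration only, not Lucene's API or its actual
implementation: the class and method names (PureRetrievalSketch, unionFilter,
matchRequiredAndFilter), the array-based postings, and the sample doc ids are
all hypothetical.

```java
import java.util.BitSet;

// Conceptual sketch only: plain Java, not Lucene's API.
// Postings are modeled as arrays of document ids (an assumption for illustration).
final class PureRetrievalSketch {

    // Build an unscored "filter": the union of the doc ids of every term in the disjunction.
    static BitSet unionFilter(int[][] postingsPerTerm, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int[] postings : postingsPerTerm) {
            for (int doc : postings) {
                bits.set(doc);
            }
        }
        return bits;
    }

    // Keep the docs of the required term that are also set in the filter; no score is computed.
    static BitSet matchRequiredAndFilter(int[] requiredPostings, BitSet filter) {
        BitSet hits = new BitSet();
        for (int doc : requiredPostings) {
            if (filter.get(doc)) {
                hits.set(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] nameSriram = {1, 4, 7, 9};          // hypothetical docs matching name:sriram
        int[][] friends = {{2, 4}, {7, 8}, {3}};  // hypothetical docs matching friend:1, friend:2, friend:3
        BitSet filter = unionFilter(friends, 10);
        BitSet hits = matchRequiredAndFilter(nameSriram, filter);
        System.out.println(hits); // prints {4, 7}
    }
}
```

The point of the sketch is that membership in the bit set is a yes/no test, so
the cost of the disjunction does not include any per-document score
computation.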

-- Jack Krupansky

-----Original Message----- 
From: Sriram Sankar
Sent: Wednesday, July 24, 2013 3:58 PM
To: java-user@lucene.apache.org
Subject: Re: Performance measurements

On Wed, Jul 24, 2013 at 10:24 AM, Jack Krupansky 
<jack@basetechnology.com> wrote:

> Unicorn sounds like it was optimized for graph search. Specialized search
> engines can in fact beat out generalized search engines for specific use
> cases.
>

Yes and no (I worked on it).  Yes, there are many aspects of Unicorn that
have been optimized for graph search.  But the tests I am running have very
little to do with those optimizations.  I am still learning about Lucene
and have suspected that the scoring framework (which has to be very general)
may be contributing to the performance issues.  With Unicorn, we made a
decision to do all scoring after retrieval and not during retrieval.


>
> Scoring has been a major focus of Lucene. Non-scored filters are also
> available, but the query parsers are focused (exclusively) on 
> scored-search.
>

When you say "filter" do you mean a step performed after retrieval?  Or is
it yet another retrieval operation?


>
> As Adrien indicates, try using raw Lucene filters and you should get much
> better results. Whether even that will compete with a use-case-specific
> (graph) search engine remains to be seen.


Thanks (I will study this more).

Sriram.



>
>
> -- Jack Krupansky
>
> -----Original Message----- From: Sriram Sankar
> Sent: Wednesday, July 24, 2013 1:03 PM
> To: java-user@lucene.apache.org
> Subject: Re: Performance measurements
>
>
> No I do not need scoring.  This is a pure retrieval query - which matches
> what we used to do with Unicorn in Facebook - something like:
>
> (name:sriram AND (friend:1 OR friend:2 ...))
>
> This automatically gives us second degree.
>
> With Unicorn, we would always get sub-millisecond performance even for
> n>500.
>
> Should I assume that Lucene is that much worse - or is it that this use
> case has not been optimized?
>
> Sriram.
>
>
>
> On Wed, Jul 24, 2013 at 9:59 AM, Adrien Grand <jpountz@gmail.com> wrote:
>
>  Hi,
>>
>> On Wed, Jul 24, 2013 at 6:11 PM, Sriram Sankar <sankar@gmail.com> wrote:
>> > termA AND (termB1 OR termB2 OR ... OR termBn)
>>
>> Maybe this comment is not appropriate for your use-case, but if you
>> don't actually need scoring from the disjunction on the right of the
>> query, a TermsFilter will be faster when n gets large.
>>
>> --
>> Adrien
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>
>
>
> 



