jackrabbit-users mailing list archives

From Christian Stocker <christian.stoc...@liip.ch>
Subject Re: Strange Search Performance problem with OR
Date Tue, 27 Mar 2012 07:55:36 GMT
Hi

On 27.03.12 09:49, David Buchmann wrote:
> sorry, my bad. did not read correctly.
> you do have the parentheses so you did what you wanted to do.
> 
> looks like lucene/jackrabbit combine the 2 datasets first and filter
> later...
> 
> what if you try
> 
> 
> SELECT * FROM [own:unstructured] AS data
> WHERE
>     data.guid = 'J7B1X' AND ISDESCENDANTNODE(data, '/article')
>   OR
>     data.guid = 'J7B1X' AND ISDESCENDANTNODE(data, '/import/article')
> ORDER BY firstImportDate DESC

I tried that and I tried it again now. Same response time as the
original query.
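
Until the combined query is fast, one workaround is to keep running the two fast single-path queries and merge the rows on the client. A minimal sketch in Python, not tied to any particular JCR client: `run_query` is a hypothetical callable that takes a JCR-SQL2 string and yields rows as dicts with `path` and `firstImportDate` keys, and `find_by_guid` is an illustrative name.

```python
from typing import Callable, Dict, Iterable, List

# Template for the two fast single-path queries from the original post.
# Note: guid is interpolated without escaping; fine for a sketch only.
BASE = (
    "SELECT * FROM [own:unstructured] AS data "
    "WHERE data.guid = '{guid}' "
    "AND ISDESCENDANTNODE(data, '{path}') "
    "ORDER BY firstImportDate DESC"
)
PATHS = ["/article", "/import/article"]


def find_by_guid(
    run_query: Callable[[str], Iterable[Dict]], guid: str
) -> List[Dict]:
    """Run one query per path and merge the rows client-side.

    run_query is hypothetical: it stands in for whatever executes a
    JCR-SQL2 statement and returns rows in your setup.
    """
    seen: Dict[str, Dict] = {}
    for path in PATHS:
        for row in run_query(BASE.format(guid=guid, path=path)):
            # Drop duplicates by node path, keeping the first occurrence.
            seen.setdefault(row["path"], row)
    # Re-apply the ORDER BY across the merged rows.
    return sorted(
        seen.values(), key=lambda r: r["firstImportDate"], reverse=True
    )
```

This costs two round trips instead of one, which should still be cheap if each single-path query really stays around 10ms.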

Any hints from someone who knows the internal workings of jackrabbit/lucene?

chregu

> 
> if this is fast, then the jackrabbit query engine is not very clever...
> 
> cheers,david
> 
> 
> On 27.03.2012 09:10, David Buchmann wrote:
>> i think the 2 queries are not equivalent. the first one is equivalent to
> 
>> ...
>> WHERE data.guid = 'J7B1X'
>>   AND (ISDESCENDANTNODE(data, '/article')
> 
>> plus
> 
>> WHERE
>>  ISDESCENDANTNODE(data, '/import/article')
> 
>> (if you want the data.guid = ... to apply to both, you need parentheses)
> 
>> but if /import/article is almost empty, i still don't see why the
>> combined query should take so long unless jackrabbit/lucene are doing
>> something stupid.
> 
>> cheers,david
> 
>> On 26.03.2012 22:28, Christian Stocker wrote:
>>> Hi
> 
>>> We have the following search query
> 
> 
>>> SELECT * FROM [own:unstructured] AS data WHERE data.guid = 'J7B1X'
>>> 		AND (ISDESCENDANTNODE(data, '/article')
>>> 		OR ISDESCENDANTNODE(data, '/import/article')
>>> 		)
>>> 		ORDER BY firstImportDate DESC
> 
> 
>>> This query can take quite some time (up to 3 seconds, and it gets
>>> slower the more data we have). In /article there are potentially a lot
>>> of nodes, in /import/article usually almost none.
> 
> 
>>> If we now separate the query into 2:
> 
>>> SELECT * FROM [own:unstructured] AS data WHERE data.guid = 'J7B1X'
>>> 		AND ISDESCENDANTNODE(data, '/article')
>>> 		ORDER BY firstImportDate DESC
> 
>>> and
> 
>>> SELECT * FROM [own:unstructured] AS data WHERE data.guid = 'J7B1X'
>>> 		AND ISDESCENDANTNODE(data, '/import/article')
>>> 		ORDER BY firstImportDate DESC
> 
>>> Both queries take approx. 10ms (and return 0 or 1 results, more is not
>>> possible). So quite fast.
> 
>>> Can anyone explain to me why that is and how we could rewrite the query
>>> to make it fast as a single query as well?
> 
>>> Thanks
> 
>>> chregu
> 
> 
