lucene-solr-user mailing list archives

From Mikhail Khludnev <mkhlud...@griddynamics.com>
Subject Re: Performance of cross join vs block join
Date Thu, 11 Jul 2013 11:25:44 GMT
Mihaela,

To me it seems reasonable that a single-core join takes the same time as a
cross-core one. I just can't see what gain could be obtained in the former
case.
I'm hardly able to comment on the join code; I looked into it, and it's not
trivial, at least. Block join doesn't need to obtain parentId term
values/numbers and look up parents by them. Both of these actions are
expensive. Also, block join works as an iterator, whereas join needs to
allocate memory for a parents bitset and populate it out of order, which
hurts scalability.
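
To make the difference concrete, here is a toy Python sketch. It is not the
actual Solr/Lucene code: the doc layout, the field names and both function
names (query_time_join, block_join) are made up for illustration, and a
sorted list of parent doc ids stands in for the cached parents bitset.

```python
from bisect import bisect_left

# Toy index in block order: the children of a parent come right before it.
docs = [
    {"type": "child", "parent_id": "p1", "color": "red"},    # doc 0
    {"type": "child", "parent_id": "p1", "color": "blue"},   # doc 1
    {"type": "parent", "id": "p1"},                          # doc 2
    {"type": "child", "parent_id": "p2", "color": "green"},  # doc 3
    {"type": "parent", "id": "p2"},                          # doc 4
    {"type": "child", "parent_id": "p3", "color": "red"},    # doc 5
    {"type": "child", "parent_id": "p3", "color": "red"},    # doc 6
    {"type": "parent", "id": "p3"},                          # doc 7
]

def query_time_join(docs, child_matches):
    """Join by key: read term values, look parents up, fill a bitset."""
    # 1. Obtain the parentId value of every matching child.
    keys = {docs[d]["parent_id"] for d in child_matches}
    # 2. Look parents up by those values (a term lookup in a real index).
    id_to_doc = {d["id"]: i for i, d in enumerate(docs) if d["type"] == "parent"}
    # 3. Populate a bitset sized to the whole index, in arbitrary key order.
    bits = [False] * len(docs)
    for key in keys:
        bits[id_to_doc[key]] = True
    return [i for i, hit in enumerate(bits) if hit]

def block_join(docs, child_matches):
    """Block join: stream parents in doc order, no key lookup at all."""
    # Assumption: the parent doc ids are known up front and sorted.
    parent_docs = [i for i, d in enumerate(docs) if d["type"] == "parent"]
    last_parent = -1
    for child in child_matches:  # child hits arrive in ascending doc order
        if child < last_parent:
            continue  # same block as the previous hit, parent already emitted
        # The parent is simply the next parent doc after the child.
        last_parent = parent_docs[bisect_left(parent_docs, child)]
        yield last_parent

red_children = [i for i, d in enumerate(docs) if d.get("color") == "red"]
print(query_time_join(docs, red_children))
print(list(block_join(docs, red_children)))
```

Both functions return the same parents; the point is only that the second
one never touches parentId values and never materializes a bitset.
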
Also, in None scoring mode BJQ doesn't need to walk through all children,
only the first hit per parent. Another nice feature is the 'both-side
leapfrog': if a highly restrictive filter/query intersects with the BJQ, it
can skip many parents and children as well. That's not possible in Join,
which has a fairly 'full-scan' nature.
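
The leapfrog idea can be sketched roughly as follows. Again this is a toy
Python model under my own assumptions, not the BJQ implementation: leapfrog
is a hypothetical function, all three inputs are plain sorted lists of doc
ids, every child has a parent after it, and binary search stands in for the
iterators' advance() calls.

```python
from bisect import bisect_left, bisect_right

def leapfrog(parent_docs, child_hits, parent_filter):
    """Intersect a block join with a restrictive parent filter.

    parent_docs   -- ascending doc ids of all parents
    child_hits    -- ascending doc ids of children matching the child query
    parent_filter -- ascending doc ids of parents accepted by the filter
    Returns (matching parents, number of child hits actually examined).
    """
    results, examined = [], 0
    ci = fi = 0
    while ci < len(child_hits) and fi < len(parent_filter):
        examined += 1
        # Parent of the current child hit: the next parent doc after it.
        parent = parent_docs[bisect_left(parent_docs, child_hits[ci])]
        candidate = parent_filter[fi]
        if parent == candidate:
            results.append(parent)
            # None scoring mode: one hit is enough, skip the rest of the block.
            ci = bisect_left(child_hits, parent, ci)
            fi += 1
        elif parent < candidate:
            # Filter is ahead: jump the children to the candidate's block,
            # which starts right after the parent preceding the candidate.
            pi = bisect_left(parent_docs, candidate)
            prev_parent = parent_docs[pi - 1] if pi > 0 else -1
            ci = bisect_right(child_hits, prev_parent, ci)
        else:
            # Children are ahead: jump the filter to this parent or beyond.
            fi = bisect_left(parent_filter, parent, fi)
    return results, examined

# 1000 blocks of 9 children + 1 parent; every child matches the child query,
# but the filter accepts only two parents.
parent_docs = list(range(9, 10000, 10))
child_hits = [d for d in range(10000) if d % 10 != 9]
matches, examined = leapfrog(parent_docs, child_hits, [4999, 9999])
print(matches, examined, len(child_hits))
```

Compare the number of examined child hits with len(child_hits): a join that
has to visit every matching child cannot take this shortcut.
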
The main performance factor for Join is the number of child docs.
I'm not sure I got all your questions; please specify them in more detail
if something is still unclear.
Have you seen my benchmark
http://blog.griddynamics.com/2012/08/block-join-query-performs.html ?



On Thu, Jul 11, 2013 at 1:52 PM, mihaela olteanu <mihaela_ol@yahoo.com> wrote:

> Hello,
>
> Does anyone know about some measurements in terms of performance for cross
> joins compared to joins inside a single index?
>
> Is a join inside a single index that stores all documents of various
> types (from the parent table or from child tables) with a discriminator
> field faster than a cross join (where each document type resides in its
> own index)?
>
> I have performed some tests, but to me it seems that having a join in a
> single (bigger) index does not add much speed improvement compared to
> cross joins.
>
> Why would a block join be faster than a cross join if this is the case?
> What are the variables that count when trying to improve the query
> execution time?
>
> Thanks!
> Mihaela




-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
 <mkhludnev@griddynamics.com>
