lucene-dev mailing list archives

From "Martijn van Groningen (Commented) (JIRA)" <>
Subject [jira] [Commented] (LUCENE-3602) Add join query to Lucene
Date Mon, 12 Dec 2011 16:13:30 GMT


Martijn van Groningen commented on LUCENE-3602:

bq. Maybe rename actualQuery to fromQuery?
Yes, fromQuery makes more sense than actualQuery.

Why preComputedFromDocs...? Like if you were to cache something,
wouldn't you want to cache the toSearcher's bitset instead?
This covers the case where your from query is cached but the toSearcher's
bitset isn't, which is a likely scenario.
Caching the toSearcher's bitset is of course better when
possible, but that should happen outside the JoinQuery, right?

bq. Maybe rename JoinQueryWeight.joinResult to topLevelJoinResult,
I agree, that is a much more descriptive name.

I wonder if we could make this a Filter instead, somehow? Ie, at
its core it converts a top-level bitset in the fromSearcher doc
space into the joined bitset in the toSearcher doc space. It
could even maybe just be a static method taking in fromBitset and
returning toBitset, which could operate per-segment on the
toSearcher side? (Separately: I wonder if JoinQuery should do
something with the scores of the fromQuery....? Not right now but
maybe later...).
It just matches docs from the from side to the to side. That is all... So a static method / filter
should be able to do the job.
I'm not sure, but if it stays a query it might one day be possible to express the join in
the Lucene query language?
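
As a rough illustration of the static-method idea above, here is a minimal sketch in plain Java. It does not use the Lucene API and is not taken from the patch: the class name BitSetJoinSketch, the join(...) signature, and the String[] arrays standing in for each searcher's join field are all assumptions made for the example.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of the LUCENE-3602 patch.
final class BitSetJoinSketch {

    /**
     * Converts a bitset in the from-side doc space into a bitset in the
     * to-side doc space.
     *
     * fromValues[doc] is the join-field value of a from-side doc (null if none),
     * toValues[doc] is the join-field value of a to-side doc (null if none).
     * These arrays are stand-ins for whatever the index would provide.
     */
    static BitSet join(BitSet fromDocs, String[] fromValues, String[] toValues) {
        // Step 1: collect the join terms of all matching from-side docs.
        Set<String> joinTerms = new HashSet<>();
        for (int doc = fromDocs.nextSetBit(0); doc >= 0; doc = fromDocs.nextSetBit(doc + 1)) {
            String value = fromValues[doc];
            if (value != null) {
                joinTerms.add(value);
            }
        }

        // Step 2: mark every to-side doc whose join value is one of those terms.
        BitSet toDocs = new BitSet(toValues.length);
        for (int doc = 0; doc < toValues.length; doc++) {
            String value = toValues[doc];
            if (value != null && joinTerms.contains(value)) {
                toDocs.set(doc);
            }
        }
        return toDocs;
    }
}
```

Since the second loop only touches to-side values, it could presumably be run once per segment on the toSearcher side, with the set of join terms shared across segments.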

bq. Maybe reword that to state that all joined to/from docs must reside in the same shard?

I wonder if we could use DocTermOrds instead? (Or,
FieldCache.DocTermsIndex or DocValues.BYTES_*, if we know
fromSearcher.fromField is single-valued). This way we uninvert
once (on init), and then doing the join should be much faster
since for each fromDocID we can look up the term(s) to join on.
I really like that idea! This already crossed my mind a few days ago
as an improvement to speed up the joining. It would be nice if the user could
choose between a faster variant that uses more RAM and a slower variant that uses less.
I think we can just make two concrete JoinQuery implementations that each have a different
joinResult(...) implementation.
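
To make the trade-off concrete, here is a minimal sketch of the two ways to collect the join terms, written in plain Java. It is only an illustration under assumptions: a Map of term to doc ids stands in for the inverted index, a list of per-doc term lists stands in for the uninverted structure (DocTermOrds itself works with term ordinals), and every name below is hypothetical rather than part of the patch.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; plain collections stand in for Lucene's index structures.
final class JoinTermCollectorSketch {

    // Less RAM, slower: walk every term's postings and test them against fromDocs.
    static Set<String> collectByTermScan(Map<String, int[]> postings, BitSet fromDocs) {
        Set<String> terms = new TreeSet<>();
        for (Map.Entry<String, int[]> entry : postings.entrySet()) {
            for (int doc : entry.getValue()) {
                if (fromDocs.get(doc)) {
                    terms.add(entry.getKey());
                    break; // one matching doc is enough for this term
                }
            }
        }
        return terms;
    }

    // More RAM: uninvert once (on init) into a doc -> terms table.
    static List<List<String>> uninvert(Map<String, int[]> postings, int maxDoc) {
        List<List<String>> docToTerms = new ArrayList<>(maxDoc);
        for (int doc = 0; doc < maxDoc; doc++) {
            docToTerms.add(new ArrayList<>());
        }
        for (Map.Entry<String, int[]> entry : postings.entrySet()) {
            for (int doc : entry.getValue()) {
                docToTerms.get(doc).add(entry.getKey());
            }
        }
        return docToTerms;
    }

    // Faster per join: look up the term(s) of each matching from doc directly.
    // Assumes every set bit in fromDocs is below the maxDoc used in uninvert.
    static Set<String> collectByDocLookup(List<List<String>> docToTerms, BitSet fromDocs) {
        Set<String> terms = new TreeSet<>();
        for (int doc = fromDocs.nextSetBit(0); doc >= 0; doc = fromDocs.nextSetBit(doc + 1)) {
            terms.addAll(docToTerms.get(doc));
        }
        return terms;
    }
}
```

The term scan costs time proportional to the whole field on every join, while the lookup variant pays for the uninverted table up front and then only touches the matching from docs. Two joinResult(...) implementations could differ in roughly this way.
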
> Add join query to Lucene
> ------------------------
>                 Key: LUCENE-3602
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: modules/join
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3602.patch, LUCENE-3602.patch
> Solr has had a (pseudo) join query for a while now. I think this should also be available in
