lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Created] (LUCENE-3171) BlockJoinQuery/Collector
Date Sat, 04 Jun 2011 16:18:47 GMT
BlockJoinQuery/Collector
------------------------

                 Key: LUCENE-3171
                 URL: https://issues.apache.org/jira/browse/LUCENE-3171
             Project: Lucene - Java
          Issue Type: Improvement
          Components: modules/other
            Reporter: Michael McCandless
             Fix For: 3.3, 4.0


I created a single-pass Query + Collector to implement nested docs.
The approach is similar to LUCENE-2454 in that the app must index
documents in "join order", as a block (IW.addDocuments/updateDocuments),
with the parent doc at the end of the block.  Unlike LUCENE-2454, this
impl is one pass.
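
To make the "join order" layout concrete, here is a minimal sketch in
plain Java.  It uses no Lucene classes; BlockOrderSketch, FlatIndex,
addBlock and parentOf are hypothetical names invented for illustration,
and the documents are just strings.  The only point is that children
come first, the parent comes last, and a bit set of parent docIDs is
enough to find the parent of any doc in the block.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration only; this is not the Lucene API.
final class BlockOrderSketch {

    /** Docs in docID order, plus a bit set marking which docIDs are parents. */
    static final class FlatIndex {
        final List<String> docs = new ArrayList<>();
        final BitSet parentBits = new BitSet();

        /** Appends one block: all children first, then the parent as the last doc. */
        void addBlock(String parent, List<String> children) {
            docs.addAll(children);
            parentBits.set(docs.size()); // docID the parent is about to receive
            docs.add(parent);
        }

        /** Parent of a docID is the first parent bit at or after it (-1 if none). */
        int parentOf(int docId) {
            return parentBits.nextSetBit(docId);
        }
    }

    /** Builds two blocks: a parent with two children, then a parent with one. */
    static FlatIndex example() {
        FlatIndex index = new FlatIndex();
        index.addBlock("shirt-A", Arrays.asList("A/red/S", "A/blue/M"));
        index.addBlock("shirt-B", Arrays.asList("B/green/L"));
        return index;
    }

    public static void main(String[] args) {
        FlatIndex index = example();
        for (int doc = 0; doc < index.docs.size(); doc++) {
            System.out.println(doc + " " + index.docs.get(doc)
                + " -> parent docID " + index.parentOf(doc));
        }
    }
}
```

Because the parent always closes its block, no child-to-parent pointer
has to be stored per document in this sketch; the docID order carries
the relationship.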

Once you join at indexing time, you can take any query that matches
child docs and join it up to the parent docID space, using
BlockJoinQuery.  You then use BlockJoinCollector, which sorts parent
docs by the provided Sort, to gather results grouped by parent.  The
collector finds any BlockJoinQuerys in the query (using
Scorer.visitScorers) and retains the child docs corresponding to each
collected parent doc.

After searching is done, you retrieve the TopGroups from a provided
BlockJoinQuery.
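
The join-then-group flow can be sketched without Lucene as follows.
This is an assumption-laden illustration, not the patch: BlockJoinSketch
and joinToParents are hypothetical names, child hits are given as an
ascending array of docIDs, and sorting parents by a Sort is left out.
It only shows how child matches map up to parent docIDs in one forward
pass and how the children are retained per collected parent.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not the Lucene API.
final class BlockJoinSketch {

    /**
     * Joins child docIDs up to the parent docID space.
     * childHits must be ascending; parentBits marks the last doc of each block.
     * Returns parent docID -> matching child docIDs, in docID order.
     */
    static Map<Integer, List<Integer>> joinToParents(int[] childHits, BitSet parentBits) {
        Map<Integer, List<Integer>> groups = new LinkedHashMap<>();
        for (int child : childHits) {
            int parent = parentBits.nextSetBit(child);
            if (parent < 0 || parent == child) {
                continue; // no enclosing block, or the hit is itself a parent doc
            }
            groups.computeIfAbsent(parent, p -> new ArrayList<>()).add(child);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Three blocks: docs 0-2 (parent 2), 3-4 (parent 4), 5-8 (parent 8).
        BitSet parentBits = new BitSet();
        parentBits.set(2);
        parentBits.set(4);
        parentBits.set(8);

        int[] childHits = {0, 1, 5, 6}; // docIDs matched by some child query
        Map<Integer, List<Integer>> groups = joinToParents(childHits, parentBits);
        for (Map.Entry<Integer, List<Integer>> e : groups.entrySet()) {
            System.out.println("parent " + e.getKey() + " <- children " + e.getValue());
        }
    }
}
```

In this sketch, parent 4 does not appear in the result because none of
its children matched.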

Like LUCENE-2454, this is less general than the arbitrary joins in
Solr (SOLR-2272) or parent/child from ElasticSearch
(https://github.com/elasticsearch/elasticsearch/issues/553), since you
must do the join at indexing time as a doc block, but it should be
able to handle nested joins as well as joins to multiple tables,
though I don't yet have test cases for these.

I put this in a new Join module (modules/join); I think as we
refactor join impls we should put them here.

