lucene-java-user mailing list archives

From Gregory Dearing <gregdear...@gmail.com>
Subject Re: including self-joins in parent/child queries
Date Wed, 17 Dec 2014 01:20:43 GMT
Michael,

Note that the index doesn't contain any special information about
block-join relationships... it relies on a convention that child docs are
indexed before parent docs (i.e., the root doc in each hierarchy has the
largest docId in its block).

This means that it can 'join' to parents just by comparing child
docIds (from the subquery set) to the set of parent docIds.  A child's
parent is the closest parent docId that is larger than the child's
docId.
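
To make the docId comparison concrete, here is a minimal sketch. It
assumes a plain java.util.BitSet as a stand-in for the set of parent
docIds; the class and method names are hypothetical and this is not
Lucene's actual code.

```java
import java.util.BitSet;

// Hypothetical stand-in for the join step described above.
class BlockJoinSketch {
    // Returns the closest parent docId strictly larger than docId, or -1 if none.
    static int parentOf(BitSet parentDocs, int docId) {
        return parentDocs.nextSetBit(docId + 1);
    }

    public static void main(String[] args) {
        // Two blocks: docs 0-2 are children of parent 3; docs 4-5 are children of parent 6.
        BitSet parentDocs = new BitSet();
        parentDocs.set(3);
        parentDocs.set(6);

        System.out.println(parentOf(parentDocs, 1)); // prints 3
        System.out.println(parentOf(parentDocs, 4)); // prints 6
    }
}
```

No per-document pointer to the parent is needed; the ordering of docIds
within each block carries the relationship.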

That explanation is all just to say... if your subquery matched a
parent and was then joined to the parent set, and no exception was
thrown, the resulting answer would be in the NEXT BOOK.  (The closest
docId in the parent set that is larger than a parent's docId will
belong to another document block.)
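
A small sketch of that failure mode, under the same assumptions as the
description above: a java.util.BitSet stands in for the parent set, the
"closest larger docId" rule is applied to every subquery hit, and no
exception is thrown. The names are hypothetical, not Lucene's.

```java
import java.util.BitSet;

// Hypothetical illustration of a subquery hit that is itself a parent doc.
class ParentMatchSketch {
    // Maps each subquery hit to the closest parent docId strictly larger than it (-1 if none).
    static int[] joinToParents(BitSet parentDocs, int[] subqueryHits) {
        int[] joined = new int[subqueryHits.length];
        for (int i = 0; i < subqueryHits.length; i++) {
            joined[i] = parentDocs.nextSetBit(subqueryHits[i] + 1);
        }
        return joined;
    }

    public static void main(String[] args) {
        // Book A is docs 0-3 (parent 3); Book B is docs 4-6 (parent 6).
        BitSet parentDocs = new BitSet();
        parentDocs.set(3);
        parentDocs.set(6);

        // The subquery hit a chapter of Book A (doc 1) and Book A itself (doc 3).
        int[] joined = joinToParents(parentDocs, new int[] {1, 3});
        System.out.println(joined[0]); // prints 3: the chapter joins to Book A
        System.out.println(joined[1]); // prints 6: the Book A hit is credited to Book B
    }
}
```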

I would suggest using different field names for each level of a block
hierarchy, just so you can be sure what level your original query
actually hits.  You could accomplish the same by adding a 'docType'
field.

In your case, you might consider pushing your 'Book' level fields into
a special child doc.  For example, your Book document could have no
searchable fields; its children could include both 'Chapter' child
docs and also a 'BookMetadata' child doc.
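
One way to picture that layout, as a rough sketch only: each doc is
modelled as a plain Map of field names to values, and the field names
('docType', 'title') and the helper are hypothetical. The point is the
order (children first, the field-less Book doc last) and that every
level carries its own 'docType'.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical layout of one book block; maps stand in for index documents.
class BookBlockSketch {
    static Map<String, String> doc(String docType, String title) {
        Map<String, String> d = new LinkedHashMap<>();
        d.put("docType", docType);
        if (title != null) {
            d.put("title", title); // searchable field, only on child docs
        }
        return d;
    }

    // Builds the block in index order: Chapter docs, then BookMetadata, then the Book parent.
    static List<Map<String, String>> buildBlock(String bookTitle, List<String> chapterTitles) {
        List<Map<String, String>> block = new ArrayList<>();
        for (String chapterTitle : chapterTitles) {
            block.add(doc("Chapter", chapterTitle));
        }
        block.add(doc("BookMetadata", bookTitle)); // book-level fields live in a child doc
        block.add(doc("Book", null));              // parent goes last and has no searchable fields
        return block;
    }

    public static void main(String[] args) {
        List<String> chapters = new ArrayList<>();
        chapters.add("Regular expressions");
        chapters.add("Generics");
        for (Map<String, String> d : buildBlock("Java Programming", chapters)) {
            System.out.println(d);
        }
    }
}
```

With real documents, a list in this order would be handed to
IndexWriter.addDocuments so the whole block is indexed together; the
child query can then match 'Chapter' and 'BookMetadata' docs without
ever matching the parent.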

-Greg




On Tue, Dec 16, 2014 at 10:42 AM, Michael Sokolov
<msokolov@safaribooksonline.com> wrote:
> OK - I see looking at the code that an exception is thrown if a parent doc
> matches the subquery -- so that explains what will happen, but I guess my
> further question is -- is that necessary? Could we just not throw an
> exception there?
>
> -Mike
>
>
> On 12/16/2014 10:38 AM, Michael Sokolov wrote:
>>
>> I see in the docs of ToParentBlockJoinQuery that:
>>
>>  * The child documents must be orthogonal to the parent
>>  * documents: the wrapped child query must never
>>  * return a parent document.
>>
>> First, it would be helpful if the docs explained what would happen if that
>> assumption were violated.
>>
>> Second, I want to do that!
>>
>> My parent documents have the same fields as their child documents (title,
>> text, etc.): in some cases the best match for a query is the entire book
>> (e.g. a query for "Java Programming"), in other cases it is a specific
>> chapter (a query for "Java regular expressions").
>>
>> Currently I am using Solr grouping queries to roll up parent and child,
>> but I am hoping to get a performance boost by using the parent/child
>> indexing, which is a natural fit for us since we always index a book at a
>> time.
>>
>> If need be, I will simply index a child document that represents the
>> parent (i.e. duplicate the parent document but with a different type so as
>> to exclude it from the join subquery), but is this really necessary? If so,
>> can you explain why?
>>
>>
>> Thanks
>>
>> -Mike
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
