lucene-java-user mailing list archives

From Michael Sokolov <msoko...@safaribooksonline.com>
Subject Re: ToChildBlockJoinQuery question
Date Thu, 22 Jan 2015 15:37:38 GMT
Yeah I know -- we've been around this block before.  I agree that the 
whole block indexing/searching feature is a bit confusing, trappy and 
error-prone, and it may be helpful to have those boundary conditions as 
signposts, but in my case relaxing the restriction enabled me to execute 
the queries I want without having to write a lot of awkward extensions 
to my indexing code.  That code uses Python's Haystack, which is based 
on Django models, and in order to comply with the parent-not-its-child 
idea, I would have had to introduce dummy documents to stand in as the 
parents, something that isn't at all natural or straightforward in that 
Django/Haystack view of the world.  Maybe the enforcement of that 
restriction could be relaxed according to an option in the query 
constructor.

-Mike

On 1/22/15 10:29 AM, Gregory Dearing wrote:
> Mike,
>
> I agree that it's not absolutely necessary to enforce children not being
> their own parent.  I was just trying to describe the current
> implementation, and why you were seeing exceptions.
>
> The difference is mostly philosophical.  The advantage of the current
> approach (in my opinion) is that the BlockJoin mechanic has a lot of
> terrible edge cases if used naively, and enforcing "child can't be its own
> parent" can help catch quite a few of them.
>
> I had a discussion on this list on the same topic, which might be useful: Re:
> including self-joins in parent/child queries
> <http://mail-archives.apache.org/mod_mbox/lucene-java-user/201412.mbox/%3CCAASL1-_ppmCNQ3aPJjFbT3AdB4pgaSpvE-8o5R9gV5kLDpf++A@mail.gmail.com%3E>
>
> -Greg
>
> On Wed, Jan 21, 2015 at 7:59 PM, Michael Sokolov <msokolov@safaribooksonline.com> wrote:
>
>> On 1/21/2015 6:59 PM, Gregory Dearing wrote:
>>
>>> Jim,
>>>
>>> I think you hit the nail on the head... that's not what BlockJoinQueries
>>> do.
>>>
>>> If you're wanting to search for children and join to their parents... then
>>> use ToParentBlockJoinQuery, with a query that matches the set of children
>>> and a filter that matches the set of parents.
>>>
>>> If you're searching for parents, then joining to their children... then use
>>> ToChildBlockJoinQuery, with a query that matches the set of parents and a
>>> filter that matches the set of children.
>>>
>>> When you add related documents to the index (via addDocuments), make sure
>>> that children are added before their parents.
>>>
>>> The reason all the above is necessary is that it makes it possible to have
>>> a nested hierarchy of relationships (ie. Parents have Children, which have
>>> Children of their own).  You need a query to indicate which part of the
>>> hierarchy you're starting from, and a filter indicating which part of the
>>> hierarchy you're joining to.
>>>
>>> Also, you will always get an exception if your query and your filter both
>>> match the same document.  A child can't be its own parent.
>>>
>> That's true for the existing implementation, but seems unnecessary from
>> what I can tell.  See
>> https://github.com/safarijv/ifpress-solr-plugin/blob/master/src/main/java/com/ifactory/press/db/solr/search/SafariBlockJoinQuery.java
>> for a variant that allows a child to be its own parent.
>>
>> -Mike
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>

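The rules discussed in this thread (children indexed before their parent, a query for the side you start from, a filter for the side you join to, and an exception when both match the same document) can be illustrated with a toy model of the doc ID arithmetic. This is only a sketch under stated assumptions: it does not call Lucene, and the class BlockJoinSketch, its methods, and the allowSelfParent flag are hypothetical names made up for illustration. The flag is loosely in the spirit of the constructor option Mike suggests; it is not a description of how SafariBlockJoinQuery or Lucene's own classes actually behave.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model of the doc ID arithmetic behind block joins. It does not use
// Lucene; every name here is hypothetical.
final class BlockJoinSketch {

    // Children -> parents. Each block is indexed children first, parent last,
    // so the parent of a child is the next set bit in the parent bitset.
    static List<Integer> toParents(BitSet childMatches, BitSet parents,
                                   boolean allowSelfParent) {
        List<Integer> result = new ArrayList<>();
        for (int doc = childMatches.nextSetBit(0); doc >= 0;
                 doc = childMatches.nextSetBit(doc + 1)) {
            // Strict mode: the query and the parent filter must not overlap.
            if (parents.get(doc) && !allowSelfParent) {
                throw new IllegalStateException(
                    "query and parent filter both match doc " + doc);
            }
            // nextSetBit is inclusive, so a parent doc maps to itself here.
            int parent = parents.nextSetBit(doc);
            if (parent < 0) {
                throw new IllegalStateException("no parent follows doc " + doc);
            }
            // Matches arrive in doc ID order, so duplicates are adjacent.
            if (result.isEmpty() || result.get(result.size() - 1) != parent) {
                result.add(parent);
            }
        }
        return result;
    }

    // Parents -> children. The children of a parent are the docs strictly
    // between the previous parent and the parent itself.
    static List<Integer> toChildren(BitSet parentMatches, BitSet parents) {
        List<Integer> result = new ArrayList<>();
        for (int doc = parentMatches.nextSetBit(0); doc >= 0;
                 doc = parentMatches.nextSetBit(doc + 1)) {
            if (!parents.get(doc)) {
                throw new IllegalStateException(
                    "parent query matched non-parent doc " + doc);
            }
            // previousSetBit returns -1 when there is no earlier parent,
            // which makes the first block start at doc 0.
            int firstChild = parents.previousSetBit(doc - 1) + 1;
            for (int child = firstChild; child < doc; child++) {
                result.add(child);
            }
        }
        return result;
    }
}
```

For example, with docs 0 and 1 as children of parent 2, and docs 3 to 5 as children of parent 6, the parent bitset is {2, 6}. Joining child matches {1, 3, 4} to parents yields the parents 2 and 6, and joining parent match {6} to children yields docs 3, 4 and 5. A query that matches doc 2 while doc 2 is also in the parent bitset throws in strict mode, which is the kind of early warning Greg argues for. With allowSelfParent set, the same query simply returns doc 2 as its own parent, so no dummy parent documents are needed.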
