lucene-solr-user mailing list archives

From danny teichthal <dannyt...@gmail.com>
Subject Nested documents, block join - re-indexing a single document upon update
Date Sun, 16 Mar 2014 10:47:22 GMT
 Hi All,


To keep it short: I would like to use block joins, but still be able to
index each document in the block separately.

Is it possible?



In more detail:



We have a nested parent-child structure where:

1. A parent may have a single level of children.

2. Parent and child documents may be updated independently.

3. We may want to search for a parent by child info and vice versa.



At first we thought of managing the parent and child in separate
documents, de-normalizing child data at the parent level and parent data at
the child level.

After reading Mikhail's <http://www.blogger.com/profile/03731629466352186647>
blog post, http://blog.griddynamics.com/2013/09/solr-block-join-support.html,
we thought of using block join for this purpose.



But I hit a wall when trying to update a single child document.

For me, it's OK if Solr internally re-indexes the whole block; I just don't
want to fetch the whole hierarchy from the DB for every update.



I was trying to achieve this using atomic updates. Since all the fields
must be stored anyway, if I send an atomic update on one of the children
together with the _root_ field, then there should be no need to send the
whole hierarchy.

But when I try this method, I see that the child document is indeed
updated, but its position changes so that it comes after the parent.



This is what I did:

1. Changed the _root_ field to be stored:
<field name="_root_" type="string" indexed="true" stored="true"/>

2. Put the attached docs in example\exampledocs.

3. Ran post.jar on parent-child.xml.

4. Ran post.jar on update-child-atomic.xml.

5. Now
http://localhost:8983/solr/collection1/select?q=%7B!parent+which%3D%27type_s%3Aparent%27%7D%2BCOLOR_s%3ARed+%2BSIZE_s%3AXL%0A&wt=json&indent=true
(decoded: q={!parent which='type_s:parent'}+COLOR_s:Red +SIZE_s:XL)
returns parent 10, as expected.

6. But
http://localhost:8983/solr/collection1/select?q=%7B!parent+which%3D%27type_s%3Aparent%27%7D%2BCOLOR_s%3AGreen+%2BSIZE_s%3AXL%0A&wt=json&indent=true
(decoded: q={!parent which='type_s:parent'}+COLOR_s:Green +SIZE_s:XL)
returns nothing.

7. When searching *:* in the Admin UI, the record with id=12 has been
updated with 'Green', but it is returned after the parent record.



Thanks in advance.



In case the attachments do not come through:

1st file to post (parent-child.xml):



<update>
  <delete><query>*:*</query></delete>
  <add>
    <doc>
      <field name="id">10</field>
      <field name="type_s">parent</field>
      <field name="BRAND_s">Nike</field>
      <doc>
        <field name="id">11</field>
        <field name="COLOR_s">Red</field>
        <field name="SIZE_s">XL</field>
      </doc>
      <doc>
        <field name="id">12</field>
        <field name="COLOR_s">Blue</field>
        <field name="SIZE_s">XL</field>
      </doc>
    </doc>
  </add>
  <commit/>
</update>



2nd file (update-child-atomic.xml):



<update>
  <add>
    <doc>
      <field name="id">12</field>
      <field name="COLOR_s" update="set">Green</field>
      <field name="SIZE_s">XL</field>
      <field name="_root_">10</field>
    </doc>
  </add>
  <commit/>
</update>
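
For comparison, here is a minimal sketch of the alternative that the atomic
update is meant to avoid: re-sending the whole block with the changed child
in place. It assumes the same example documents as above; the file name
reindex-block.xml is hypothetical. As far as I understand, re-adding the
parent together with all of its children keeps the children stored directly
before the parent, which is the layout the {!parent} query relies on.

```xml
<!-- reindex-block.xml (hypothetical file name): re-send parent 10 with all of its children -->
<update>
  <add>
    <doc>
      <field name="id">10</field>
      <field name="type_s">parent</field>
      <field name="BRAND_s">Nike</field>
      <doc>
        <field name="id">11</field>
        <field name="COLOR_s">Red</field>
        <field name="SIZE_s">XL</field>
      </doc>
      <!-- child 12 carries the new color; all other values are unchanged -->
      <doc>
        <field name="id">12</field>
        <field name="COLOR_s">Green</field>
        <field name="SIZE_s">XL</field>
      </doc>
    </doc>
  </add>
  <commit/>
</update>
```

This would be posted with post.jar like the other two files, at the cost of
having to fetch the whole hierarchy from the DB first.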
