lucene-solr-user mailing list archives

From danny teichthal <dannyt...@gmail.com>
Subject Re: Nested documents, block join - re-indexing a single document upon update
Date Tue, 18 Mar 2014 07:58:44 GMT
Thanks Jack,
I understand that updating a single document in a block is currently not
supported.
But an atomic update to a single document does not have to conflict
with block joins.

If I understood the documentation correctly:
Currently, when a document is atomically updated, SOLR finds the stored
document and re-indexes it, changing only the fields that were specified with
update="<operation>".

It seemed intuitive that if we specify the _root_ field in an atomic
update, SOLR would find the whole block by _root_, update the changed
document, and re-index the whole block.

Of course I cannot estimate the feasibility and effort of implementing it,
but it looks like a nice enhancement.
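
Until something like that exists, the workaround as I understand it is to
re-send the whole block, so that the children are indexed next to the parent
again. A minimal sketch, reusing the ids and fields from my example below. It
assumes the complete parent and all of its children are available on the
client side, which is exactly the DB fetch I was hoping to avoid:

```xml
<update>
  <add>
    <!-- Assumption: re-adding the parent with the same id replaces the
         previously indexed block for that parent. -->
    <doc>
      <field name="id">10</field>
      <field name="type_s">parent</field>
      <field name="BRAND_s">Nike</field>
      <!-- Unchanged child, re-sent as-is -->
      <doc>
        <field name="id">11</field>
        <field name="COLOR_s">Red</field>
        <field name="SIZE_s">XL</field>
      </doc>
      <!-- Changed child: COLOR_s goes from Blue to Green -->
      <doc>
        <field name="id">12</field>
        <field name="COLOR_s">Green</field>
        <field name="SIZE_s">XL</field>
      </doc>
    </doc>
  </add>
  <commit/>
</update>
```

There are no update="set" attributes here. Every document in the block is sent
in full, not as an atomic update.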



On Sun, Mar 16, 2014 at 4:09 PM, Jack Krupansky <jack@basetechnology.com> wrote:

> You stumbled upon the whole point of block join - that the documents are
> and must be managed as a block and not individually.
>
> -- Jack Krupansky
>
> From: danny teichthal
> Sent: Sunday, March 16, 2014 6:47 AM
> To: solr-user@lucene.apache.org
> Subject: Nested documents, block join - re-indexing a single document upon
> update
>
>
>
>
> Hi All,
>
>
>
>
> In short, I would like to use block joins, but still be able to
> index each document in the block separately.
>
> Is it possible?
>
>
>
> In more detail:
>
>
>
> We have some nested parent-child structure where:
>
> 1.       Parent may have a single level of children
>
> 2.       Parent and child documents may be updated independently.
>
> 3.       We may want to search for parent by child info and vice versa.
>
>
>
> At first we thought of managing the parent and child in different
> documents, de-normalizing child data at parent level and parent data on
> child level.
>
> After reading Mikhail's blog post,
> http://blog.griddynamics.com/2013/09/solr-block-join-support.html, we
> thought of using the block join for this purpose.
>
>
>
> But I hit a wall when trying to update a single child document.
>
> For me, it's OK if SOLR internally re-indexes the whole block; I just
> don't want to fetch the whole hierarchy from the DB for an update.
>
>
>
> I was trying to achieve this using atomic updates - since all the fields
> must be stored anyway - if I send an atomic update on one of the children
> with the _root_ field then there's no need to send the whole hierarchy.
>
> But when I try this method, I see that the child document is indeed
> updated, but its order is changed so that it comes after the parent.
>
>
>
> This is what I did:
>
> 1.       Change the root field to be stored - <field name="_root_"
> type="string" indexed="true" stored="true"/>
>
> 2.       Put the attached docs in example\exampledocs.
>
> 3.       Run post.jar on parent-child.xml
>
> 4.       Run post.jar on update-child-atomic.xml.
>
> 5.       Now -
> http://localhost:8983/solr/collection1/select?q=%7B!parent+which%3D%27type_s%3Aparent%27%7D%2BCOLOR_s%3ARed+%2BSIZE_s%3AXL%0A&wt=json&indent=true,
> returns parent 10 as expected.
>
> 6.       But,
> http://localhost:8983/solr/collection1/select?q={!parent+which%3D'type_s%3Aparent'}%2BCOLOR_s%3AGreen+%2BSIZE_s%3AXL%0A&wt=json&indent=true
> returns nothing.
>
> 7.       When searching *:* in the Admin UI, the record with id=12 has been
> updated to 'Green', but it is returned below the parent record.
>
>
>
>
> Thanks in advance.
>
>
>
> In case the attachments do not work:
>
> 1st file to post:
>
>
>
> <update>
>
>   <delete><query>*:*</query></delete>
>
>   <add>
>
>     <doc>
>
>       <field name="id">10</field>
>
>       <field name="type_s">parent</field>
>
>       <field name="BRAND_s">Nike</field>
>
>       <doc>
>
>         <field name="id">11</field>
>
>         <field name="COLOR_s">Red</field>
>
>         <field name="SIZE_s">XL</field>
>
>       </doc>
>
>       <doc>
>
>         <field name="id">12</field>
>
>         <field name="COLOR_s">Blue</field>
>
>         <field name="SIZE_s">XL</field>
>
>       </doc>
>
>     </doc>
>
>   </add>
>
>   <commit/>
>
> </update>
>
>
>
>
>
>
>
> 2nd file:
>
>
>
> <update>
>
>   <add>
>
>       <doc>
>
>         <field name="id">12</field>
>
>         <field name="COLOR_s" update="set">Green</field>
>
>         <field name="SIZE_s">XL</field>
>
>         <field name="_root_">10</field>
>
>       </doc>
>
>   </add>
>
>   <commit/>
>
> </update>
>
>
