Subject: Nested documents, block join - re-indexing a single document upon update
From: danny teichthal
To: solr-user@lucene.apache.org
Date: Sun, 16 Mar 2014 12:47:22 +0200


Hi All,


To make things short, I would like to use block joins, but still be able to index each document in the block separately.

Is it possible?

 

In more detail:

We have a nested parent-child structure where:

1. A parent may have a single level of children.

2. Parent and child documents may be updated independently.

3. We may want to search for parents by child info and vice versa.

 

At first we thought of managing the parent and child as separate documents, de-normalizing child data at the parent level and parent data at the child level.

After reading Mikhail's blog post, http://blog.griddynamics.com/2013/09/solr-block-join-support.html, we thought of using block join for this purpose.
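For comparison, the de-normalized layout we considered first could look roughly like the sketch below. The `child_*` and `parent_*` field names, and the size `M` on child 12, are assumptions made up for illustration; they are not part of the files posted here.

```python
# Hypothetical source records; field names mirror the example files in this mail,
# except that child 12 is given size "M" to make the limitation visible.
parent = {"id": "10", "type_s": "parent", "BRAND_s": "Nike"}
children = [
    {"id": "11", "COLOR_s": "Red", "SIZE_s": "XL"},
    {"id": "12", "COLOR_s": "Blue", "SIZE_s": "M"},
]


def denormalize(parent, children):
    """Return flat documents: the parent carries child values, each child carries parent values."""
    flat_parent = dict(parent)
    # "child_*" and "parent_*" are made-up field names for this sketch.
    flat_parent["child_COLOR_ss"] = [c["COLOR_s"] for c in children]
    flat_parent["child_SIZE_ss"] = [c["SIZE_s"] for c in children]
    flat_children = [
        {**c, "type_s": "child", "parent_id_s": parent["id"], "parent_BRAND_s": parent["BRAND_s"]}
        for c in children
    ]
    return flat_parent, flat_children


flat_parent, flat_children = denormalize(parent, children)

# Each document can be re-indexed on its own, but the per-child pairing is lost:
# the flat parent matches "Red" AND "M" although no single child is both.
cross_match = "Red" in flat_parent["child_COLOR_ss"] and "M" in flat_parent["child_SIZE_ss"]
```

The trade-off is that every document can be updated independently, at the cost of duplicated data and of losing the correlation between fields of the same child.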

 

But I hit a wall when trying to update a single child document.

For me, it's OK if Solr internally re-indexes the whole block; I just don't want to fetch the whole hierarchy from the DB for an update.

 

I was trying to achieve this using atomic updates: since all the fields must be stored anyway, if I send an atomic update on one of the children together with the _root_ field, then there should be no need to send the whole hierarchy.

But when I try this method, I see that the child document is indeed updated, but its position changes so that it comes after the parent.
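Why the position matters can be shown with a toy model. Block join relies on the children being indexed as a contiguous run directly before their parent. The sketch below is not Lucene's or Solr's code; it only simulates that rule on a Python list, and it assumes that an atomic update drops the old child and appends the rewritten one at the end of the index.

```python
def block_join_parents(index, is_parent, child_matches):
    """Return ids of parents that have a matching child in the run of docs directly before them."""
    hits, block = [], []
    for doc in index:
        if is_parent(doc):
            if any(child_matches(c) for c in block):
                hits.append(doc["id"])
            block = []  # the next block starts after this parent
        else:
            block.append(doc)
    return hits  # children left over after the last parent belong to no block


def is_parent(doc):
    return doc.get("type_s") == "parent"


def color_xl(color):
    """Child filter: the given color and size XL."""
    return lambda d: d.get("COLOR_s") == color and d.get("SIZE_s") == "XL"


# Index order after posting the whole block: children first, parent last.
before = [
    {"id": "11", "COLOR_s": "Red", "SIZE_s": "XL"},
    {"id": "12", "COLOR_s": "Blue", "SIZE_s": "XL"},
    {"id": "10", "type_s": "parent", "BRAND_s": "Nike"},
]

# Assumed effect of the atomic update: old doc 12 is dropped, the new one is appended.
after = [d for d in before if d["id"] != "12"]
after.append({"id": "12", "COLOR_s": "Green", "SIZE_s": "XL", "_root_": "10"})

red_before = block_join_parents(before, is_parent, color_xl("Red"))
red_after = block_join_parents(after, is_parent, color_xl("Red"))
green_after = block_join_parents(after, is_parent, color_xl("Green"))
print(red_before, red_after, green_after)
```

In this model the Red/XL child still sits in front of parent 10, while the rewritten Green/XL child lands behind it and is no longer part of any block, which would be consistent with the query results described in the steps of this mail.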

 

This is what I did:

1. Change the _root_ field to be stored: <field name="_root_" type="string" indexed="true" stored="true"/>

2. Put the attached docs in example\exampledocs.

3. Run post.jar on parent-child.xml.

4. Run post.jar on update-child-atomic.xml.

5. Now http://localhost:8983/solr/collection1/select?q=%7B!parent+which%3D%27type_s%3Aparent%27%7D%2BCOLOR_s%3ARed+%2BSIZE_s%3AXL%0A&wt=json&indent=true returns parent 10, as expected.

6. But http://localhost:8983/solr/collection1/select?q={!parent+which%3D'type_s%3Aparent'}%2BCOLOR_s%3AGreen+%2BSIZE_s%3AXL%0A&wt=json&indent=true returns nothing.

7. When searching *:* in the Admin UI, the record with id=12 was updated with 'Green', but it is returned below the parent record.
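The two select URLs in steps 5 and 6 are hard to read in their encoded form. A minimal sketch of how they can be built from the readable query, using only the Python standard library, is shown below. The helper name `block_join_url` is made up for this example, and the trailing `%0A` (newline) present in the original URLs is left out.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE = "http://localhost:8983/solr/collection1/select"


def block_join_url(color, size):
    """Build a select URL asking for parents that have a child with the given color and size."""
    q = "{!parent which='type_s:parent'}+COLOR_s:%s +SIZE_s:%s" % (color, size)
    return BASE + "?" + urlencode({"q": q, "wt": "json", "indent": "true"})


red_url = block_join_url("Red", "XL")
green_url = block_join_url("Green", "XL")

# Decode again to check that the query string survives the round trip.
decoded_q = parse_qs(urlsplit(green_url).query)["q"][0]
print(green_url)
print(decoded_q)
```

Note that in the encoded form `+` stands for a space and `%2B` for a literal plus sign, so the child query is `+COLOR_s:Green +SIZE_s:XL`, i.e. both clauses are required.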

 

Thanks in advance.

 

In case the attachments do not work:

1st file to post:

 

<update>
  <delete><query>*:*</query></delete>
  <add>
    <doc>
      <field name="id">10</field>
      <field name="type_s">parent</field>
      <field name="BRAND_s">Nike</field>
      <doc>
        <field name="id">11</field>
        <field name="COLOR_s">Red</field>
        <field name="SIZE_s">XL</field>
      </doc>
      <doc>
        <field name="id">12</field>
        <field name="COLOR_s">Blue</field>
        <field name="SIZE_s">XL</field>
      </doc>
    </doc>
  </add>
  <commit/>
</update>

 

 

 

2nd file:

 

<update>
  <add>
    <doc>
      <field name="id">12</field>
      <field name="COLOR_s" update="set">Green</field>
      <field name="SIZE_s">XL</field>
      <field name="_root_">10</field>
    </doc>
  </add>
  <commit/>
</update>
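For anyone who wants to reproduce this without hand-editing files, the `<add>` part of the second file can also be generated. The sketch below uses only Python's standard library; the helper name `atomic_update_xml` and its parameters are made up for this example, and the `<update>` wrapper and `<commit/>` element are left out.

```python
import xml.etree.ElementTree as ET


def atomic_update_xml(doc_id, set_fields, plain_fields):
    """Serialize one <add><doc>: set_fields carry update="set", plain_fields are sent as-is."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    ET.SubElement(doc, "field", name="id").text = doc_id
    for name, value in set_fields.items():
        ET.SubElement(doc, "field", name=name, update="set").text = value
    for name, value in plain_fields.items():
        ET.SubElement(doc, "field", name=name).text = value
    return ET.tostring(add, encoding="unicode")


# Same values as in the second file: COLOR_s is set to Green, SIZE_s and _root_ are sent plainly.
xml_body = atomic_update_xml("12", {"COLOR_s": "Green"}, {"SIZE_s": "XL", "_root_": "10"})
print(xml_body)
```

This only mirrors what the second file sends; it does not change the behaviour described above.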


Attachments: parent-child.xml, update-child-atomic.xml