lucene-dev mailing list archives

From "Bogdan Marinescu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-6700) ChildDocTransformer doesn't return correct children after updating and optimising Solr index
Date Tue, 04 Nov 2014 09:45:33 GMT

     [ https://issues.apache.org/jira/browse/SOLR-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bogdan Marinescu updated SOLR-6700:
-----------------------------------
    Description: 
I have an index with nested documents. 
{code:title=schema.xml snippet|borderStyle=solid}
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>
<field name="entityType" type="int" indexed="true" stored="true" required="true"/>
<field name="pName" type="string" indexed="true" stored="true"/>
<field name="cAlbum" type="string" indexed="true" stored="true"/>
<field name="cSong" type="string" indexed="true" stored="true"/>
<field name="_root_" type="string" indexed="true" stored="true"/>
<field name="_version_" type="long" indexed="true" stored="true"/>
{code}

Afterwards I add the following documents:
{code}
<add>
  <doc>
    <field name="id">1</field>
    <field name="pName">Test Artist 1</field>
    <field name="entityType">1</field>
    <doc>
        <field name="id">11</field>
        <field name="cAlbum">Test Album 1</field>
        <field name="cSong">Test Song 1</field>
        <field name="entityType">2</field>
    </doc>
  </doc>
  <doc>
    <field name="id">2</field>
    <field name="pName">Test Artist 2</field>
    <field name="entityType">1</field>
    <doc>
        <field name="id">22</field>
        <field name="cAlbum">Test Album 2</field>
        <field name="cSong">Test Song 2</field>
        <field name="entityType">2</field>
    </doc>
  </doc>
</add>
{code}

After performing the following query 
{quote}
http://localhost:8983/solr/collection1/select?q=%7B!parent+which%3DentityType%3A1%7D&fl=*%2Cscore%2C%5Bchild+parentFilter%3DentityType%3A1%5D&wt=json&indent=true
{quote}
I get the correct result (each child's _root_ value matches the id of its parent):
{code:title=add docs|borderStyle=solid}
{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "fl":"*,score,[child parentFilter=entityType:1]",
      "indent":"true",
      "q":"{!parent which=entityType:1}",
      "wt":"json"}},
  "response":{"numFound":2,"start":0,"maxScore":1.0,"docs":[
      {
        "id":"1",
        "pName":"Test Artist 1",
        "entityType":1,
        "_version_":1483832661048819712,
        "_root_":"1",
        "score":1.0,
        "_childDocuments_":[
        {
          "id":"11",
          "cAlbum":"Test Album 1",
          "cSong":"Test Song 1",
          "entityType":2,
          "_root_":"1"}]},
      {
        "id":"2",
        "pName":"Test Artist 2",
        "entityType":1,
        "_version_":1483832661050916864,
        "_root_":"2",
        "score":1.0,
        "_childDocuments_":[
        {
          "id":"22",
          "cAlbum":"Test Album 2",
          "cSong":"Test Song 2",
          "entityType":2,
          "_root_":"2"}]}]
  }}
{code}
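
To make the URL-encoded query above easier to read, it can be decoded with the Python standard library. This is a small sketch that is not part of the original report; the variable names are my own, and the decoded values should match the "params" block echoed in the response header.

```python
from urllib.parse import urlsplit, parse_qs

# Query URL copied verbatim from the report.
url = (
    "http://localhost:8983/solr/collection1/select"
    "?q=%7B!parent+which%3DentityType%3A1%7D"
    "&fl=*%2Cscore%2C%5Bchild+parentFilter%3DentityType%3A1%5D"
    "&wt=json&indent=true"
)

# parse_qs percent-decodes each value and turns '+' into a space.
# Every parameter appears once here, so keep only the first value.
params = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}

for name in sorted(params):
    print(f"{name} = {params[name]}")
```

The decoded q parameter is the block join parent query, and the decoded fl parameter carries the [child] transformer whose output is in question.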

Afterwards I try to update one document:
{code:title=update doc|borderStyle=solid}
<add>
<doc>
<field name="id">1</field>
<field name="pName" update="set">INIT</field>
</doc>
</add>
{code}

Re-running the previous query still returns the right result (the same as before, but with
the pName field updated).

The problem appears only after performing an *optimize*. 
Now, the same query yields the following result:
{code}
{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "fl":"*,score,[child parentFilter=entityType:1]",
      "indent":"true",
      "q":"{!parent which=entityType:1}",
      "wt":"json"}},
  "response":{"numFound":2,"start":0,"maxScore":1.0,"docs":[
      {
        "id":"2",
        "pName":"Test Artist 2",
        "entityType":1,
        "_version_":1483832661050916864,
        "_root_":"2",
        "score":1.0,
        "_childDocuments_":[
        {
          "id":"11",
          "cAlbum":"Test Album 1",
          "cSong":"Test Song 1",
          "entityType":2,
          "_root_":"1"},
        {
          "id":"22",
          "cAlbum":"Test Album 2",
          "cSong":"Test Song 2",
          "entityType":2,
          "_root_":"2"}]},
      {
        "id":"1",
        "pName":"INIT",
        "entityType":1,
        "_root_":"1",
        "_version_":1483832916867809280,
        "score":1.0}]
  }}
{code}

As can be seen, the document with id:2 now also contains the child with id:11, which belongs to the
document with id:1 (its _root_ is 1), while the document with id:1 is returned without any children. 
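
The mismatch can also be checked mechanically by comparing each child's _root_ value with the id of the parent it is returned under. The following is a minimal sketch, assuming the JSON response has been loaded into a Python dict; find_misplaced_children is a hypothetical helper name, and the sample data is a trimmed copy of the post-optimize response above.

```python
def find_misplaced_children(response):
    """Return (parent_id, child_id, child_root) for every child whose
    _root_ differs from the id of the parent it is nested under."""
    misplaced = []
    for parent in response["response"]["docs"]:
        # Parents returned without children have no _childDocuments_ key.
        for child in parent.get("_childDocuments_", []):
            if child["_root_"] != parent["id"]:
                misplaced.append((parent["id"], child["id"], child["_root_"]))
    return misplaced


# Trimmed copy of the response returned after the optimize.
after_optimize = {
    "response": {"numFound": 2, "start": 0, "docs": [
        {"id": "2", "_root_": "2", "_childDocuments_": [
            {"id": "11", "_root_": "1"},
            {"id": "22", "_root_": "2"},
        ]},
        {"id": "1", "_root_": "1"},
    ]}
}

print(find_misplaced_children(after_optimize))
```

The check only makes sense when the top-level documents are root documents, as they are here, since it relies on the parent's id being equal to the _root_ value of its own children.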

The only related reference I have found on the web is http://blog.griddynamics.com/2013/09/solr-block-join-support.html

Is this a known problem? Is there a workaround? 




> ChildDocTransformer doesn't return correct children after updating and optimising Solr index
> ---------------------------------------------------------------------------------------------
>
>                 Key: SOLR-6700
>                 URL: https://issues.apache.org/jira/browse/SOLR-6700
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Bogdan Marinescu
>            Priority: Blocker
>             Fix For: 4.10.3, 5.0



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

