lucene-solr-user mailing list archives

From Pankaj Sonawane <pankaj4sonaw...@gmail.com>
Subject Re: Grouping and recip function not working with Sharding
Date Thu, 09 Jul 2015 05:45:29 GMT
Hi Erick,

The example below is for the grouping issue, not for sorting.
I have indexed 1839 records in all, each with a 'NAME' field. There may be
duplicate records for each 'NAME' value.

Let's say there are 5 records with NAME='A-SERIES', and similarly 3 records
with NAME='E-SERIES', etc.

I have a total of 264 unique NAME values, so when I query the collection using
grouping it should return 264 unique groups with an "ngroups" value of 264.

But the query returns a response with "ngroups" as 558, even though the length
of the "groups" array in the response is 264.



{
   "responseHeader":{
      "status":0,
      "QTime":19,
      "params":{
         "group.ngroups":"true",
         "indent":"true",
         "q":"*:*",
         "group.field":"NAME",
         "group":"true",
         "wt":"json"
      }
   },
   "grouped":{
      "NAME":{
         "matches":1839,
         "ngroups":558, ----- This value should be 264
         "groups":[
            {
               "groupValue":"A-SERIES",
               "doclist":{

               }
            },
            {
               "groupValue":"B-SERIES",
               "doclist":{

               }
            },
            {
               "groupValue":"C-SERIES",
               "doclist":{

               }
            },
           -----------Similarly there are total 264 such groups----
         ]
      }
   }
}
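
A count larger than the number of distinct groups is what one would expect if the distributed "ngroups" is the sum of each shard's own distinct group count, so that a NAME present on several shards is counted once per shard. The sketch below only simulates that assumption in plain Python; it does not call Solr, and the routing functions and sample data are hypothetical.

```python
import zlib
from collections import defaultdict


def simulate_ngroups(docs, num_shards, route):
    """Return (sum of per-shard distinct NAME counts, true distinct NAME count).

    docs  : list of (doc_id, name) tuples
    route : function (doc_id, name) -> int, reduced modulo num_shards
    """
    names_per_shard = defaultdict(set)
    for doc_id, name in docs:
        shard = route(doc_id, name) % num_shards
        names_per_shard[shard].add(name)
    summed = sum(len(names) for names in names_per_shard.values())
    true_count = len({name for _, name in docs})
    return summed, true_count


# Hypothetical sample: 5 A-SERIES, 3 E-SERIES and 1 C-SERIES document.
docs = (
    [(i, "A-SERIES") for i in range(5)]
    + [(i, "E-SERIES") for i in range(5, 8)]
    + [(8, "C-SERIES")]
)

# Routing by document id spreads each NAME over several shards.
by_id = lambda doc_id, name: doc_id
# Routing by NAME keeps every document of a group on one shard.
by_name = lambda doc_id, name: zlib.crc32(name.encode("utf-8"))

print(simulate_ngroups(docs, 3, by_id))    # (7, 3): summed count is inflated
print(simulate_ngroups(docs, 3, by_name))  # (3, 3): summed count is exact
```

With id-based routing the summed figure exceeds the true number of groups, while the merged "groups" list would still contain each NAME only once, which matches the symptom described above.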


From the reference guide:

group.ngroups and group.facet require that all documents in each group
must be co-located on the same shard in order for accurate counts to
be returned. Document routing via composite keys can be a useful
solution in many situations.
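
As a minimal sketch of the composite-key idea, assuming the collection uses the compositeId router and that the unique key field is called "id" (both are assumptions, not stated in this thread), the NAME value can be used as the routing prefix so that all documents of a group hash to the same shard. The helper name composite_id and the sample documents are hypothetical, and existing documents would need to be reindexed with the new ids.

```python
def composite_id(route_key: str, doc_id: str) -> str:
    """Build an id of the form '<route_key>!<doc_id>'.

    The part before '!' is what the compositeId router hashes to pick
    the shard, so documents sharing a route_key are co-located.
    """
    if "!" in route_key:
        # '!' is the separator, so it cannot appear inside the prefix.
        raise ValueError("route key must not contain '!'")
    return f"{route_key}!{doc_id}"


# Hypothetical documents: the NAME value doubles as the routing prefix.
names = ["A-SERIES", "A-SERIES", "E-SERIES"]
docs = [
    {"id": composite_id(name, str(i)), "NAME": name}
    for i, name in enumerate(names)
]

for doc in docs:
    print(doc["id"])
```

Routing on the grouping field trades even shard distribution for accurate counts, so very large groups can make one shard noticeably bigger than the others.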

It's not clear what you think the problem here is. You say:
bq: Ex: Below response contains 5 groups (Which is correct) but
ngroups is 11.

But you have rows set to 5, so?

As far as your sorting issue, again an example showing what you think
is wrong would be very helpful.

Best,
Erick



On Wed, Jul 8, 2015 at 6:38 AM, Pankaj Sonawane
<pankaj4sonawane@gmail.com> wrote:
> Hi,
>
> I am using sharding (3 shards) with Zookeeper.
>
> When I query a collection using "
> *group=true&group.field=NAME&group.ngroups=true*" parameters, "*ngroups*" in
> response is incorrect. However I am getting correct count in doclist array.
>
> Ex: Below response contains 5 groups (Which is correct) but ngroups is 11.
>
> {
>    "responseHeader":{
>       "status":0,
>       "QTime":49,
>       "params":{
>          "group.ngroups":"true",
>          "indent":"true",
>          "start":"0",
>          "q":"*:*",
>          "group.field":"NAME",
>          "group":"true",
>          "wt":"json",
>          "rows":"5"
>       }
>    },
>    "grouped":{
>       "NAME":{
>          "matches":18,
>          "ngroups":11,
>          "groups":[
>             {
>                "groupValue":"A-SERIES",
>                "doclist":{
>                   "numFound":5,
>                   "start":0,
>                   "maxScore":1,
>                   "docs":[
>                      {
>                         "NAME":"A-SERIES",
>                         "_version_":1505559209034383400
>                      }
>                   ]
>                }
>             },
>             {
>                "groupValue":"B-SERIES",
>                "doclist":{
>                   "numFound":5,
>                   "start":0,
>                   "docs":[
>                      {
>                         "NAME":"B-SERIES",
>                         "_version_":1505559209034383400
>                      }
>                   ]
>                }
>             },
>             {
>                "groupValue":"C-SERIES",
>                "doclist":{
>                   "numFound":1,
>                   "start":0,
>                   "docs":[
>                      {
>                         "NAME":"C-SERIES",
>                         "_version_":1505559209034383400
>                      }
>                   ]
>                }
>             },
>             {
>                "groupValue":"D-SERIES",
>                "doclist":{
>                   "numFound":5,
>                   "start":0,
>                   "docs":[
>                      {
>                         "NAME":"D-SERIES",
>                         "_version_":1505559209034383400
>                      }
>                   ]
>                }
>             },
>             {
>                "groupValue":"E-SERIES",
>                "doclist":{
>                   "numFound":3,
>                   "start":0,
>                   "maxScore":1,
>                   "docs":[
>                      {
>                         "NAME":"E-SERIES",
>                         "_version_":1505559209034383400
>                      }
>                   ]
>                }
>             }
>          ]
>       }
>    }
> }
>
> I am facing same problem with Recip function to get latest record on some
> date field when using sharding. It returns back records in wrong order.
>
> Note: Same configuration works fine on single machine without sharding.
>
> Please Help me to find solution.
>
> Thanks.



