lucene-dev mailing list archives

From "Erick Erickson (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-5878) Incorrect number of rows returned in distributed search with group.format=simple
Date Wed, 19 Mar 2014 15:51:45 GMT

     [ https://issues.apache.org/jira/browse/SOLR-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson updated SOLR-5878:
---------------------------------

    Description: 
The original description (left in below) is something of a red herring. The URL has rows=5
and group.format=simple, yet the example response below contains 16 documents. This doesn't
seem right given the Wiki description of group.format=simple: either the code has a problem
or the Wiki needs updating.


Original description:
Solr returns duplicate documents when group.format=simple is supplied on a distributed search.
This does not happen on the standard group format or when not using distributed search. 

For example:
{code}
http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=*%3A*&fq=evt_stub%3A(452deed8-c3a2-49a8-878d-8356da315e6a)&start=0&rows=5&fl=cont_stub&wt=xml&indent=true&group=true&group.field=cont_stub&group.format=simple&group.limit=1000
{code}

Returns:
{code}
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">253</int>
</lst>
<lst name="grouped">
  <lst name="cont_stub">
    <int name="matches">56</int>
    <result name="doclist" numFound="56" start="0" maxScore="1.0">
      <doc>
        <str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
      <doc>
        <str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
      <doc>
        <str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
      <doc>
        <str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
      <doc>
        <str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
      <doc>
        <str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
      <doc>
        <str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
      <doc>
        <str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
      <doc>
        <str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
      <doc>
        <str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
      <doc>
        <str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
      <doc>
        <str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
      <doc>
        <str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
      <doc>
        <str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
      <doc>
        <str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
      <doc>
        <str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
    </result>
  </lst>
</lst>
</response>
{code}

It should only return 5 documents.  Removing the distributed search and searching on either
core will return the requested number of rows. Removing group.format=simple will also return
the requested number of rows.
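
One way to check a response against the requested rows value is to count the <doc> elements in the flat doclist. The following is only a minimal sketch in Python, not part of Solr: the function names count_grouped_docs and exceeds_rows are hypothetical, and SAMPLE is an abbreviated stand-in for the response above (three documents instead of sixteen).

```python
import xml.etree.ElementTree as ET

# Abbreviated stand-in for the grouped response shown above (assumption:
# same element layout, but only three <doc> entries to keep it short).
SAMPLE = """<response>
<lst name="grouped">
  <lst name="cont_stub">
    <int name="matches">56</int>
    <result name="doclist" numFound="56" start="0" maxScore="1.0">
      <doc><str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
      <doc><str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
      <doc><str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
    </result>
  </lst>
</lst>
</response>"""


def count_grouped_docs(response_xml: str, group_field: str) -> int:
    """Count the <doc> elements in the doclist of a group.format=simple response."""
    root = ET.fromstring(response_xml)
    doclist = root.find(
        f"lst[@name='grouped']/lst[@name='{group_field}']/result[@name='doclist']"
    )
    if doclist is None:
        raise ValueError(f"no simple-format doclist found for field {group_field!r}")
    return len(doclist.findall("doc"))


def exceeds_rows(response_xml: str, group_field: str, rows: int) -> bool:
    """Return True when the response holds more documents than rows allows."""
    return count_grouped_docs(response_xml, group_field) > rows


if __name__ == "__main__":
    print(count_grouped_docs(SAMPLE, "cont_stub"))
    print(exceeds_rows(SAMPLE, "cont_stub", rows=5))
```

Run against the full response above with rows=5, a check like this would flag the result, since that doclist holds 16 documents.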

  was:
Solr returns duplicate documents when group.format=simple is supplied on a distributed search.
This does not happen on the standard group format or when not using distributed search. 

For example:
{code}
http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=*%3A*&fq=evt_stub%3A(452deed8-c3a2-49a8-878d-8356da315e6a)&start=0&rows=5&fl=cont_stub&wt=xml&indent=true&group=true&group.field=cont_stub&group.format=simple&group.limit=1000
{code}

Returns:
{code}
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">253</int>
</lst>
<lst name="grouped">
  <lst name="cont_stub">
    <int name="matches">56</int>
    <result name="doclist" numFound="56" start="0" maxScore="1.0">
      <doc>
        <str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
      <doc>
        <str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
      <doc>
        <str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
      <doc>
        <str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
      <doc>
        <str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
      <doc>
        <str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
      <doc>
        <str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
      <doc>
        <str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
      <doc>
        <str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
      <doc>
        <str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
      <doc>
        <str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
      <doc>
        <str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
      <doc>
        <str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
      <doc>
        <str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
      <doc>
        <str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
      <doc>
        <str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
    </result>
  </lst>
</lst>
</response>
{code}

It should only return 5 documents.  Removing the distributed search and searching on either
core will return the requested number of rows. Removing group.format=simple will also return
the requested number of rows.

        Summary: Incorrect number of rows returned in distributed search with group.format=simple
 (was: Solr returns duplicates when using distributed search with group.format=simple)

> Incorrect number of rows returned in distributed search with group.format=simple
> --------------------------------------------------------------------------------
>
>                 Key: SOLR-5878
>                 URL: https://issues.apache.org/jira/browse/SOLR-5878
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.6
>            Reporter: J.B. Langston
>
> The original description (left in below) is something of a red herring. The URL has rows=5
> and group.format=simple, yet the example response below contains 16 documents. This doesn't
> seem right given the Wiki description of group.format=simple: either the code has a problem
> or the Wiki needs updating.
> Original description:
> Solr returns duplicate documents when group.format=simple is supplied on a distributed
> search. This does not happen on the standard group format or when not using distributed search.

> For example:
> {code}
> http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=*%3A*&fq=evt_stub%3A(452deed8-c3a2-49a8-878d-8356da315e6a)&start=0&rows=5&fl=cont_stub&wt=xml&indent=true&group=true&group.field=cont_stub&group.format=simple&group.limit=1000
> {code}
> Returns:
> {code}
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
> <lst name="responseHeader">
>   <int name="status">0</int>
>   <int name="QTime">253</int>
> </lst>
> <lst name="grouped">
>   <lst name="cont_stub">
>     <int name="matches">56</int>
>     <result name="doclist" numFound="56" start="0" maxScore="1.0">
>       <doc>
>         <str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
>       <doc>
>         <str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
>       <doc>
>         <str name="cont_stub">e60eb0f9-bce7-4da9-819c-b356dfc1c4f7</str></doc>
>       <doc>
>         <str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
>       <doc>
>         <str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
>       <doc>
>         <str name="cont_stub">faf0a7ea-4252-4eda-990a-4fcc6b5e63e3</str></doc>
>       <doc>
>         <str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
>       <doc>
>         <str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
>       <doc>
>         <str name="cont_stub">dd94ec0b-f171-441d-8fb8-af6a22ebf168</str></doc>
>       <doc>
>         <str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
>       <doc>
>         <str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
>       <doc>
>         <str name="cont_stub">feede138-2fe4-4742-ac63-e7cecfd86c81</str></doc>
>       <doc>
>         <str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
>       <doc>
>         <str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
>       <doc>
>         <str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
>       <doc>
>         <str name="cont_stub">86944a90-033d-4676-9ac3-b59744fc52a5</str></doc>
>     </result>
>   </lst>
> </lst>
> </response>
> {code}
> It should only return 5 documents.  Removing the distributed search and searching on
> either core will return the requested number of rows. Removing group.format=simple will also
> return the requested number of rows.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
