lucene-solr-user mailing list archives

From Mikhail Khludnev <m...@apache.org>
Subject Re: Pagination bug? when sorting by a field (not unique field)
Date Wed, 29 Mar 2017 13:35:18 GMT
Could the replicas differ in their deleted docs? I mean numDocs is the
same, but maxDoc differs by the number of deleted docs. You can see it
in the Solr admin UI on the core's page.
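
A minimal sketch of why this could matter, assuming that documents with
equal sort values come back in internal index order and that this order can
differ between replicas. The replica contents, the "price" field and the
page() helper are made up for illustration; this is plain Python, not Solr
code.

```python
# Two replicas hold the same live documents, but in a different internal order
# (hypothetical data: "price" is the non-unique sort field, "id" the uniqueKey).
replica_a = [{"id": 1, "price": 10}, {"id": 2, "price": 10},
             {"id": 3, "price": 10}, {"id": 4, "price": 20}]
replica_b = [{"id": 3, "price": 10}, {"id": 1, "price": 10},
             {"id": 2, "price": 10}, {"id": 4, "price": 20}]


def page(docs, start, rows, tiebreak=False):
    """Return the ids on one page, sorted by price and optionally then by id."""
    if tiebreak:
        key = lambda d: (d["price"], d["id"])
    else:
        key = lambda d: d["price"]  # stable sort: ties keep internal order
    ordered = sorted(docs, key=key)
    return [d["id"] for d in ordered[start:start + rows]]


# Page 1 is served by replica A, page 2 by replica B.
without_tiebreak = page(replica_a, 0, 2) + page(replica_b, 2, 2)
with_tiebreak = (page(replica_a, 0, 2, tiebreak=True)
                 + page(replica_b, 2, 2, tiebreak=True))

print(without_tiebreak)  # id 2 appears twice, id 3 is never returned
print(with_tiebreak)     # every id appears exactly once
```

With the uniqueKey as a secondary sort key, both replicas agree on the full
order, so it no longer matters which one answers a given page.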

On Wed, Mar 29, 2017 at 4:16 PM, Pablo Anzorena <anzorena.fing@gmail.com>
wrote:

> Shawn,
>
> Yes, the field has duplicate values and yes, if I add the secondary sort by
> the uniqueKey it solves the issue.
>
> Neither of the two situations you mentioned is occurring. The index
> is replicated, but not sharded.
>
> Does Solr sort by an internal id if no uniqueKey is present in the sort?
>
> 2017-03-29 9:58 GMT-03:00 Shawn Heisey <apache@elyograg.org>:
>
> > On 3/29/2017 6:35 AM, Pablo Anzorena wrote:
> > > I was paginating the results of a query and noticed that some
> > > documents were repeated across pagination buckets of 100 rows. When I
> > > sort by the unique field there is no repeated document but when I sort
> > > by another field then repeated documents appear. I assume it is a bug and
> > > it's not the intended behaviour, right?
> >
> > There is a potential situation that can cause this problem that is NOT a
> > bug.
> >
> > If the field you are sorting on contains duplicate values (same value in
> > multiple documents), then I am pretty sure that the sort order of
> > documents with the same value in the sort field is non-deterministic in
> > these situations:
> >
> > 1) A distributed (sharded) index.
> > 2) When the index contents can change between a request for one page and
> > a request for the next page -- documents being added, deleted, or
> > changed.
> >
> > Because the sort order of documents with the same value can change, one
> > document that may have ended up on the first page on the first query may
> > end up on the second page on the second query.
> >
> > Sorting by a field with no duplicate values (the unique field you
> > mentioned) will always result in the exact same sort order ... but if
> > you add documents that sort to near the start of the sort order between
> > queries, the behavior you have noticed can still happen.
> >
> > If this is what you are encountering, adding a secondary sort on the
> > uniqueKey field would probably clear up the problem.  If your uniqueKey
> > field is "id", something like this:
> >
> > sort=someField desc,id desc
> >
> > Thanks,
> > Shawn
> >
> >
>
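
For reference, paginated requests using the tie-breaker sort quoted above
could be built roughly like this. It is only a sketch: someField and id come
from Shawn's example, while the host, the collection name "mycollection" and
the page_url() helper are placeholders made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; replace host and collection with your own.
BASE_URL = "http://localhost:8983/solr/mycollection/select"


def page_url(page_number, rows=100, query="*:*"):
    """Build the select URL for one page, sorted by someField and then by id."""
    params = {
        "q": query,
        "sort": "someField desc,id desc",  # uniqueKey as the tie-breaker
        "start": page_number * rows,       # offset of the first row on the page
        "rows": rows,
    }
    return BASE_URL + "?" + urlencode(params)


print(page_url(0))
print(page_url(1))
```

Each page uses the same sort parameter, and only start changes between
requests.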



-- 
Sincerely yours
Mikhail Khludnev
