lucene-solr-user mailing list archives

From Brian Narsi <bnars...@gmail.com>
Subject Re: Query results change
Date Fri, 15 Jan 2016 21:12:58 GMT
Data is indexed using the Data Import Handler with clean=true, commit=true and
optimize=true. After that there are no updates or deletes.

The setup is SolrCloud with 2 shards and 2 replicas each.

If the data and query have not changed, one expects to see the same results
on repeated searches, so it is a matter of users' confidence in search
results.
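
To check whether the replicas of a shard really disagree, one option is to
send the same query to each replica core with distrib=false and
debugQuery=true and compare the score explanations. A minimal sketch in
Python that only builds the request URLs; the host names, core names, field
names and query text are hypothetical placeholders, not the actual
configuration:

```python
from urllib.parse import urlencode

# Hypothetical replica core URLs for one shard; real names will differ.
REPLICAS = [
    "http://solr1:8983/solr/products_shard1_replica1",
    "http://solr2:8983/solr/products_shard1_replica2",
]


def build_debug_url(core_url, user_query):
    """Return a /select URL that queries one core and asks for score explanations."""
    params = {
        "q": user_query,
        "defType": "edismax",
        "qf": "name description",  # placeholder field names
        "mm": "1",
        "qs": "6",
        "fl": "id,score",
        "distrib": "false",        # query only this core, no fan-out
        "debugQuery": "true",      # include per-document score explanations
        "wt": "json",
    }
    return core_url + "/select?" + urlencode(params)


urls = [build_debug_url(core, "blue widget") for core in REPLICAS]
for url in urls:
    print(url)
```

If the same document gets a different score or explanation from the two
URLs, the difference comes from the replicas' index state rather than from
the query.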

Thanks

On Fri, Jan 15, 2016 at 10:12 AM, Erick Erickson <erickerickson@gmail.com>
wrote:

> Probably the fact that information from deleted/updated
> documents is still hanging around in the corpus until
> merged away.
>
> The nub of the issue is that terms in deleted documents
> (or the replaced doc if you update) still influence tf/idf
> calculations. If you optimize as Binoy suggests, all of
> the information relating to deleted docs is removed.
>
> If this is a SolrCloud setup, you can be getting
> scores from different replicas of the same shard. Due to
> the fact that merging (which purges deleted information)
> can occur at different times on different replicas, the scores
> calculated for a particular doc might be different depending
> on which replica calculated it.
>
> In either setup (SolrCloud or not), background merging can
> change the result order by removing information associated
> with deleted docs.
>
> All that said, does this have _practical_ consequences or
> is this mostly a curiosity question?
>
> Best,
> Erick
>
> On Fri, Jan 15, 2016 at 5:40 AM, Binoy Dalal <binoydalal93@gmail.com>
> wrote:
> > You should try debugging such queries to see how exactly they're being
> > executed.
> > That will give you an idea as to why you're seeing the results you see.
> >
> > On Fri, 15 Jan 2016, 19:05 Brian Narsi <bnarsi70@gmail.com> wrote:
> >
> >> We have an index of 25 fields. Currently the number of records in the
> >> index is about 120,000. We are using
> >>
> >> parser: edismax
> >>
> >> qf: contains 8 fields
> >>
> >> fq: 1 field
> >>
> >> mm = 1
> >>
> >> qs = 6
> >>
> >> pf: containing 3 fields
> >>
> >> bf: containing 1 field
> >>
> >> We have noticed that sometimes results change between two searches even
> >> if everything is constant.
> >>
> >> What we have identified is that if we reindex the data and optimize, it
> >> remedies the situation.
> >>
> >> Is that expected behavior? Or should we also look into other factors?
> >>
> >> Thanks
> >>
> > --
> > Regards,
> > Binoy Dalal
>
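
The effect Erick describes can be illustrated with a small sketch. It assumes
the classic Lucene-style formula idf = 1 + ln(maxDoc / (docFreq + 1)), where
both counts still include deleted documents until a merge purges them. All
numbers below are made up for illustration and are not taken from this index:

```python
import math


def classic_idf(doc_freq, max_doc):
    """Classic TF-IDF style idf: 1 + ln(maxDoc / (docFreq + 1))."""
    return 1.0 + math.log(max_doc / (doc_freq + 1))


# Replica A: deleted documents have already been merged away.
idf_a = classic_idf(doc_freq=100, max_doc=60000)

# Replica B: same live documents, plus 5,000 deleted documents not yet
# merged away, 50 of which contain the term.
idf_b = classic_idf(doc_freq=150, max_doc=65000)

print("replica A idf:", idf_a)
print("replica B idf:", idf_b)
```

Because the two replicas report different idf values for the same term, the
same document can get a different score, and possibly a different rank,
depending on which replica answers the request.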
