lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Inconsistent Solr Search Results
Date Thu, 23 Jul 2015 21:20:53 GMT
Same issue really as long as there's more than one replica/shard.

Tied scores are broken by the internal Lucene doc ID, so specifying a
secondary sort should regularize things.
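
For illustration, here is a minimal sketch of adding that tiebreaker from a client. The host, port, and the helper name build_select_url are assumptions, the collection name serviceorder is guessed from the core name in the log quoted below, and id is assumed to be the uniqueKey field.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, rows=100, start=0, unique_key="id"):
    """Build a /select URL whose sort is score first, then the unique key.

    build_select_url is a hypothetical helper, not part of any Solr client.
    unique_key is assumed to name the collection's uniqueKey field.
    """
    params = {
        "q": query,
        "rows": rows,
        "start": start,
        # Each sort clause needs an explicit direction (asc or desc).
        "sort": f"score desc,{unique_key} asc",
    }
    # urlencode percent-encodes the comma and turns spaces into '+'.
    return f"{base_url}/select?{urlencode(params)}"


# Assumed host and collection, for illustration only.
print(build_select_url("http://localhost:8983/solr/serviceorder",
                       "driveshaft corrosion"))
```

Because the unique key never ties, the combined sort leaves nothing for the internal doc ID to decide.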

Best,
Erick

On Thu, Jul 23, 2015 at 10:01 AM, Tarala, Magesh <MTarala@bh.com> wrote:
> Erick,
> The 3 node cluster is set up to use 3 shards, each with 1 replica. So, the index is split
> across 3 servers.
>
> Another piece of info - I think the issue happens only when I use pagination. Verifying
> if that's the case.
>
>
> Here's a query from the solr log on the server I'm pointing the query to:
>
> INFO  - 2015-07-23 16:56:09.683; org.apache.solr.core.SolrCore; [serviceorder_shard1_replica2]
webapp=/solr path=/select params={f.contact_facet.facet.sort=false&f.environment_facet.facet.mincount=1&facet=true&f.item_category1_hdr_facet.facet.sort=false&f.item_category3_hdr_facet.facet.sort=false&f.order_reason_facet.facet.sort=false&f.ac_model_facet.facet.mincount=1&f.employee_responsible_facet.facet.sort=false&f.header_category2_facet.facet.mincount=1&f.item_category3_hdr_facet.facet.mincount=1&f.ac_model_facet.facet.sort=false&f.environment_facet.facet.sort=false&f.header_status_facet.facet.mincount=1&f.howmal_facet.facet.mincount=1&fl=ac_model,contact,description,header_category2,header_status,id,notes_question_subject,notes_problem_description,notes_quick_response,order_reason,requested_start,serial_number,service_order,sold_to_party&f.operational_effect_facet.facet.sort=false&f.item_category2_hdr_facet.facet.sort=false&f.header_category2_facet.facet.sort=false&f.sold_to_party.facet.sort=false&facet.field=sold_to_party&facet.field=item_category3_hdr_facet&facet.field=item_category2_hdr_facet&facet.field=item_category1_hdr_facet&facet.field=operational_effect_facet&facet.field=when_discovered_facet&facet.field=environment_facet&facet.field=header_category2_facet&facet.field=header_status_facet&facet.field=howmal_facet&facet.field=order_reason_facet&facet.field=employee_responsible_facet&facet.field=contact_facet&facet.field=priority&facet.field=ac_model_facet&f.order_reason_facet.facet.mincount=1&f.howmal_facet.facet.sort=false&fq=requested_start:[+1991-01-01T00:00:00Z+TO+2015-07-23T23:59:59Z+]&fq=document_type:header&f.priority.facet.sort=false&f.header_status_facet.facet.sort=false&f.item_category1_hdr_facet.facet.mincount=1&f.sold_to_party.facet.mincount=1&f.operational_effect_facet.facet.mincount=1&f.priority.facet.mincount=1&rows=100&f.item_category2_hdr_facet.facet.mincount=1&f.employee_responsible_facet.facet.mincount=1&start=0&q={!type%3Dedismax+qf%3D'service_order^9+serial_number^9+material_hdr^9+description^8+notes_problem_description^7+notes_qu
ick_response^6+notes_question_subject^5+notes_internal_note^4+notes_request_comments^3+notes_easa_aircarrier_notes^3+notes_apparent_cause^3+doc_content_hdr^2'+pf%3D'description~4^8+notes_problem_description~4^7+notes_quick_response~4^6+notes_question_subject~4^5+notes_internal_note~4^4+notes_request_comments~4^3+notes_easa_aircarrier_notes~4^3+notes_apparent_cause~4^3+doc_content_hdr~10^2'+pf2%3D'description~4^8+notes_problem_description~4^7+notes_quick_response~4^6+notes_question_subject~4^5+notes_internal_note~4^4+notes_request_comments~4^3+notes_easa_aircarrier_notes~4^3+notes_apparent_cause~4^3+doc_content_hdr~10^2'+pf3%3D'description~4^8+notes_problem_description~4^7+notes_quick_response~4^6+notes_question_subject~4^5+notes_internal_note~4^4+notes_request_comments~4^3+notes_easa_aircarrier_notes~4^3+notes_apparent_cause~4^3+doc_content_hdr~10^2'}driveshaft+corrosion&f.when_discovered_facet.facet.sort=fa
>
>
>
>
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerickson@gmail.com]
> Sent: Thursday, July 23, 2015 11:18 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Inconsistent Solr Search Results
>
> The query you're running would help. But here's a guess:
> You say you have a "3 node Solr cluster". By that I'm guessing you mean a single shard
> with 1 leader and 2 replicas.
>
> When the primary sort criterion (score by default) is tied between two documents, the
> internal Lucene doc ID is used as a tiebreaker. If you're doing something like a
> *:* query (asterisk:asterisk in case bolding happens) then the score for all docs is
> the same. Here's the kicker:
> the internal Lucene doc ID can be different in each replica.
> So my guess is you're getting results from different replicas, where the internal doc
> IDs of the tied docs are ordered differently.
>
> So I claim you don't get different results every time; you get
> one of three result orderings, at a guess.
>
> If you have a decent-size corpus and are searching by more interesting criteria, this
> should be much less of a problem, but it can still theoretically happen. To nail down the ordering
> completely, specify a secondary sort on the unique key, e.g. &sort=score desc,id asc
> (each sort clause needs a direction, and the spaces must be URL-encoded).
>
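
A toy model of the tie-breaking described above, as a sketch only. This is not Lucene's code: the document IDs, scores, and internal doc IDs are made up, and the function names are hypothetical.

```python
# Unique key -> score. A and B are tied; C scores lower.
docs = {"A": 1.0, "B": 1.0, "C": 0.5}

# Unique key -> internal doc ID. The two replicas hold the same documents
# but happened to assign internal IDs in a different order (made-up values).
replica_1 = {"A": 0, "B": 1, "C": 2}
replica_2 = {"B": 0, "A": 1, "C": 2}


def rank_default(internal_ids):
    # Score descending; ties fall back to the internal doc ID.
    return sorted(docs, key=lambda d: (-docs[d], internal_ids[d]))


def rank_with_tiebreak(internal_ids):
    # Score descending, then unique key ascending. The unique key never
    # ties, so the internal doc ID is never consulted.
    return sorted(docs, key=lambda d: (-docs[d], d))


print(rank_default(replica_1))        # ['A', 'B', 'C']
print(rank_default(replica_2))        # ['B', 'A', 'C']
print(rank_with_tiebreak(replica_1))  # ['A', 'B', 'C']
print(rank_with_tiebreak(replica_2))  # ['A', 'B', 'C']
```

With the default ordering, the two replicas disagree on the tied pair; with the explicit tiebreaker, they agree regardless of which replica answers.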
> Best,
> Erick
>
> On Thu, Jul 23, 2015 at 8:46 AM, Tarala, Magesh <MTarala@bh.com> wrote:
>> I have about 15K documents in a 3 node solr cluster. When I execute a simple search,
>> I get the results in a different order every time I search. But the number of records is the
>> same. Here's the definition for the field.
>>
>> Any ideas, suggestions would be greatly appreciated.
>>
>>
>>     <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
>>
>>       <analyzer type="index">
>>         <charFilter class="solr.HTMLStripCharFilterFactory"/>
>>         <tokenizer class="solr.StandardTokenizerFactory"/>
>>         <filter class="solr.StopFilterFactory"
>>                 ignoreCase="true"
>>                 words="lang/stopwords_en.txt"
>>                 />
>>         <filter class="solr.LowerCaseFilterFactory"/>
>>         <filter class="solr.EnglishPossessiveFilterFactory"/>
>>         <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
>>         <filter class="solr.PorterStemFilterFactory"/>
>>         <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="50"/>
>>       </analyzer>
>>
>>       <analyzer type="query">
>>         <tokenizer class="solr.StandardTokenizerFactory"/>
>>         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>>         <filter class="solr.StopFilterFactory"
>>                 ignoreCase="true"
>>                 words="lang/stopwords_en.txt"
>>                 />
>>         <filter class="solr.LowerCaseFilterFactory"/>
>>         <filter class="solr.EnglishPossessiveFilterFactory"/>
>>         <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
>>         <filter class="solr.PorterStemFilterFactory"/>
>>       </analyzer>
>>
>>     </fieldType>
>>
>>
>> Thanks,
>> Magesh
>>
