lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Shards don't return documents in same order
Date Wed, 30 Apr 2014 20:36:57 GMT
Hmmm, take a look at the admin/analysis page for these inputs for
alphaOnlySort. If you're using the stock Solr distro, you're probably
not considering the effects of PatternReplaceFilterFactory, which
removes all non-letters. So these three terms reduce to

mba
mba
mbanew

You can also look at the actual indexed terms on the admin/schema-browser page.

That said, unless you transposed the order because you were
concentrating on the numeric part, the doc with MB20140410A-New should
always sort last, since "mbanew" sorts after "mba".
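
To make the reduction concrete, here is a rough Python sketch that imitates the analysis chain. It assumes the stock alphaOnlySort definition (keyword tokenizer, lowercase, trim, then a pattern replace that drops everything except a-z); the function name is made up for illustration and this is not Solr code.

```python
import re


def alpha_only_sort_key(value: str) -> str:
    """Approximate the assumed stock alphaOnlySort chain for one field value."""
    # KeywordTokenizer: the whole value is a single token.
    # LowerCaseFilter + TrimFilter: lowercase and strip surrounding whitespace.
    token = value.lower().strip()
    # PatternReplaceFilter: remove every character that is not a-z.
    return re.sub(r"[^a-z]", "", token)


names = ["MB20140410A", "MB20140410A-New", "MB20140411A"]

for name in names:
    print(f"{name!r} -> {alpha_only_sort_key(name)!r}")

# Python's sort is stable, so the two "mba" docs keep their input order here.
# Solr makes no such promise for docs whose sort values are equal.
ordered = sorted(names, key=alpha_only_sort_key)
print(ordered)
```

The two names without the "-New" suffix end up with the same sort value, so nothing in the sort field itself decides which of them comes first.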

All of which is irrelevant if you're doing something else with
"alphaOnlySort", so please paste in the fieldType definition if you've
changed it.

What gets returned in the doc for _stored_ data is a verbatim copy,
NOT the output of the analysis chain, which can be confusing.

Oh, and Solr uses the internal Lucene doc ID to break ties. Docs
on different replicas can have different internal Lucene doc IDs
relative to each other as a result of merging, so that's something else
to watch out for.
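
The tie-breaking effect can be illustrated with a small Python sketch. The internal IDs and _uid values below are invented for the example, and the sorting is only a model of the behavior described above, not Solr's implementation.

```python
def rank(docs, tie_breaker):
    """Order docs by sort value, then by the named tie-breaker field."""
    ordered = sorted(docs, key=lambda d: (d["sort_key"], d[tie_breaker]))
    return [d["event_name"] for d in ordered]


# Same two documents on two replicas; both analyze to "mba".
# Hypothetical internal doc IDs differ because segments merged differently.
replica_a = [
    {"event_name": "MB20140410A", "sort_key": "mba", "_uid": "u1", "internal_id": 0},
    {"event_name": "MB20140411A", "sort_key": "mba", "_uid": "u2", "internal_id": 1},
]
replica_b = [
    {"event_name": "MB20140410A", "sort_key": "mba", "_uid": "u1", "internal_id": 1},
    {"event_name": "MB20140411A", "sort_key": "mba", "_uid": "u2", "internal_id": 0},
]

# Breaking ties on the internal ID gives replica-dependent order.
print(rank(replica_a, "internal_id"))
print(rank(replica_b, "internal_id"))

# Breaking ties on a value that is identical across replicas is stable.
print(rank(replica_a, "_uid"))
print(rank(replica_b, "_uid"))
```

In Solr terms, one way to get the second behavior would be to add an explicit secondary sort on a unique field, for example sort=event_name_sort asc,_uid asc, assuming _uid is the unique key in this schema.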

Best,
Erick

On Wed, Apr 30, 2014 at 1:06 PM, Francois Perron
<Francois.Perron@ticketmaster.com> wrote:
> Hi guys,
>
>   I have a small SolrCloud setup (3 servers, 1 collection with 1 shard and 3 replicas).
> In my schema, I have an alphaOnlySort field with a copyField.
>
> This is a part of my managed-schema :
>
>     <field name="_root_" type="string" indexed="true" stored="false"/>
>     <field name="_uid" type="string" multiValued="false" indexed="true" required="true" stored="true"/>
>     <field name="_version_" type="long" indexed="true" stored="true"/>
>     <field name="event_id" type="string" indexed="true" stored="true"/>
>     <field name="event_name" type="text_general" indexed="true" stored="true"/>
>     <field name="event_name_sort" type="alphaOnlySort"/>
>
> with the copyfield
>
>   <copyField source="event_name" dest="event_name_sort"/>
>
>
> The problem is: I query my collection with a sort on my alphaOnlySort field, but on one of
> my servers, the sort order is not the same.
>
> On server 1 and 2, I have this result :
>
> <doc>
> <str name="event_name">MB20140410A</str>
> </doc>
> <doc>
> <str name="event_name">MB20140410A-New</str>
> </doc>
> <doc>
> <str name="event_name">MB20140411A</str>
> </doc>
>
>
>
> and on the third one, this :
>
> <doc>
> <str name="event_name">MB20140410A</str>
> </doc>
> <doc>
> <str name="event_name">MB20140411A</str>
> </doc>
> <doc>
> <str name="event_name">MB20140410A-New</str>
> </doc>
>
>
> The doc named "MB20140411A" should be at the end ...
>
> Any idea ?
>
> Regards
