Date: Fri, 2 May 2014 09:07:08 -0700
Subject: Re: RE : Shards don't return documents in same order
From: Erick Erickson
To: solr-user@lucene.apache.org

Francois:

Yes, there are several ways to examine the raw terms in the index:

> The admin/schema-browser page
> TermsComponent: https://cwiki.apache.org/confluence/display/solr/The+Terms+Component
> Luke

The schema-browser is all set up for you, so it's the easiest. The TermsComponent should be directly usable too; I believe it's configured by default in solrconfig.xml. Luke takes a bit of setup but is a great tool.

Did you re-index from scratch on all shards? I presume your ordering is still not the same on all shards... The order I'd expect would be:

mb20140410a
mb20140410anew
mb20140411a

Best,
Erick

On Thu, May 1, 2014 at 8:27 AM, Francois Perron wrote:
> Hi Erick,
>
> thank you for your response. You are right, I changed alphaOnlySort to keep letters and numbers and to remove some articles (a, an, the).
>
> This is the fieldType definition:
>
> [fieldType XML not preserved in the archived message]
>
> Then I tested each name with the admin UI on each server, and these are the results:
>
> server1
>
> MB20140410A = mb20140410a
> MB20140411A = mb20140411a
> MB20140410A-New = mb20140410anew
>
> server2
>
> MB20140410A = mb20140410a
> MB20140411A = mb20140411a
> MB20140410A-New = mb20140410anew
>
> server3
>
> MB20140410A = mb20140410a
> MB20140411A = mb20140411a
> MB20140410A-New = mb20140410anew
>
> "Unfortunately", all results are identical, so is there a way to view the data actually indexed in these documents? Could it be a problem with a particular server? All configs are in ZooKeeper, so all cores should have the same config, right? Is there any way to force a replica to resynchronize?
>
> Regards,
>
> Francois.
>
> ________________________________________
> From: Erick Erickson [erickerickson@gmail.com]
> Sent: April 30, 2014 16:36
> To: solr-user@lucene.apache.org
> Subject: Re: Shards don't return documents in same order
>
> Hmmm, take a look at the admin/analysis page for these inputs for
> alphaOnlySort. If you're using the stock Solr distro, you're probably
> not considering the effects of patternReplaceFilterFactory, which is
> removing all non-letters. So these three terms reduce to
>
> mba
> mba
> mbanew
>
> You can look at the actual indexed terms via the admin/schema-browser as well.
>
> That said, unless you transposed the order because you were
> concentrating on the numeric part, the doc with MB20140410A-New should
> always be sorting last.
>
> All of which is irrelevant if you're doing something else with
> "alphaOnlySort", so please paste in the fieldType definition if you've
> changed it.
>
> What gets returned in the doc for _stored_ data is a verbatim copy,
> NOT the output of the analysis chain, which can be confusing.
>
> Oh, and Solr uses the internal Lucene doc ID to break ties, and docs
> on different replicas can have different internal Lucene doc IDs
> relative to each other as a result of merging, so that's something else
> to watch out for.
>
> Best,
> Erick
>
> On Wed, Apr 30, 2014 at 1:06 PM, Francois Perron wrote:
>> Hi guys,
>>
>> I have a small SolrCloud setup (3 servers, 1 collection with 1 shard and 3 replicas). In my schema, I have an alphaOnlySort field with a copyField.
>>
>> This is a part of my managed-schema:
>>
>> [field and fieldType XML not preserved in the archived message]
>>
>> with the copyField
>>
>> [copyField XML not preserved in the archived message]
>>
>> The problem is: I query my collection with a sort on my alphasort field, but on one of my servers the sort order is not the same.
>>
>> On servers 1 and 2, I have this result:
>>
>> MB20140410A
>> MB20140410A-New
>> MB20140411A
>>
>> and on the third one, this:
>>
>> MB20140410A
>> MB20140411A
>> MB20140410A-New
>>
>> The doc named "MB20140411A" should be at the end ...
>>
>> Any idea?
>>
>> Regards
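
The two normalizations discussed in this thread can be compared with a small Python sketch. It is only an approximation of the analysis chain, not Solr code: the helper name sort_key and both regular expressions are assumptions made for illustration, and the article removal (a, an, the) mentioned above is left out because the actual fieldType definition is not available here.

```python
import re

def sort_key(value, pattern):
    """Roughly mimic the sort analysis: trim, lowercase, then drop
    every character matched by `pattern` (an assumed regex)."""
    return re.sub(pattern, "", value.strip().lower())

names = ["MB20140411A", "MB20140410A-New", "MB20140410A"]

# Modified chain described in the thread: keep letters and digits.
alnum_keys = {n: sort_key(n, r"[^a-z0-9]") for n in names}
alnum_order = sorted(names, key=lambda n: alnum_keys[n])

# Stock alphaOnlySort as described in the thread: keep letters only.
alpha_keys = {n: sort_key(n, r"[^a-z]") for n in names}

print(alnum_order)  # order under the letters-and-digits keys
print(alpha_keys)   # keys under the letters-only pattern
```

With the letters-and-digits keys the three names have distinct sort values, so the order is fully determined. With the letters-only keys two of the names collapse to the same value, which is the case where a tie-breaker such as the internal Lucene doc ID would decide the order.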