Date: Wed, 9 Nov 2011 12:33:02 -0500
Subject: Re: Weird: Solr Search result and Analysis Result not match?
From: Erick Erickson
To: solr-user@lucene.apache.org, elleryleung@be-o.com

Regarding <1>. Take a look at admin/analysis and see the tokenization
just to check.

Oh, and one more thing... putting in front of kind of defeats the purpose of
WordDelimiterFilterFactory. One of the things WDDF does is split on case
change and you're removing the case changes before WDDF gets hold of it.

Best
Erick

On Tue, Nov 8, 2011 at 9:40 PM, Ellery Leung wrote:
> Thanks Erick, here are my responses:
>
> 1. Yes. What I want to achieve is that when the index is filtered with
>    EdgeNgram, and the query is not filtered in that way, I can search on
>    a partial string.
> 2. Good suggestion, will test it.
> 3. ok
> 4. Thank you
> 5/6. Will remove the synonyms and WordDelimiterFilterFactory in query
> 7. Will look at that using Luke. By the way, it is the first time I
>    saw that there is a tool for that. Thank you.
> 8. Yes.
>
> Will check that again, thank you.
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerickson@gmail.com]
> Sent: November 8, 2011 9:52 PM
> To: solr-user@lucene.apache.org; elleryleung@be-o.com
> Subject: Re: Weird: Solr Search result and Analysis Result not match?
>
> Several things:
>
> 1> You don't have EdgeNGramFilterFactory in your query analysis chain,
>     is this intentional?
> 2> You have a LOT of stuff going on here, you might try making your
>     analysis chain simpler and adding stuff back in until you see the
>     error.
>     Don't forget to re-index!
> 3> Analysis doesn't take into account query *parsing*, so it's
>     possible to get a false sense of assurance when the analysis page
>     matches your expectations.
> 4> Even though nothing jumps out at me except the Edge.... factory,
>     nice job of including information.
> 5> It's unusual to expand synonyms both at query and index time,
>     usually one or the other with index time preferred.
> 6> Same with WordDelimiterFilterFactory. If you put all the variants
>     in the index, you don't need to put all the variants in the query
>     and vice-versa.
> 7> Take a look at your actual contents, perhaps using Luke to ensure
>     that what you expect to be in your index actually is.
> 8> You did re-index after your latest changes to your schema, right?
>
> All of this is a way of saying that I don't quite see what the problem
> is, but at least there are some avenues to explore.
>
> Best
> Erick
>
> On Mon, Nov 7, 2011 at 9:29 PM, Ellery Leung wrote:
>> Hi all.
>>
>> I am using Solr 3.4 under Win 7.
>>
>> In schema there is a multivalue field indexed in this way:
>>
>> ==========================
>> Schema:
>> ==========================
>>
>> stored="true" omitNorms="true"/>
>>
>> positionIncrementGap="100">
>>
>>         mapping="../../filters/filter-mappings.txt"/>
>>
>>         synonyms="../../filters/filter-synonyms.txt" ignoreCase="true"
>>         expand="true"/>
>>
>>         splitOnCaseChange="1" splitOnNumerics="1" stemEnglishPossessive="1"
>>         generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>         catenateNumbers="1" catenateAll="0" preserveOriginal="1"/>
>>
>>         encoder="DoubleMetaphone" inject="true"/>
>>
>>         maxGramSize="50" side="front"/>
>>
>>         mapping="../../filters/filter-mappings.txt"/>
>>
>>         synonyms="../../filters/filter-synonyms.txt" ignoreCase="true"
>>         expand="true"/>
>>
>>         splitOnCaseChange="1" splitOnNumerics="1" stemEnglishPossessive="1"
>>         generateWordParts="0" generateNumberParts="1" catenateWords="1"
>>         catenateNumbers="1" catenateAll="0" preserveOriginal="1"/>
>>
>>         encoder="DoubleMetaphone"/>
>>
>> ==========================
>> Actual index:
>> ==========================
>>
>> 2284e2
>> 2284e4
>> 2284e5
>> 1911e2
>>
>> ==========================
>> Question:
>> ==========================
>>
>> Now when I do a search like this:
>>
>> myEvent:1911e2
>>
>> This should match the 4th item. Now on "Full Interface", it does not
>> return any result. But on "analysis", matches are highlighted.
>>
>> By using Debug: the parsedquery is:
>>
>> MultiPhraseQuery(myEvent:"(1911e2 1911) (A e) 2")
>>
>> Parsedquery_toString:
>>
>> myEvent:"(1911e2 1911) (A e) 2"
>>
>> Can anyone please help me on this?
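
For anyone following the thread, here is a rough Python sketch of the two behaviours discussed above: splitting a token on case changes and letter/digit transitions (as splitOnCaseChange="1" and splitOnNumerics="1" request), and front edge n-grams. It is a simplified stand-in, not Solr's implementation; the function names split_word_parts and edge_ngrams are made up for illustration, and min_gram is an assumption because the minGramSize value did not survive in the schema quoted above.

```python
import re


def split_word_parts(token, split_on_case_change=True, split_on_numerics=True):
    """Hypothetical, simplified model of WordDelimiterFilter-style splitting.

    Requires Python 3.7+, where re.split() splits on zero-width matches.
    """
    boundaries = []
    if split_on_case_change:
        # Boundary between a lowercase letter and an uppercase letter.
        boundaries.append(r"(?<=[a-z])(?=[A-Z])")
    if split_on_numerics:
        # Boundaries between letters and digits, in either direction.
        boundaries.append(r"(?<=[A-Za-z])(?=[0-9])")
        boundaries.append(r"(?<=[0-9])(?=[A-Za-z])")
    if not boundaries:
        return [token]
    return [part for part in re.split("|".join(boundaries), token) if part]


def edge_ngrams(token, min_gram=1, max_gram=50):
    """Front edge n-grams; min_gram=1 is an assumed value, max_gram follows maxGramSize."""
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


if __name__ == "__main__":
    print(split_word_parts("1911e2"))             # parts at the digit/letter transitions
    print(split_word_parts("PowerShot"))          # case change still visible
    print(split_word_parts("PowerShot".lower()))  # lowercased first: nothing left to split
    print(edge_ngrams("1911e2"))
```

In this sketch "1911e2" comes apart as 1911 / e / 2, which lines up with the three positions in the parsedquery shown in the original message, and lowercasing a mixed-case token before splitting leaves no case change to split on. How Solr assigns positions to those parts, and what the phonetic filter adds, is not modelled here.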