Date: Tue, 7 Feb 2012 17:20:59 +0000 (UTC)
From: "Dalius (Updated) (JIRA)"
To: dev@lucene.apache.org
Subject: [jira] [Updated] (SOLR-3106) Wildcard ? issue

     [ https://issues.apache.org/jira/browse/SOLR-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dalius updated SOLR-3106:
-------------------------

    Description:
Sorry for the inaccurate title.

I have three fields (dc_title, dc_title_unicode, dc_title_unicode_full) containing the same value:
{code}
<title xmlns="http://www.tei-c.org/ns/1.0">cal•lígraf</title>
{code}
These fields are configured as follows:
{code}
<fieldType name="xml" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ICUFoldingFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ICUFoldingFilterFactory"/>
  </analyzer>
</fieldType>

<fieldType name="xml_unicode" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

<fieldType name="xml_unicode_full" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
{code}
And finally, my search configuration:
{code}
<requestHandler name="dictionary" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">all</str>
    <str name="defType">edismax</str>
    <str name="mm">2&lt;-25%</str>
    <str name="qf">dc_title_unicode_full^2 dc_title_unicode^2 dc_title</str>
    <int name="rows">10</int>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.count">1</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
{code}
I am trying to match the field with various (valid) search phrases. Here are the results:
|| \# || search phrase || match? || Comment ||
| 1 | cal•lígra? | (/) | |
| 2 | cal•ligra? | (x) | Changed í to i |
| 3 | cal•ligraf | (/) | |
| 4 | calligra? | (/) | |
The problem is attempt #2, which fails to match the data. Attempt #3 works when ? is replaced with f.
One more thing: if * is used instead of ?, other data such as cal•lígrafia is matched, but not cal•lígraf.

> Wildcard ? issue
> ----------------
>
>                 Key: SOLR-3106
>                 URL: https://issues.apache.org/jira/browse/SOLR-3106
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 3.5
>         Environment: Tomcat 7.0.25 (request encoding UTF-8)
>                      Solr 3.5.0
>                      Java 7 Oracle
>                      Ubuntu 11.10
>            Reporter: Dalius
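For anyone trying to reproduce the four attempts in the results table, the following is a minimal sketch that builds the request URLs with the search phrases percent-encoded as UTF-8, matching the "request encoding UTF-8" setting in the environment. The host, port, and webapp path are assumptions and do not come from the report. The sketch also assumes the "dictionary" handler is selected through the qt parameter on /select. Only the handler name, the phrases, and the reported outcomes are taken from the description.

```python
from urllib.parse import urlencode

# Hypothetical base URL: host, port, and path are assumptions for illustration.
BASE_URL = "http://localhost:8080/solr/select"

# Search phrases from the results table, paired with the outcome the reporter
# observed. \u2022 is the bullet character, \u00ed is i with an acute accent.
PHRASES = [
    ("cal\u2022l\u00edgra?", True),   # 1: matched
    ("cal\u2022ligra?", False),       # 2: not matched (accented i changed to i)
    ("cal\u2022ligraf", True),        # 3: matched
    ("calligra?", True),              # 4: matched
]


def build_url(phrase: str) -> str:
    """Return a select URL that sends the phrase to the 'dictionary' handler."""
    # urlencode percent-encodes the UTF-8 bytes of each value, so the bullet
    # becomes %E2%80%A2 and the wildcard ? becomes %3F.
    params = {"qt": "dictionary", "q": phrase}
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    for phrase, reported_match in PHRASES:
        print(build_url(phrase), "reported match:", reported_match)
```

Comparing the encoded form of attempt #2 with attempt #1 makes it easy to confirm that the only difference sent to the server is the accented character.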