From: java8964 java8964
Reply-To: java-user@lucene.apache.org
Subject: RE: During the wild card search, will lucene 2.9.0 to convert the search string to lower case?
Date: Wed, 3 Feb 2010 10:07:15 -0500

Thanks for your help.

My concern now is that the field may also be defined as stored. When the user retrieves the field data, we still want to show the original data, which is in upper case here.

First, I don't think I can use queryParser.setLowercaseExpandedTerms(false), because that would remove the case-insensitive wildcard search for tokenized fields.

To handle the case where the data is NOT tokenized but contains upper-case letters, and still allow a wildcard search with upper-case letters such as 'BB*', I think I have to analyze the non-tokenized data with a KeywordTokenizer and also lowercase it.

With your suggestion, will the data be changed to lower case in the Lucene index, so that it comes back in lower case when it is retrieved?

Thanks

> From: uwe@thetaphi.de
> To: java-user@lucene.apache.org
> Subject: RE: During the wild card search, will lucene 2.9.0 to convert the search string to lower case?
> Date: Wed, 3 Feb 2010 11:17:27 +0100
>
> For specific fields using a special TokenStream chain, there is no need to write a separate analyzer. You can add fields to a document using a TokenStream as parameter: new Field(name, TokenStream).
>
> Just create the TokenStream as a chain of a Tokenizer and all filters, like:
>
> TokenStream ts = new KeywordTokenizer(new StringReader("your text to index"));
> ts = new LowerCaseFilter(ts);
> ...
> document.add(new Field("fieldname", ts));
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>
> > -----Original Message-----
> > From: Ian Lea [mailto:ian.lea@gmail.com]
> > Sent: Wednesday, February 03, 2010 11:06 AM
> > To: java-user@lucene.apache.org
> > Subject: Re: During the wild card search, will lucene 2.9.0 to convert
> > the search string to lower case?
> >
> > I think you'll have to write your own. Or just downcase the text
> > yourself first.
> >
> >
> > --
> > Ian.
> >
> >
> > On Tue, Feb 2, 2010 at 9:30 PM, java8964 java8964 wrote:
> > >
> > > Is there an analyzer like the keyword analyzer that also lowercases the data in Lucene? Or do I have to write a custom analyzer myself?
> > >
> > > Thanks
> > >
> > >> From: java8964@hotmail.com
> > >> To: java-user@lucene.apache.org
> > >> Subject: RE: During the wild card search, will lucene 2.9.0 to convert the search string to lower case?
> > >> Date: Mon, 1 Feb 2010 14:24:00 -0500
> > >>
> > >>
> > >> This may be what I am looking for. We are using the default value, which is true.
> > >>
> > >> Let me examine this method more.
> > >>
> > >> Thanks for your help.
> > >>
> > >> > From: digydigy@gmail.com
> > >> > To: java-user@lucene.apache.org
> > >> > Subject: RE: During the wild card search, will lucene 2.9.0 to convert the search string to lower case?
> > >> > Date: Mon, 1 Feb 2010 20:36:29 +0200
> > >> >
> > >> > Did you try queryParser.setLowercaseExpandedTerms(false)?
> > >> >
> > >> > DIGY
> > >> >
> > >> > -----Original Message-----
> > >> > From: java8964 java8964 [mailto:java8964@hotmail.com]
> > >> > Sent: Monday, February 01, 2010 8:11 PM
> > >> > To: java-user@lucene.apache.org
> > >> > Subject: RE: During the wild card search, will lucene 2.9.0 to convert the search string to lower case?
> > >> >
> > >> >
> > >> > I would like to confirm your reply. You mean that the query parser does the lower casing. In fact, it looks like it only does this for wildcard queries, right?
> > >> >
> > >> > For a term query, it does not. This is shown if you change the line to:
> > >> >
> > >> > Query query = new QueryParser("title", wrapper).parse("title:\"BBB CCC\"");
> > >> >
> > >> > You will get 1 hit back. So in this case, the query parser class treats term queries and wildcard queries differently.
> > >> >
> > >> > We have to use the query parser in this case, but we have our own query parser class that extends the Lucene query parser class. Is there anything we can do about it?
> > >> >
> > >> > Will Lucene's query parser class be fixed for the inconsistent behavior above?
> > >> >
> > >> > Thanks
> > >> >
> > >> >
> > >> > > From: uwe@thetaphi.de
> > >> > > To: java-user@lucene.apache.org
> > >> > > Subject: RE: During the wild card search, will lucene 2.9.0 to convert the search string to lower case?
> > >> > > Date: Mon, 1 Feb 2010 17:41:08 +0100
> > >> > >
> > >> > > Only the query parser does the lower casing. For such a special case, I would suggest using a PrefixQuery or WildcardQuery directly and not the query parser.
> > >> > >
> > >> > > -----
> > >> > > Uwe Schindler
> > >> > > H.-H.-Meier-Allee 63, D-28213 Bremen
> > >> > > http://www.thetaphi.de
> > >> > > eMail: uwe@thetaphi.de
> > >> > >
> > >> > > > -----Original Message-----
> > >> > > > From: java8964 java8964 [mailto:java8964@hotmail.com]
> > >> > > > Sent: Monday, February 01, 2010 5:27 PM
> > >> > > > To: java-user@lucene.apache.org
> > >> > > > Subject: During the wild card search, will lucene 2.9.0 to convert the search string to lower case?
> > >> > > >
> > >> > > >
> > >> > > > I noticed a strange result from the following test case. For a wildcard search, my understanding is that Lucene will NOT use any analyzer on the query string. But as the following simple code shows, it looks like Lucene lowercases the search query in a wildcard search. Why? If not, why does the following test case show one hit for the lower-case wildcard search, but none for the upper-case data? My original data is NOT analyzed, so it should be stored as the original data in the index segment, right?
> > >> > > >
> > >> > > > Lucene version: 2.9.0
> > >> > > >
> > >> > > > JDK version: JDK 1.6.0_17
> > >> > > >
> > >> > > >
> > >> > > > import org.apache.lucene.analysis.KeywordAnalyzer;
> > >> > > > import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
> > >> > > > import org.apache.lucene.analysis.standard.StandardAnalyzer;
> > >> > > > import org.apache.lucene.document.Document;
> > >> > > > import org.apache.lucene.document.Field;
> > >> > > > import org.apache.lucene.index.IndexWriter;
> > >> > > > import org.apache.lucene.queryParser.QueryParser;
> > >> > > > import org.apache.lucene.search.IndexSearcher;
> > >> > > > import org.apache.lucene.search.Query;
> > >> > > > import org.apache.lucene.store.Directory;
> > >> > > > import org.apache.lucene.store.RAMDirectory;
> > >> > > > import org.apache.lucene.util.Version;
> > >> > > >
> > >> > > > public class IndexTest1 {
> > >> > > >     public static void main(String[] args) {
> > >> > > >         try {
> > >> > > >             // Index two documents whose "title" is NOT analyzed
> > >> > > >             Directory directory = new RAMDirectory();
> > >> > > >             IndexWriter writer = new IndexWriter(directory,
> > >> > > >                     new StandardAnalyzer(Version.LUCENE_CURRENT),
> > >> > > >                     IndexWriter.MaxFieldLength.UNLIMITED);
> > >> > > >             Document doc = new Document();
> > >> > > >             doc.add(new Field("title", "BBB CCC", Field.Store.YES,
> > >> > > >                     Field.Index.NOT_ANALYZED));
> > >> > > >             writer.addDocument(doc);
> > >> > > >             doc = new Document();
> > >> > > >             doc.add(new Field("title", "ddd eee", Field.Store.YES,
> > >> > > >                     Field.Index.NOT_ANALYZED));
> > >> > > >             writer.addDocument(doc);
> > >> > > >
> > >> > > >             writer.close();
> > >> > > >
> > >> > > >             // Search with a KeywordAnalyzer on "title"
> > >> > > >             IndexSearcher searcher = new IndexSearcher(directory, true);
> > >> > > >             PerFieldAnalyzerWrapper wrapper = new PerFieldAnalyzerWrapper(
> > >> > > >                     new StandardAnalyzer(Version.LUCENE_CURRENT));
> > >> > > >             wrapper.addAnalyzer("title", new KeywordAnalyzer());
> > >> > > >             // Upper-case wildcard query
> > >> > > >             Query query = new QueryParser("title", wrapper).parse("title:BBB*");
> > >> > > >             System.out.println("hits of title = " +
> > >> > > >                     searcher.search(query, 100).totalHits);
> > >> > > >             // Lower-case wildcard query
> > >> > > >             query = new QueryParser("title", wrapper).parse("title:ddd*");
> > >> > > >             System.out.println("hits of title = " +
> > >> > > >                     searcher.search(query, 100).totalHits);
> > >> > > >             searcher.close();
> > >> > > >         } catch (Exception e) {
> > >> > > >             System.out.println(e);
> > >> > > >         }
> > >> > > >     }
> > >> > > > }
> > >> > > >
> > >> > > > The output:
> > >> > > > hits of title = 0
> > >> > > > hits of title = 1
> > >> > > >
> > >> > >
> > >> > > ---------------------------------------------------------------------
> > >> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > >> > > For additional commands, e-mail: java-user-help@lucene.apache.org
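
The behaviour discussed in this thread can be modeled without Lucene. The sketch below is plain Java and uses no Lucene API: the class WildcardCaseSketch, the Doc holder, and the methods indexAsIs, indexLowercased and prefixSearch are all hypothetical names made up for illustration. It assumes, as the thread describes, that the query parser lowercases the prefix of a wildcard query by default, that an un-analyzed field is indexed as one unchanged term, and that the stored value of a field is kept separately from its indexed term. As far as I know, a Field built from a TokenStream is indexed but not stored, so keeping the original upper-case text would need a separate stored field; treat that as an assumption to verify against the Lucene 2.9 javadoc.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Plain-Java model of the thread's scenario; NOT Lucene code.
class WildcardCaseSketch {

    /** Hypothetical stand-in for a document: one indexed term plus the stored value. */
    static final class Doc {
        final String indexedTerm;
        final String storedValue;

        Doc(String indexedTerm, String storedValue) {
            this.indexedTerm = indexedTerm;
            this.storedValue = storedValue;
        }
    }

    /** Models an un-analyzed field: the term is indexed exactly as given. */
    static Doc indexAsIs(String value) {
        return new Doc(value, value);
    }

    /** Models a keyword tokenizer plus lowercase filter: lowercased term, original text stored. */
    static Doc indexLowercased(String value) {
        return new Doc(value.toLowerCase(Locale.ROOT), value);
    }

    /** Models a prefix query; the flag mimics lowercasing of expanded terms by the parser. */
    static List<String> prefixSearch(List<Doc> index, String prefix, boolean lowercaseExpandedTerms) {
        String p = lowercaseExpandedTerms ? prefix.toLowerCase(Locale.ROOT) : prefix;
        List<String> hits = new ArrayList<>();
        for (Doc d : index) {
            if (d.indexedTerm.startsWith(p)) {
                hits.add(d.storedValue); // what the user sees is the stored value
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Doc> asIs = Arrays.asList(indexAsIs("BBB CCC"), indexAsIs("ddd eee"));
        List<Doc> lowered = Arrays.asList(indexLowercased("BBB CCC"), indexLowercased("ddd eee"));

        // Un-analyzed index, parser lowercases the prefix: "bbb" does not match "BBB CCC".
        System.out.println(prefixSearch(asIs, "BBB", true));
        // Same index, lower-case data matches.
        System.out.println(prefixSearch(asIs, "ddd", true));
        // Lowercasing switched off: the upper-case prefix matches again.
        System.out.println(prefixSearch(asIs, "BBB", false));
        // Lowercased terms with the original stored: match, and the original case is returned.
        System.out.println(prefixSearch(lowered, "BBB", true));
    }
}
```

The first two calls mirror the 0 and 1 hit counts reported in the original test case. The last call shows the design the thread converges on: lowercase only what is indexed, and return the stored original to the user.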