From: Erik Hatcher
Subject: Re: Not able to retrieve hits for a phrase
Date: Mon, 17 Apr 2006 12:56:16 -0400
To: java-user@lucene.apache.org
Message-Id: <66987F3B-A795-4E25-8D57-F1E8639229A8@ehatchersolutions.com>

PhraseQuery needs terms that match what got indexed, simple as that.
QueryParser does this for you by using the specified analyzer on the
"phrase text" within double quotes and creating a PhraseQuery out of
the tokens. When you're creating a PhraseQuery directly with the API,
you need to be aware of how things are indexed in order to ensure that
any normalization, such as lowercasing, that occurs during indexing
also occurs on the text you're searching with.

Most frequently, to search without case sensitivity the text is
lowercased during indexing, and also during searching.
StandardAnalyzer lowercases, as do almost all analyzers you'll find in
the core (except WhitespaceAnalyzer).

	Erik

On Apr 17, 2006, at 11:33 AM, Vishal Bathija wrote:

> Hi Erik,
> Thanks, that seems to have solved the problem. Can you please
> elaborate on the kind of input PhraseQuery takes in? Am I supposed to
> add only lowercased terms to PhraseQuery? Is it possible to search for
> a phrase that is not case sensitive?
>
> Regards
> Vishal
>
> On 4/17/06, Erik Hatcher wrote:
>> Are the terms you're adding to PhraseQuery lowercased? If not, then
>> that is most likely the issue.
>>
>> Erik
>>
>>
>> On Apr 17, 2006, at 9:39 AM, Vishal Bathija wrote:
>>
>>> I currently use
>>> writer = new IndexWriter("index", new StandardAnalyzer(),true);
>>>
>>> Should I use any other analyzer? Yes, I am aware that the matches are
>>> case sensitive.
>>>
>>> Regards
>>> Vishal
>>>
>>> On 4/17/06, Erik Hatcher wrote:
>>>> This could be related to the analyzer you used during indexing. Be
>>>> aware that matches are *exact* including case.
>>>>
>>>> Erik
>>>>
>>>> On Apr 17, 2006, at 1:34 AM, Vishal Bathija wrote:
>>>>
>>>>> Hi,
>>>>> I am not able to retrieve the number of hits for a particular phrase.
>>>>> The code below retrieves the hits only for certain phrases. The code
>>>>> snippet that I use is
>>>>>
>>>>> rd= IndexReader.open("C:\\Documents and Settings\\Owner\\My Documents\\Thesis\\luceneTest\\index");
>>>>> PhraseQuery query =new PhraseQuery();
>>>>> searcher = new IndexSearcher(rd);
>>>>> Term[] phrTerm=new Term[phraseTerms.length];
>>>>> for(int u=0; u<phraseTerms.length; u++)
>>>>> {
>>>>> phrTerm[u]=new Term("contents",phraseTerms[u]);
>>>>> query.add(phrTerm[u]);
>>>>> }
>>>>>
>>>>> System.out.println("Query"+query.toString() );
>>>>> Hits hits = searcher.search(query);
>>>>> System.out.println("Number of hits :"+hits.length());
>>>>>
>>>>> Number of hits is 0 for some phrases even though the phrase is present
>>>>> in some of the documents.
>>>>>
>>>>> This retrieves the hits for certain phrases such as
>>>>>
>>>>> "avoids deadlock" but it does not work for a phrase such as
>>>>> "Prevents Data Loss"
>>>>>
>>>>> I am not sure what the problem could be as none of these phrases have
>>>>> any special characters. Do I need to use any other type of query?
>>>>>
>>>>> Regards
>>>>> Vishal
>>>>> --
>>>>> Vishal Bathija
>>>>> Graduate Student
>>>>> Department of Computer Science & Systems Analysis
>>>>> Miami University
>>>>> Oxford, Ohio
>>>>> Phone: (513)-461-9239
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
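
The normalization point in Erik's reply can be illustrated with a minimal sketch in plain Java, with no Lucene dependency. The class `PhraseTermNormalizer` and its `normalize` method are hypothetical names introduced here for illustration only. They are not part of Lucene, and they stand in for just one thing an analyzer does: lowercasing. A real analyzer such as StandardAnalyzer also tokenizes the text, so this is only an approximation of the terms that end up in the index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not a Lucene class): lowercases raw phrase words so
// they have a chance of matching terms that were lowercased at index time.
final class PhraseTermNormalizer {

    // Returns a new list holding the lowercased form of each input word.
    // Assumption: lowercasing is the only normalization that matters here.
    static List<String> normalize(String[] rawTerms) {
        List<String> normalized = new ArrayList<>();
        for (String term : rawTerms) {
            // Locale.ROOT keeps the result independent of the default locale.
            normalized.add(term.toLowerCase(Locale.ROOT));
        }
        return normalized;
    }

    public static void main(String[] args) {
        // The phrase from the original question that returned zero hits.
        String[] phraseTerms = {"Prevents", "Data", "Loss"};
        System.out.println(normalize(phraseTerms)); // prints [prevents, data, loss]
    }
}
```

In the loop from the original question, each normalized string would then be passed to `new Term("contents", ...)` in place of the raw `phraseTerms[u]`. This also suggests why "avoids deadlock" matched while "Prevents Data Loss" did not: the first phrase was already all lowercase.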