lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Not able to retrieve hits for a phrase
Date Mon, 17 Apr 2006 16:56:16 GMT
PhraseQuery needs terms that match what got indexed, simple as that.   
QueryParser does this for you by using the specified analyzer on the  
"phrase text" within double quotes and creating a PhraseQuery out of  
the tokens.  When you're creating a PhraseQuery directly with the  
API, you need to be aware of how things are indexed in order to  
ensure that any normalization, such as lowercasing, that occurs  
during indexing also occurs on the text you're searching with.
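
As a minimal sketch of that normalization step, the hypothetical helper below (not part of Lucene) lowercases each phrase term before it would be wrapped in a Term and added to the PhraseQuery. It assumes lowercasing is the only normalization applied at index time; an analyzer that also drops stop words or splits tokens differently would need more than this.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; it is not a Lucene class.
class PhraseTermNormalizer {

    // Lowercase every term so it matches text that was lowercased at index time.
    static String[] normalize(String[] phraseTerms) {
        String[] normalized = new String[phraseTerms.length];
        for (int i = 0; i < phraseTerms.length; i++) {
            normalized[i] = phraseTerms[i].toLowerCase(Locale.ROOT);
        }
        return normalized;
    }

    public static void main(String[] args) {
        String[] terms = normalize(new String[] {"Prevents", "Data", "Loss"});
        // Each normalized string is what would be passed to new Term("contents", ...).
        System.out.println(String.join(" ", terms)); // prints "prevents data loss"
    }
}
```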

Most frequently, to search without case sensitivity the text is  
lowercased during indexing, and also during searching.   
StandardAnalyzer lowercases, as do almost all analyzers you'll find  
in the core (except WhitespaceAnalyzer).
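
To illustrate why both sides need the same treatment, here is a small stand-alone sketch that does not use Lucene at all. The analyze method is an assumed stand-in for a lowercasing analyzer (whitespace split plus lowercase), and containsPhrase mimics an exact, case-sensitive match on consecutive terms. The sample sentence is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Illustration only: a toy model of exact phrase matching, not Lucene code.
class LowercasePhraseDemo {

    // Stand-in for a lowercasing analyzer: split on whitespace, lowercase each token.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // True when the phrase terms occur consecutively, compared exactly (case included).
    static boolean containsPhrase(List<String> indexedTokens, List<String> phraseTerms) {
        return Collections.indexOfSubList(indexedTokens, phraseTerms) >= 0;
    }

    public static void main(String[] args) {
        List<String> indexed = analyze("The patch Prevents Data Loss");

        // Raw terms keep their capitals, so the exact match fails.
        List<String> raw = Arrays.asList("Prevents", "Data", "Loss");
        System.out.println(containsPhrase(indexed, raw)); // prints "false"

        // Terms run through the same normalization as the indexed text match.
        List<String> normalized = analyze("Prevents Data Loss");
        System.out.println(containsPhrase(indexed, normalized)); // prints "true"
    }
}
```

A phrase that is already lowercase, such as "avoids deadlock", matches either way in this model, which fits the behaviour described in the original question.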

	Erik


On Apr 17, 2006, at 11:33 AM, Vishal Bathija wrote:

> Hi Erik,
> Thanks, that seems to have solved the problem. Can you please
> elaborate on the kind of input PhraseQuery takes? Am I supposed to
> add only lowercased terms to PhraseQuery? Is it possible to search for
> a phrase in a way that is not case sensitive?
>
> Regards
> Vishal
>
> On 4/17/06, Erik Hatcher <erik@ehatchersolutions.com> wrote:
>> Are the terms you're adding to PhraseQuery lowercased?  If not, then
>> that is most likely the issue.
>>
>>        Erik
>>
>>
>> On Apr 17, 2006, at 9:39 AM, Vishal Bathija wrote:
>>
>>> I currently use
>>> writer = new IndexWriter("index", new StandardAnalyzer(),true);
>>>
>>> Should I use any other analyzer? Yes, I am aware that the matches are
>>> case sensitive.
>>>
>>> Regards
>>> Vishal
>>>
>>> On 4/17/06, Erik Hatcher <erik@ehatchersolutions.com> wrote:
>>>> This could be related to the analyzer you used during indexing.  Be
>>>> aware that matches are *exact* including case.
>>>>
>>>>        Erik
>>>>
>>>> On Apr 17, 2006, at 1:34 AM, Vishal Bathija wrote:
>>>>
>>>>> Hi,
>>>>> I am not able to retrieve the number of hits for a particular phrase.
>>>>> The code below retrieves the hits only for certain phrases. The code
>>>>> snippet that I use is
>>>>>
>>>>> rd = IndexReader.open("C:\\Documents and Settings\\Owner\\My Documents\\Thesis\\luceneTest\\index");
>>>>> PhraseQuery query = new PhraseQuery();
>>>>> searcher = new IndexSearcher(rd);
>>>>> Term[] phrTerm = new Term[phraseTerms.length];
>>>>> for (int u = 0; u < phraseTerms.length; u++) {
>>>>>     phrTerm[u] = new Term("contents", phraseTerms[u]);
>>>>>     query.add(phrTerm[u]);
>>>>> }
>>>>>
>>>>> System.out.println("Query" + query.toString());
>>>>> Hits hits = searcher.search(query);
>>>>> System.out.println("Number of hits :" + hits.length());
>>>>>
>>>>> Number of hits is 0 for some phrases even though the phrase is present
>>>>> in some of the documents.
>>>>>
>>>>> This retrieves the hits for certain phrases such as
>>>>>
>>>>> "avoids deadlock" but it does not work for a phrase such as
>>>>> "Prevents Data Loss"
>>>>>
>>>>>
>>>>> I am not sure what the problem could be, as none of these phrases have
>>>>> any special characters.  Do I need to use any other type of query?
>>>>>
>>>>>
>>>>> Regards
>>>>> Vishal
>>>>> --
>>>>> Vishal Bathija
>>>>> Graduate Student
>>>>> Department of Computer Science & Systems Analysis
>>>>> Miami University
>>>>> Oxford,Ohio
>>>>> Phone: (513)-461-9239
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>>
>
>

