lucene-java-user mailing list archives

From vanshi <nilu.tha...@gmail.com>
Subject Re: No hits while searching!
Date Mon, 01 Jun 2009 17:31:05 GMT

Thanks Matt & Sithu. Yes, it was due to the stop word analyzer...now I'm using a
simple analyzer temporarily, though I know even the simple analyzer cannot handle
quotes in names. However, can somebody please direct me towards how to handle
quotes within a name in the query using a lowercase analyzer?
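
For illustration, here is a rough sketch in plain Java (not Lucene code; the class and method names are made up) of why a letter-based tokenizer, which is what the simple analyzer uses, breaks a name like O'Brien apart, while treating the whole value as one lowercased token keeps the quote. I'm assuming the quote in question is an apostrophe inside a name, and that the same normalization would be applied at both index time and query time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class NameNormalizationSketch {

    // Mimics letter-based tokenizing: split at every non-letter, lowercase the rest.
    static List<String> letterTokens(String name) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Hypothetical whole-value normalization: keep punctuation, trim, lowercase.
    static String exactKey(String name) {
        return name.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(letterTokens("O'Brien")); // prints [o, brien]
        System.out.println(exactKey("O'Brien"));     // prints o'brien
    }
}
```

In Lucene terms the second method would probably correspond to a keyword-style tokenizer followed by a lowercase filter on the exact-match fields, but I have not verified that against this index.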

thanks,
Vanshi
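
To make the stop word effect concrete: the log quoted below shows an empty first-name clause, +(), for the inputs a and an, while ang goes through. The plain-Java sketch here (no Lucene dependency; the stop list is a small illustrative subset and the method names are hypothetical) reproduces that pattern, assuming the query builder runs the input through a stop-word-removing analyzer before building the prefix clause.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class StopWordPrefixSketch {

    // Illustrative subset of common English stop words, not a complete list.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "the", "of", "to", "in"));

    // Lowercase, split at non-letters, and drop stop words.
    static List<String> analyze(String text) {
        List<String> kept = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    // Builds a required prefix clause; the clause is empty if analysis removed everything.
    static String prefixClause(String field, String input) {
        List<String> tokens = analyze(input);
        String inner = tokens.isEmpty() ? "" : field + ":" + tokens.get(0) + "*";
        return "+(" + inner + ")";
    }

    public static void main(String[] args) {
        System.out.println(prefixClause("FIRST_NAME_EXACT", "ang")); // prints +(FIRST_NAME_EXACT:ang*)
        System.out.println(prefixClause("FIRST_NAME_EXACT", "a"));   // prints +()
        System.out.println(prefixClause("FIRST_NAME_EXACT", "an"));  // prints +()
    }
}
```

Since an empty required clause can never match, one option is to skip adding the clause when analysis leaves no tokens; another is to avoid stop word removal on name fields altogether.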

Matthew Hall-7 wrote:
> 
> Yeah, he's gotta be.
> 
> You might be better off using something like a lowercase analyzer here, 
> since punctuation in a name is likely important.
> 
> Matt
> 
> Sudarsan, Sithu D. wrote:
>>  
>>
>> Do you use stopword filtering?
>>
>> Sincerely,
>> Sithu D Sudarsan
>>
>> -----Original Message-----
>> From: vanshi [mailto:nilu.thakur@gmail.com] 
>> Sent: Monday, June 01, 2009 11:39 AM
>> To: java-user@lucene.apache.org
>> Subject: Re: No hits while searching!
>>
>>
>> Thanks Erick, I was able to get this to work...as you said, Luke is a great
>> tool to look into what gets stored in the index, though in my case I was
>> searching before the indexes were created, so I was getting zero hits.
>>
>> On a side note, I'm seeing strange output with the prefix query...it only
>> works when I have 3 or more letters in the first name/last name. Any idea
>> what is going on here? Please see the output from the log here.
>>
>> 02:05:20,996 INFO  [PhysicianQueryBuilder] Entered addTypeSpecificTerms in PhysicianQuerybuilder with exactName=true
>> 02:05:20,996 INFO  [PhysicianQueryBuilder] Before running Prefix query, First name: ang
>> 02:05:20,996 INFO  [PhysicianQueryBuilder] Before running  Prefix query, Last name: john
>> 02:05:21,012 INFO  [LuceneIndexService] the query is: +(FIRST_NAME_EXACT:ang*) +(LAST_NAME_EXACT:john*)
>> 02:05:21,012 INFO  [LuceneIndexService] Result Size: 1
>>
>> 02:06:03,578 INFO  [PhysicianQueryBuilder] Entered addTypeSpecificTerms in PhysicianQuerybuilder with exactName=true
>> 02:06:03,578 INFO  [PhysicianQueryBuilder] Before running term query, First name: a
>> 02:06:03,578 INFO  [PhysicianQueryBuilder] Before running term query, Last name: johns
>> 02:06:03,578 INFO  [LuceneIndexService] the query is: +() +(LAST_NAME_EXACT:johns*)
>> 02:06:03,578 INFO  [LuceneIndexService] Result Size: 0
>>
>> 02:08:01,548 INFO  [PhysicianQueryBuilder] Entered addTypeSpecificTerms in PhysicianQuerybuilder with exactName=true
>> 02:08:01,548 INFO  [PhysicianQueryBuilder] Before running term query, First name: an
>> 02:08:01,548 INFO  [PhysicianQueryBuilder] Before running term query, Last name: johns
>> 02:08:01,548 INFO  [LuceneIndexService] the query is: +() +(LAST_NAME_EXACT:johns*)
>> 02:08:01,580 INFO  [LuceneIndexService] Result Size: 0
>>
>> As one can see, the query works with first name=ang but not with first
>> name=a or an.
>>
>> Appreciate all your inputs.
>>
>> Vanshi
>>
>> Erick Erickson wrote:
>>   
>>> The most common issue with this kind of thing is that UN_TOKENIZED implies
>>> no case folding. So if your case differs you won't get a match.
>>>
>>> That aside, the very first thing I'd do is get a copy of Luke (google
>>> Lucene Luke) and examine the index to see if what's in your index is what
>>> you *think* is in there.
>>>
>>> The second thing I'd do is look at query.toString() to see what the actual
>>> query is. You can even paste the output of toString() into Luke and see
>>> what happens.
>>>
>>> I'm not sure what buildMultiTermPrefixQuery is all about, but I assume
>>> you have a good reason for using that. But the other strategy I use for
>>> this kind of "what happened?" question is to peel back to simpler cases
>>> until I get what I expect, then build back up until it breaks.....
>>>
>>> But really get a copy of Luke, it's a wonderful tool that'll give you lots
>>> of insight about what's *really* going on...
>>>
>>> Best
>>> Erick
>>>
>>> On Wed, May 27, 2009 at 12:43 AM, vanshi <nilu.thakur@gmail.com> wrote:
>>>> In my web application, I need search functionality on first name and last
>>>> name in 2 different ways: one search must be based on the 'Metaphone
>>>> Analyzer', giving all similar-sounding names as results, and another search
>>>> should be an exact match on either first name or last name. The
>>>> name-sounds-like search has already been coded previously and I need to add
>>>> another exact match search to the application. For this, I have a Lucene
>>>> index built from fields in database tables, which already had the name
>>>> fields indexed with the metaphone analyzer. I added 2 more fields to the
>>>> existing document, which index first name/last name as UN_TOKENIZED. While
>>>> searching for an exact match, I create a term query to look into the newly
>>>> created UN_TOKENIZED fields as shown in the code snippets......however this
>>>> is not getting any hits. I would like to know, is there anything wrong
>>>> conceptually?
>>>>
>>>> // creating fields for the document
>>>> FIRST_NAME(Field.Store.NO, Field.Index.TOKENIZED),
>>>> FIRST_NAME_EXACT(Field.Store.NO, Field.Index.UN_TOKENIZED),
>>>> LAST_NAME(Field.Store.NO, Field.Index.TOKENIZED),
>>>> LAST_NAME_EXACT(Field.Store.NO, Field.Index.UN_TOKENIZED),
>>>>
>>>> // name sounds like analyzer class....used while indexing and searching
>>>> public class NameSoundsLikeAnalyzer extends Analyzer {
>>>>     PerFieldAnalyzerWrapper wrapper;
>>>>
>>>>     public NameSoundsLikeAnalyzer() {
>>>>         // StopAnalyzer is the default for every field not listed below
>>>>         wrapper = new PerFieldAnalyzerWrapper(new StopAnalyzer());
>>>>         wrapper.addAnalyzer(
>>>>                 PhysicianDocumentBuilder.PhysicianFieldInfo.FIRST_NAME.toString(),
>>>>                 new MetaphoneReplacementAnalyzer());
>>>>         wrapper.addAnalyzer(
>>>>                 PhysicianDocumentBuilder.PhysicianFieldInfo.LAST_NAME.toString(),
>>>>                 new MetaphoneReplacementAnalyzer());
>>>>     }
>>>>
>>>>     /**
>>>>      * @see PerFieldAnalyzerWrapper#tokenStream(String, Reader)
>>>>      */
>>>>     @Override
>>>>     public TokenStream tokenStream(String fieldName, Reader reader) {
>>>>         return wrapper.tokenStream(fieldName, reader);
>>>>     }
>>>> }
>>>>
>>>> // lastly the query builder
>>>> if (physicianQuery.getExactNameSearch()) {
>>>>     if (StringUtils.isNotEmpty(physicianQuery.getFirstNameStartsWith())) {
>>>>         TermQuery term = new TermQuery(new Term(FIRST_NAME_EXACT.toString(),
>>>>                 physicianQuery.getFirstNameStartsWith()));
>>>>         query.add(term, MUST);
>>>>     }
>>>>     if (StringUtils.isNotEmpty(physicianQuery.getLastNameStartsWith())) {
>>>>         TermQuery term = new TermQuery(new Term(LAST_NAME_EXACT.toString(),
>>>>                 physicianQuery.getLastNameStartsWith()));
>>>>         query.add(term, MUST);
>>>>     }
>>>> } else {
>>>>     // we want metaphone search
>>>>     if (StringUtils.isNotEmpty(physicianQuery.getFirstNameStartsWith())) {
>>>>         query.add(buildMultiTermPrefixQuery(FIRST_NAME.toString(),
>>>>                 physicianQuery.getFirstNameStartsWith()), MUST);
>>>>     }
>>>>     if (StringUtils.isNotEmpty(physicianQuery.getLastNameStartsWith())) {
>>>>         query.add(buildMultiTermPrefixQuery(LAST_NAME.toString(),
>>>>                 physicianQuery.getLastNameStartsWith()), MUST);
>>>>     }
>>>> }
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/No-hits-while-searching%21-tp23735920p23735920.html
>>>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> -- 
> Matthew Hall
> Software Engineer
> Mouse Genome Informatics
> mhall@informatics.jax.org
> (207) 288-6012
> 
> 

-- 
View this message in context: http://www.nabble.com/No-hits-while-searching%21-tp23735920p23818803.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

