lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Matthew Hall <mh...@informatics.jax.org>
Subject Re: No hits while searching!
Date Mon, 01 Jun 2009 15:58:54 GMT
Yeah, he's gotta be.

You might be better of using something like a lowercase analyzer here, 
since punctuation in a name is likely important.

Matt

Sudarsan, Sithu D. wrote:
>  
>
> Do you use stopword filtering?
>
> Sincerely,
> Sithu D Sudarsan
>
> -----Original Message-----
> From: vanshi [mailto:nilu.thakur@gmail.com] 
> Sent: Monday, June 01, 2009 11:39 AM
> To: java-user@lucene.apache.org
> Subject: Re: No hits while searching!
>
>
> Thanks Erick, I was able to get this work...as you said ..Luke is a
> great
> tool to look in to what gets stored as indexes though in my case I was
> searching before the indexes were created so i was getting zero hits.
>
> On side note, I'm running a strange output with prefix query...it only
> works
> when i have 3 or more than 3 letters in the first name/last name. Any
> idea
> what is going on here? Please see the output from log here.
>
> 02:05:20,996 INFO  [PhysicianQueryBuilder] Entered addTypeSpecificTerms
> in
> PhysicianQuerybuilder with exactName=true
> 02:05:20,996 INFO  [PhysicianQueryBuilder] Before running Prefix query,
> First name: ang
> 02:05:20,996 INFO  [PhysicianQueryBuilder] Before running  Prefix query,
> Last name: john
> 02:05:21,012 INFO  [LuceneIndexService] the query is:
> +(FIRST_NAME_EXACT:ang*) +(LAST_NAME_EXACT:john*)
> 02:05:21,012 INFO  [LuceneIndexService] Result Size: 1
>
> 02:06:03,578 INFO  [PhysicianQueryBuilder] Entered addTypeSpecificTerms
> in
> PhysicianQuerybuilder with exactName=true
> 02:06:03,578 INFO  [PhysicianQueryBuilder] Before running term query,
> First
> name: a
> 02:06:03,578 INFO  [PhysicianQueryBuilder] Before running term query,
> Last
> name: johns
> 02:06:03,578 INFO  [LuceneIndexService] the query is: +()
> +(LAST_NAME_EXACT:johns*)
> 02:06:03,578 INFO  [LuceneIndexService] Result Size: 0
>
> 02:08:01,548 INFO  [PhysicianQueryBuilder] Entered addTypeSpecificTerms
> in
> PhysicianQuerybuilder with exactName=true
> 02:08:01,548 INFO  [PhysicianQueryBuilder] Before running term query,
> First
> name: an
> 02:08:01,548 INFO  [PhysicianQueryBuilder] Before running term query,
> Last
> name: johns
> 02:08:01,548 INFO  [LuceneIndexService] the query is: +()
> +(LAST_NAME_EXACT:johns*)
> 02:08:01,580 INFO  [LuceneIndexService] Result Size: 0
>
> As one can see the query works with first name=ang but not with first
> name=a
> or an.
>
> Appreciate all your inputs.
>
> Vanshi
>
> Erick Erickson wrote:
>   
>> The most common issue with this kind of thing is that
>>     
> UN_TOKENIZEDimplies
>   
>> no
>> case folding. So if your case differs you won't get a match.
>>
>> That aside, the very first thing I'd do is get a copy of Luke (google
>> Lucene
>> Luke)
>> and examine the index to see if what's in your index is what you
>>     
> *think*
>   
>> is
>> in there.
>>
>>
>> The second thing I'd do is look at query.toString() to see what the
>>     
> actual
>   
>>> query is. You can even paste the output of toString() into Luke and
>>>       
> see
>   
>>> what happens.
>>>       
>> I'm not sure what buildMultiTermPrefixQuery is all about, but I assume
>> you have a good reason for using that. But the other strategy I use
>>     
> for
>   
>> this kind of "what happened?" question is to peel back to simpler
>>     
> cases
>   
>> until I get what I expect, then build back up until it breaks.....
>>
>> But really get a copy of Luke, it's a wonderful tool that'll give you
>>     
> lots
>   
>> of
>> insight about what's *really* going on...
>>
>> Best
>> Erick
>>
>> On Wed, May 27, 2009 at 12:43 AM, vanshi <nilu.thakur@gmail.com>
>>     
> wrote:
>   
>>> In my web application, I need search functionality on first name and
>>>       
> last
>   
>>> name in 2 different ways, one search must be based on 'Metaphone
>>> Analyzer'
>>> giving all similar sounding names as result and another search should
>>>       
> be
>   
>>> exact match on either first name or last name. The name sounds like
>>> search
>>> has already been coded previously and I need to add another exact
>>>       
> match
>   
>>> search to the application. For this, I have a Lucene Index based out
>>>       
> on
>   
>>> fields from database tables which already had the names field indexed
>>> with
>>> metaphone analyzer. I added 2 more fields in the existing document,
>>>       
> which
>   
>>> indexes first name/last name as UN_TOKENIZED. While searching for
>>>       
> exact
>   
>>> match, I create a term query to look in to newly created UN_TOKENIZED
>>> fields
>>> as shown in the code snippets......however this is not getting any
>>>       
> hits.
>   
>>> I
>>> would like to know is there anything wrong conceptually?
>>>
>>> //creating fields for the document
>>> FIRST_NAME(Field.Store.NO, Field.Index.TOKENIZED),
>>>                FIRST_NAME_EXACT(Field.Store.NO,
>>> Field.Index.UN_TOKENIZED),
>>>                LAST_NAME(Field.Store.NO, Field.Index.TOKENIZED),
>>>                LAST_NAME_EXACT(Field.Store.NO,
>>>       
> Field.Index.UN_TOKENIZED),
>   
>>> //name sounds like analyzer class....used while Indexing and
>>>       
> searching
>   
>>> public class NameSoundsLikeAnalyzer extends Analyzer {
>>>        PerFieldAnalyzerWrapper wrapper;
>>>
>>>        /**
>>>         *
>>>         */
>>>        public NameSoundsLikeAnalyzer() {
>>>                wrapper = new PerFieldAnalyzerWrapper(new
>>>       
> StopAnalyzer());
>   
>>>                wrapper.addAnalyzer(
>>>
>>>  PhysicianDocumentBuilder.PhysicianFieldInfo.FIRST_NAME
>>>                                                .toString(), new
>>> MetaphoneReplacementAnalyzer());
>>>
>>>                wrapper.addAnalyzer(
>>>
>>>  PhysicianDocumentBuilder.PhysicianFieldInfo.LAST_NAME
>>>                                                .toString(), new
>>> MetaphoneReplacementAnalyzer());
>>>
>>>        }
>>>
>>>        /**
>>>         * @see PerFieldAnalyzerWrapper#tokenStream(String, Reader)
>>>         */
>>>        @Override
>>>        public TokenStream tokenStream(String fieldName, Reader
>>>       
> reader) {
>   
>>>                return wrapper.tokenStream(fieldName, reader);
>>>        }
>>>
>>> }
>>>
>>> //lastly the query builder
>>> if(physicianQuery.getExactNameSearch()){
>>>
>>>  if(StringUtils.isNotEmpty(physicianQuery.getFirstNameStartsWith())){
>>>                                TermQuery term = new TermQuery(new
>>> Term(FIRST_NAME_EXACT.toString(),
>>> physicianQuery.getFirstNameStartsWith()));
>>>                                query.add(term,MUST);
>>>
>>>                        }
>>>
>>>  if(StringUtils.isNotEmpty(physicianQuery.getLastNameStartsWith())){
>>>                                TermQuery term = new TermQuery(new
>>> Term(LAST_NAME_EXACT.toString(),
>>> physicianQuery.getLastNameStartsWith()));
>>>                                query.add(term,MUST);
>>>
>>>                        }
>>> else{
>>> //we want metaphone search
>>> if (StringUtils.isNotEmpty(physicianQuery.getFirstNameStartsWith()))
>>>       
> {
>   
>>>  query.add(buildMultiTermPrefixQuery(FIRST_NAME.toString(),
>>>
>>>  physicianQuery.getFirstNameStartsWith()), MUST);
>>>                        }
>>>
>>>                        if
>>> (StringUtils.isNotEmpty(physicianQuery.getLastNameStartsWith())) {
>>>
>>>  query.add(buildMultiTermPrefixQuery(LAST_NAME.toString(),
>>>
>>>  physicianQuery.getLastNameStartsWith()), MUST);
>>>                        }
>>> }
>>>
>>>
>>> --
>>> View this message in context:
>>>
>>>       
> http://www.nabble.com/No-hits-while-searching%21-tp23735920p23735920.htm
> l
>   
>>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>>       
>>     
>
>   


-- 
Matthew Hall
Software Engineer
Mouse Genome Informatics
mhall@informatics.jax.org
(207) 288-6012



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message