lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ian.huang" <yiwong2...@hotmail.com>
Subject Re: about analyzer for searching location
Date Mon, 19 Apr 2010 10:03:07 GMT
Does a token of "united states" exist in index if using standard analyzer. 
My understanding is, united and states are separately stored in index, but 
not as "united states". So, if I build a query like Query q = 
qp.parse("\"united states\""); It would not return any result. Am I right?

Ian

--------------------------------------------------
From: "Samarendra Pratap" <samarzone@gmail.com>
Sent: Friday, April 16, 2010 9:02 PM
To: <java-user@lucene.apache.org>
Subject: Re: about analyzer for searching location

> Hi. I don't think you need a different analyzer. Read about
> PhraseQuery<http://lucene.apache.org/java/2_9_0/api/core/org/apache/lucene/search/PhraseQuery.html>.
> If you are using parse() method of QueryParser. Enclose the searched 
> string
> in extra double quotes, which must obviously be escaped.
>
> Query q = qp.parse("\"united states\"");
>
>
> 2010/4/15 Ian.huang <yiwong2001@hotmail.com>
>
>> Hi All,
>>
>> I am implementing a search function for address by hibernate search which
>> is based on lucene. The class definition as following:
>>
>> @Indexed
>> public class Address implements Cloneable
>> {
>> @DocumentId
>> private int id;
>> @Field
>> private String addrCountry;
>> private String addrDesc;
>> @Field
>> private String addrLineOne;
>> private String addrLineTwo;
>> @Field
>> private String addrCity;
>> ......
>>
>> As you see, addrCountry, addrLineone and addrCity are fields for search. 
>> I
>> am using default analyzer in index & search. So I think country name like
>> United States would be indexed as two terms United, and states.
>>
>> In addition, during search, a search keyword like United states, or Salt
>> lake city would be tokenized as two or three single words.
>>
>> As result, any address fields contain united, city would be returned. 
>> like
>> United Kingdom, but actually I want to get a result of united states.
>>
>> My expected result as following:
>>
>> if someone searches for "united" it should return "united states" and
>> "united kingdom".
>>
>> if someone searches for "united states" it should return "united states",
>> and not "united kingdom".
>>
>> I hope the analyzer can generate term with multiple words. say, united
>> states to united states. I think standardanalyzer would analyze united
>> states to united and states?
>>
>> A different example: if search keyword is parking lot in Salt Lake City,
>> the generated terms to search need to be: parking lot and Salt Lake City,
>> not parking,lot,salt,lake and city.
>>
>> I wonder if any analyzer can help me to implement my requirement. It 
>> would
>> be better to use dictionary based solution, then I can manage some search
>> terms that could have multiple words.
>>
>> thanks
>>
>> Ian
>
>
>
>
> -- 
> Regards,
> Samar
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message