lucene-java-user mailing list archives

From "Jayakumar.V" <>
Subject RE: Issue with sounds-like queries
Date Wed, 28 Sep 2005 14:03:06 GMT

That's the only alternative I can see now. Thank you for the input.


-----Original Message-----
From: Peter Gelderbloem [] 
Sent: Wednesday, September 28, 2005 17:40
Subject: RE: Issue with sounds-like queries

You should present all the alternatives to the user, along with the
details of each hit (country, state, full name, etc.), and let them
decide which one they intended.


-----Original Message-----
From: Jayakumar.V [] 
Sent: 28 September 2005 12:03
To: Peter Gelderbloem;
Subject: RE: Issue with sounds-like queries


The reason I'm using sounds-like queries is that this search feature will be
used by our lobby staff, who will be of different nationalities. No two
users may spell a place name the same way, and they may also misspell the
names. To bring out the closest match based on what they've input, I need to
use sounds-like queries.


-----Original Message-----
From: Peter Gelderbloem [] 
Sent: Wednesday, September 28, 2005 14:14
Subject: RE: Issue with sounds-like queries

Maybe you should not be using sounds-like queries in the first place?
They are supposed to be fuzzy, AFAIK.

-----Original Message-----
From: Jayakumar.V [] 
Sent: 27 September 2005 14:54
Subject: Issue with sounds-like queries



I'm facing an issue with sounds-like queries. I've experimented with
Apache Codec & the Phonetix library from Tangentum Technologies to see if I
could sort out the issue somehow using either of the libraries.


I've an index containing details of various Banks in the world & their
associated Branches. 

Each document has a field holding the Branch Name(s) for the individual
Bank.


While searching for the branch name QUILON, it also brings
back details where the branch name contains the word COLONY, since with
Metaphone or DoubleMetaphone both QUILON & COLONY get encoded to the same
value: KLN.

This returns incorrect results.


Another example would be CALICUT (located in South India) & CALCUTTA
(located in East India), both of which get encoded to KLKT.
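
To make the collision concrete, here is a minimal sketch of a lookup keyed on the phonetic code. It does not call Metaphone; the codes are copied from the message above, and the class and method names (PhoneticCollisionSketch, indexByCode, search) are hypothetical, chosen only for illustration. It assumes the index stores one encoded value per branch name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PhoneticCollisionSketch {

    // Name -> code pairs as reported in the message; they are not computed here.
    static Map<String, String> reportedCodes() {
        Map<String, String> codes = new LinkedHashMap<>();
        codes.put("QUILON", "KLN");
        codes.put("COLONY", "KLN");
        codes.put("CALICUT", "KLKT");
        codes.put("CALCUTTA", "KLKT");
        return codes;
    }

    // Inverts name -> code into code -> names, roughly what an index
    // on the encoded field does.
    static Map<String, List<String>> indexByCode(Map<String, String> codes) {
        Map<String, List<String>> index = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : codes.entrySet()) {
            index.computeIfAbsent(entry.getValue(), k -> new ArrayList<>())
                 .add(entry.getKey());
        }
        return index;
    }

    // Returns every indexed name that shares the query's phonetic code.
    static List<String> search(String name) {
        Map<String, String> codes = reportedCodes();
        String code = codes.get(name);
        if (code == null) {
            return Collections.emptyList();
        }
        return indexByCode(codes).get(code);
    }

    public static void main(String[] args) {
        System.out.println(search("QUILON"));  // prints [QUILON, COLONY]
        System.out.println(search("CALICUT")); // prints [CALICUT, CALCUTTA]
    }
}
```

Once two names share a code, the encoded field alone cannot tell them apart, so any further narrowing has to come from other fields or from the user.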


I can narrow down the result by filtering based on COUNTRY or COUNTRY +
STATE, but I might still get back results which may not be the one
the user intended.


I also tried using the RefinedSoundex class. The issue here is that "QUILON
BRANCH" gets encoded as Q50708190830, whereas "QUILON" alone gets
encoded as Q50708. The user may input only "QUILON" while making a search,
which will not return hits in the above case.
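
The two codes quoted here can be reproduced with a small standalone sketch. This is not the Commons Codec class itself: it is a re-implementation of a refined-Soundex style encoding, and the letter table is assumed to match the US-English mapping that RefinedSoundex uses. The names RefinedSoundexSketch, encode and encodeTokens are hypothetical. The second method encodes each word separately, which is one possible way (an assumption, not something tested against a Lucene index) to let a one-word query match a multi-word branch name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class RefinedSoundexSketch {

    // Digit for each letter A..Z; assumed to mirror the US-English
    // mapping of Commons Codec's RefinedSoundex.
    private static final String MAPPING = "01360240043788015936020505";

    // Encodes the whole input as one string: non-letters are dropped,
    // the first letter is kept, then every letter (including the first)
    // contributes its digit unless it repeats the previous digit.
    static String encode(String text) {
        StringBuilder letters = new StringBuilder();
        for (char c : text.toUpperCase(Locale.ENGLISH).toCharArray()) {
            if (c >= 'A' && c <= 'Z') {
                letters.append(c);
            }
        }
        if (letters.length() == 0) {
            return "";
        }
        StringBuilder code = new StringBuilder().append(letters.charAt(0));
        char last = '*';
        for (int i = 0; i < letters.length(); i++) {
            char current = MAPPING.charAt(letters.charAt(i) - 'A');
            if (current != last) {
                code.append(current);
            }
            last = current;
        }
        return code.toString();
    }

    // Encodes each whitespace-separated word on its own.
    static List<String> encodeTokens(String text) {
        List<String> codes = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            String code = encode(token);
            if (!code.isEmpty()) {
                codes.add(code);
            }
        }
        return codes;
    }

    public static void main(String[] args) {
        System.out.println(encode("QUILON BRANCH"));       // prints Q50708190830
        System.out.println(encode("QUILON"));              // prints Q50708
        System.out.println(encodeTokens("QUILON BRANCH")); // prints [Q50708, B190830]
    }
}
```

Because the space is dropped before encoding, the whole field collapses into a single long code. With per-word encoding, the code for "QUILON" appears as a term of its own and an exact match on it becomes possible.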


Hope I was clear in communicating the issue. 


Any thoughts / inputs will be really helpful.



Thanks & Regards



UAE Xchange Center

PB.No. : 170, Abudhabi, UAE

Phone: + 971-2-6105656, 6105658

Fax: +971-2-6323775




