lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jayakumar.V" <jayakuma...@uaeexchange.com>
Subject RE: Issue with sounds-like queries
Date Wed, 28 Sep 2005 14:03:06 GMT

Thatz the only alternative I can see now. Thank you for the input.


Jayakumar.V 

-----Original Message-----
From: Peter Gelderbloem [mailto:Peter.Gelderbloem@mediasurface.com] 
Sent: Wednesday, September 28, 2005 17:40
To: jayakumar.v@uaeexchange.com
Subject: RE: Issue with sounds-like queries


You should present all the alternatives to the user as well as the
contexts 
of each hit in terms of country, state and full name etc. and let them
decide which one they intended. 

Peter 

-----Original Message-----
From: Jayakumar.V [mailto:jayakumar.v@uaeexchange.com] 
Sent: 28 September 2005 12:03
To: Peter Gelderbloem; java-user@lucene.apache.org
Subject: RE: Issue with sounds-like queries

Hi,

The reason I'm using sounds-like queries is that this search feature
will be
used by our lobby staff(s), who'll be of different nationalities. No two
users may spell the place name the same way. They may also misspell the
names. To bring out the closest match based on what they've input, I
need to
use sounds-like queries. 

--
Jayakumar.V 


-----Original Message-----
From: Peter Gelderbloem [mailto:Peter.Gelderbloem@mediasurface.com] 
Sent: Wednesday, September 28, 2005 14:14
To: java-user@lucene.apache.org; jayakumar.v@uaeexchange.com
Subject: RE: Issue with sounds-like queries

May be you should not be using sounds like queries in the first place?
They are supposed to be fuzzy afaik.

-----Original Message-----
From: Jayakumar.V [mailto:jayakumar.v@uaeexchange.com] 
Sent: 27 September 2005 14:54
To: java-user@lucene.apache.org
Subject: Issue with sounds-like queries

Hi,

 

I'm facing an issue with sounds-like queries. I've experimented with
both
Apache Codec & the Phonetix library from Tangentum Technologies
(http://www.tangentum.biz/en/products/phonetix/faqs/index.html ) to see
if I
could sort out the issue somehow using either of the libraries. 

 

I've an index containing details of various Banks in the world & their
associated Branches. 

Each document has a field holding the Branch Name(s) for the individual
Bank(s). 

 

While searching for the following branch name :-  QUILON, it also
returns
back details where the branch name may contain the word COLONY, since
using
Metaphone or DoubleMetaphone, both QUILON & COLONY get encoded to the
same
value :-  KLN. 

This returns in-correct results. 

 

Another example would be CALICUT (located in South India) & CALCUTTA
(located in North India), both get encoded to KLKT.

 

I can narrow down the result by filtering based on COUNTRY or COUNTRY +
STATE but still I might get back results which may not be the one
intended.

 

I also tried using the RefinedSoundex class. The issue here is that,
"QUILON
BRANCH" will get encoded as - Q50708190830, whereas "QUILON" alone will
get
encoded as - Q50708. The user may input only "QUILON" while making a
search
which will not return back hits in the above case.

 

Hope I was clear in communicating the issue. 

 

Any thoughts / inputs will be really helpful.

 

 

Thanks & Regards

 

Jayakumar.V

UAE Xchange Center

PB.No. : 170, Abudhabi, UAE

Phone: + 971-2-6105656, 6105658

Fax: +971-2-6323775

  _____  

Confidentiality Notice :  This e-mail message, including any
attachments, is
for the sole use of the intended recipient(s) and may contain
confidential
and privileged information. Any unauthorized review, use, disclosure or
distribution is prohibited. If you are not the intended recipient,
please
contact the sender by reply e-mail and destroy all copies of the
original
message.

  _____  

 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message