lucene-java-user mailing list archives

From Alex <azli...@gmail.com>
Subject Re: Filtering query results based on relevance/acuracy
Date Sat, 26 Sep 2009 21:22:04 GMT
Hi Otis, and thank you for helping me out.

Sorry for the late reply.



Although a PhraseQuery or TermQuery would be perfectly suited in some
cases, neither will work in my case.

Basically my application's search feature is a single field "à la Google"
and the user can be looking for a lot of different things...

For example the user can search for
"Chinese Restaurant in New York USA"
or maybe just
"Chinese Restaurant" (which should be understood as "nearby Chinese
Restaurant")
or maybe
"Chinese Restaurant at 12 Main St. New York"
or
"1223 Main Street New York"



So basically I will get many different query structures depending on the
user's intent, and I think I need to figure out a good analysis algorithm
to find Locations as accurately as possible.

As a first step in my algorithm I am trying to identify a potential
LocationType in the query string.
My idea was to split the query string into words and use each word to query
my index for the LocationTypes that best match what's included in the query.
I could then pick the best-matching LocationTypes based on how they scored
against the Lucene query, and move on to the next step of my algorithm, which
would try to find another potential feature of the query, such as the
presence of a country name or city name, etc.
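
To make that first step concrete, here is a minimal sketch in plain Java. It is not Lucene code: the word-overlap score is only a stand-in for whatever score the index would return, and the class name, the method names and the list of LocationTypes are all assumptions made up for illustration.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: ranks known LocationType names against a free-text query.
class LocationTypeMatcher {

    // Lowercase the text and split it on anything that is not a letter or digit.
    static Set<String> tokenize(String text) {
        Set<String> tokens = new LinkedHashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Score = fraction of the LocationType's words that occur in the query.
    static double score(String locationType, Set<String> queryTokens) {
        Set<String> typeTokens = tokenize(locationType);
        if (typeTokens.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        for (String t : typeTokens) {
            if (queryTokens.contains(t)) {
                hits++;
            }
        }
        return (double) hits / typeTokens.size();
    }

    // Returns the LocationTypes with a non-zero score, best first.
    // Ties are broken in favour of the type with more words (the more specific one).
    static List<Map.Entry<String, Double>> rank(String query, List<String> locationTypes) {
        Set<String> queryTokens = tokenize(query);
        List<Map.Entry<String, Double>> ranked = new ArrayList<>();
        for (String type : locationTypes) {
            double s = score(type, queryTokens);
            if (s > 0) {
                ranked.add(new AbstractMap.SimpleEntry<>(type, s));
            }
        }
        ranked.sort((a, b) -> {
            int byScore = Double.compare(b.getValue(), a.getValue());
            if (byScore != 0) {
                return byScore;
            }
            return Integer.compare(tokenize(b.getKey()).size(), tokenize(a.getKey()).size());
        });
        return ranked;
    }
}
```

With the types "Hotel", "Restaurant", "Chinese Restaurant" and "Gas Station", calling `LocationTypeMatcher.rank("Chinese Restaurant in New York USA", types)` keeps only the two restaurant types and puts "Chinese Restaurant" first.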

That's why a phrase query would not be appropriate here: it would match
against the entire query string and would, most of the time, return no
relevant LocationTypes.

Once I have analysed the query string and isolated the various features
(LocationType, city name, country name, address, ...) I could maybe create
a BooleanQuery that combines everything that was identified earlier.
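
As a rough illustration of that last step, the sketch below assembles the isolated features into a query string in Lucene's classic query-parser syntax, with every feature marked as required (`+`). The field names (`locationType`, `city`) and the helper class are assumptions for illustration only; the same thing could be built programmatically with BooleanQuery and BooleanClause.Occur.MUST.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: turns extracted features into a required-clause query string.
class FeatureQueryBuilder {

    // features maps a field name to the value extracted from the user's query.
    // Empty or missing values are skipped, so only detected features constrain the search.
    static String build(Map<String, String> features) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : features.entrySet()) {
            String value = e.getValue();
            if (value == null || value.trim().isEmpty()) {
                continue;
            }
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Escape backslashes and double quotes so the value stays inside the phrase.
            String escaped = value.trim().replace("\\", "\\\\").replace("\"", "\\\"");
            sb.append('+').append(e.getKey()).append(":\"").append(escaped).append('"');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> features = new LinkedHashMap<>();
        features.put("locationType", "Chinese Restaurant");
        features.put("city", "New York");
        features.put("country", ""); // not detected in this query, so it is skipped
        System.out.println(build(features));
    }
}
```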


So basically I'm not sure what feature of Lucene I should use here in the
first step of the algo to only find the most relevant LocationTypes and
filter out the ones that are not relevant enough.
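
One option I can think of is a cutoff relative to the best hit: keep only the results whose score is at least some fraction of the top score. Below is a minimal plain-Java sketch of that idea. The class name and the ratio are assumptions; with Lucene the scores would come from the `score` field of each ScoreDoc in `TopDocs.scoreDocs`, and since raw scores are not comparable across different queries, the cutoff is relative rather than absolute.

```java
// Hypothetical helper: decides how many of the top hits are "relevant enough".
class RelativeScoreFilter {

    // sortedScores must be in descending order, as search results normally are.
    // Returns the number of leading hits whose score is >= minRatio * top score.
    static int countRelevant(float[] sortedScores, float minRatio) {
        if (sortedScores.length == 0 || sortedScores[0] <= 0f) {
            return 0;
        }
        float cutoff = sortedScores[0] * minRatio;
        int n = 0;
        while (n < sortedScores.length && sortedScores[n] >= cutoff) {
            n++;
        }
        return n;
    }
}
```

For example, with scores 2.0, 1.5, 0.4 and 0.1 and a ratio of 0.5, the cutoff is 1.0 and only the first two hits are kept. The right ratio would have to be tuned against real queries.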


Any help or thoughts on my approach would be greatly appreciated.


Thanks in advance.

Cheers,

Alex.
