Alex,
If I understand you correctly, all you have to do is make sure the query is run either as
a phrase query (with quotes around it) or as a term query in which both terms are required
(with a plus sign directly in front of each term, no space between the sign and the term).
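
To make the two forms concrete, here is a minimal sketch in plain Java that only builds the query strings you would then hand to parser.parse(...). The class and method names (QueryStrings, asPhrase, allTermsRequired) are made up for illustration and are not part of Lucene, and the sketch assumes the input contains no query-syntax characters that would need escaping.

```java
// Hypothetical helper, not a Lucene class: it only prepares the text
// that is later passed to the query parser.
final class QueryStrings {

    // Wraps the whole input in double quotes so the parser reads it as one phrase.
    // Assumption: the input contains no quotes or other special characters.
    static String asPhrase(String input) {
        return "\"" + input.trim() + "\"";
    }

    // Prefixes every whitespace-separated term with '+', which marks it as required.
    // There is no space between the plus sign and the term.
    static String allTermsRequired(String input) {
        StringBuilder sb = new StringBuilder();
        for (String term : input.trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue; // happens only for empty input
            }
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append('+').append(term);
        }
        return sb.toString();
    }
}
```

For Mexican Restaurant the first method produces the quoted phrase and the second produces +Mexican +Restaurant. Either string should stop "Chinese Restaurant" and "Greek Restaurant" from matching, and "Restaurant" on its own would no longer match either, since it lacks the required term.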
As for detecting a score gap and the like, you could do that with a custom Collector.
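
The score-gap idea itself can be sketched without any Lucene types. The code below is only an illustration under stated assumptions: Hit, ScoreGapFilter, cutAtGap and the minRatio parameter are hypothetical names, the hits are assumed to be already sorted by descending score, and "gap" is taken to mean a score that falls below a chosen fraction of the previous hit's score.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of score-gap filtering; none of these names are Lucene API.
final class ScoreGapFilter {

    // Minimal stand-in for a search hit: a label plus its relevance score.
    static final class Hit {
        final String name;
        final float score;

        Hit(String name, float score) {
            this.name = name;
            this.score = score;
        }
    }

    // Keeps hits from the top of the list until a score drops below
    // minRatio times the score of the previously kept hit, then stops.
    // Assumption: sortedHits is ordered by descending score.
    static List<Hit> cutAtGap(List<Hit> sortedHits, float minRatio) {
        List<Hit> kept = new ArrayList<Hit>();
        for (Hit hit : sortedHits) {
            if (!kept.isEmpty()) {
                float previous = kept.get(kept.size() - 1).score;
                if (hit.score < previous * minRatio) {
                    break; // gap found: drop this hit and everything after it
                }
            }
            kept.add(hit);
        }
        return kept;
    }
}
```

Because the comparison needs hits in score order, it is probably easiest to apply it to the ranked top hits; a collector sees documents as they are matched rather than ranked, so a Collector-based version would have to buffer and sort them first. The right ratio depends on your data and would need some experimenting.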
Otis --
----- Original Message ----
> From: Alex <azlist1@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Monday, September 21, 2009 6:17:53 PM
> Subject: Filtering query results based on relevance/acuracy
>
> Hi,
>
> I'm a total newbie with Lucene and trying to understand how to achieve my
> (complicated) goals. So what I'm doing is still totally experimental for me
> but is probably extremely trivial for the experts on this list :)
>
> I use Lucene and Hibernate Search to index locations by their name, type,
> etc ...
> The LocationType is an object that has its "name" field indexed both
> tokenized and untokenized.
>
> The following LocationType names are indexed
> "Restaurant"
> "Mexican Restaurant"
> "Chinese Restaurant"
> "Greek Restaurant"
> etc...
>
> Considering the following query :
>
> "Mexican Restaurant"
>
> I systematically get all the entries as a result, most certainly because the
> "Restaurant" keyword is present in all of them.
> I'm trying to have a finer grained result set.
> Obviously for "Mexican Restaurant" I want the "Mexican Restaurant" entry as
> a result but NOT "Chinese Restaurant" nor "Greek Restaurant" as they are
> irrelevant. But maybe "Restaurant" itself should be returned with a lower
> weight/score or maybe it shouldn't ... I'm not sure about this one.
>
> 1)
> How can I do that ?
>
> Here is the code I use for querying :
>
>
> String[] typeFields = {"name", "tokenized_name"};
> Map<String, Float> boostPerField = new HashMap<String, Float>(2);
> boostPerField.put( "name", (float) 4);
> boostPerField.put( "tokenized_name", (float) 2);
>
>
> QueryParser parser = new MultiFieldQueryParser(
> typeFields ,
> new StandardAnalyzer(),
> boostPerField
> );
>
> org.apache.lucene.search.Query luceneQuery;
>
> try {
> luceneQuery = parser.parse(queryString);
> }
> catch (ParseException e) {
> throw new RuntimeException("Unable to parse query: " +
> queryString, e);
> }
>
>
>
>
>
> I guess that there is a way to filter out results that have a score below a
> given threshold or a way to filter out results based on score gap or
> anything similar. But I have no idea on how to do this...
>
>
> What is the best way to achieve what I want?
>
> Thank you for your help !
>
> Cheers,
>
> Alex