lucene-java-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Filtering query results based on relevance/acuracy
Date Tue, 22 Sep 2009 12:16:48 GMT
Alex,

If I understand you correctly, all you have to do is make sure the query is run either as
a phrase query (with quotes around it) or as a query in which both terms are required
(with a plus sign directly in front of each term, no space in between). By default the
query parser ORs the terms, so every entry that contains "Restaurant" matches.
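
To make the two query forms concrete, here is a minimal sketch that only builds the query strings to hand to the parser. The class and method names (QuerySyntax, asPhrase, asRequiredTerms) are made up for illustration, and the sketch assumes the input consists of plain words; input containing query-syntax characters would need escaping first.

```java
public class QuerySyntax {

    /** Wraps the input in double quotes so the parser treats it as one phrase. */
    public static String asPhrase(String input) {
        return "\"" + input.trim() + "\"";
    }

    /** Prefixes every whitespace-separated term with '+', marking it as required. */
    public static String asRequiredTerms(String input) {
        StringBuilder sb = new StringBuilder();
        for (String term : input.trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue; // happens only when the input is blank
            }
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // No space between the plus sign and the term.
            sb.append('+').append(term);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(asPhrase("Mexican Restaurant"));        // "Mexican Restaurant"
        System.out.println(asRequiredTerms("Mexican Restaurant")); // +Mexican +Restaurant
    }
}
```

Either string can then be passed to parser.parse(...) in place of the raw user input.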


As for detecting a score gap and the like, you could do that with a custom Collector.
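
A rough sketch of the score-gap idea, kept free of Lucene classes so it runs on its own. It assumes the hit scores have already been gathered in descending order (for example, from TopDocs.scoreDocs), and the names ScoreGapCutoff, cutoffIndex and minRatio are hypothetical. It is not a Collector itself, only the cutoff logic that a Collector or a post-processing step could apply.

```java
public class ScoreGapCutoff {

    /**
     * Returns how many leading hits to keep.
     * Scores must be sorted in descending order. The list is cut in front of
     * the first hit whose score is less than minRatio times the previous score.
     */
    public static int cutoffIndex(float[] scoresDescending, float minRatio) {
        for (int i = 1; i < scoresDescending.length; i++) {
            if (scoresDescending[i] < scoresDescending[i - 1] * minRatio) {
                return i; // large relative drop: keep hits 0 .. i-1
            }
        }
        return scoresDescending.length; // no gap found: keep everything
    }

    public static void main(String[] args) {
        // Made-up scores: one strong match followed by much weaker ones.
        float[] scores = {4.0f, 1.2f, 1.1f, 1.0f};
        int keep = cutoffIndex(scores, 0.5f);
        System.out.println("keep the top " + keep + " hit(s)");
    }
}
```

A relative ratio is used here instead of an absolute threshold because raw scores are not comparable from one query to the next. The value 0.5 is an arbitrary starting point that would need tuning.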

Otis --
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: Alex <azlist1@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Monday, September 21, 2009 6:17:53 PM
> Subject: Filtering query results based on relevance/acuracy
> 
> Hi,
> 
> I'm a total newbie with Lucene and am trying to understand how to achieve my
> (complicated) goals. So what I'm doing is still totally experimental for me
> but is probably extremely trivial for the experts on this list :)
> 
> I use Lucene and Hibernate Search to index locations by their name, type,
> etc.
> The LocationType is an object that has its "name" field indexed both
> tokenized and untokenized.
> 
> The following LocationType names are indexed
> "Restaurant"
> "Mexican Restaurant"
> "Chinese Restaurant"
> "Greek Restaurant"
> etc...
> 
> Consider the following query:
> 
> "Mexican Restaurant"
> 
> I systematically get all the entries as a result, most certainly because the
> "Restaurant" keyword is present in all of them.
> I'm trying to get a finer-grained result set.
> Obviously, for "Mexican Restaurant" I want the "Mexican Restaurant" entry as
> a result but NOT "Chinese Restaurant" or "Greek Restaurant", as they are
> irrelevant. But maybe "Restaurant" itself should be returned with a lower
> weight/score, or maybe it shouldn't ... I'm not sure about this one.
> 
> 1)
> How can I do that?
> 
> Here is the code I use for querying:
> 
> 
> String[] typeFields = {"name", "tokenized_name"};
>         // Per-field boosts: a match on "name" weighs more than one on "tokenized_name"
>         Map<String, Float> boostPerField = new HashMap<String, Float>(2);
>         boostPerField.put("name", 4f);
>         boostPerField.put("tokenized_name", 2f);
> 
> 
>         QueryParser parser = new MultiFieldQueryParser(
>                 typeFields,
>                 new StandardAnalyzer(),
>                 boostPerField
>                 );
> 
>         org.apache.lucene.search.Query luceneQuery;
> 
>         try {
>             luceneQuery = parser.parse(queryString);
>         }
>         catch (ParseException e) {
>             throw new RuntimeException("Unable to parse query: " + queryString, e);
>         }
> 
> 
> 
> 
> 
> I guess that there is a way to filter out results that have a score below a
> given threshold or a way to filter out results based on score gap or
> anything similar. But I have no idea how to do this...
> 
> 
> What is the best way to achieve what I want?
> 
> Thank you for your help!
> 
> Cheers,
> 
> Alex



