lucene-solr-user mailing list archives

From Samarendra Pratap <samarz...@gmail.com>
Subject First query to find meta data, second to search. How to group into one?
Date Tue, 15 May 2012 19:26:09 GMT
Hi,
I need a suggestion for improving the relevance of search results. Any
help or pointers are appreciated.

We have the following fields (plus many more) in our schema:

title
description
category_id (multivalued)

We are using mm=70% in solrconfig.xml
We are using qf=title description
We are not using a phrase query in "q"
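
For reference, the settings above could be expressed as request parameters
roughly as follows. This is only a sketch: the defType value and the
build_search_params helper are assumptions for illustration, since in our
setup the values are configured in solrconfig.xml rather than sent by the
client.

```python
from urllib.parse import urlencode


def build_search_params(user_text, rows=10):
    """Build the query parameters described above (hypothetical helper)."""
    return {
        "defType": "dismax",        # assumption: a dismax-style parser is in use
        "q": user_text,             # plain user text, no phrase query
        "qf": "title description",  # fields searched
        "mm": "70%",                # minimum-should-match
        "rows": rows,
    }


query_string = urlencode(build_search_params("water proof"))
print(query_string)
```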

For a multi-word search text, the results toward the end of the list are
mostly junk, because the words in the search text appear in different
fields and in different contexts.
For example, searching for "water proof" (without double quotes) returns a
record where title = "rose water" and description = "... no proof of
contamination ..."

Our priority is to remove irrelevant results as far as possible.
Increasing "mm" will not solve this completely, because user input is not
always accurate enough to benefit from a high "mm".
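
To make the "mm" point concrete, here is a small sketch of how a positive
percentage is turned into a required number of terms. It assumes the
documented behaviour that the computed count is rounded down; the function
name min_should_match is made up for illustration.

```python
def min_should_match(optional_clauses, percent=70):
    """Number of terms that must match for a positive-percentage mm,
    assuming the computed value is rounded down."""
    return (optional_clauses * percent) // 100


# How many of n query terms must match at mm=70%
required = {n: min_should_match(n) for n in (2, 3, 4, 5)}
print(required)
```

With two terms only one is required at 70%. In the "water proof" example
the record matches both terms anyway (in different fields), so even
mm=100% would not exclude it.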

To remove irrelevant records we worked out the following solution (or
work-around):

   - We fire a first query to get the top "n" results. We assume that the
   first "n" results are mostly good results. "n" is dynamic, within a
   predefined minimum and maximum value.
   - We calculate the frequency of category ids in these top results. We
   are not using facets because they give counts over all results,
   relevant or irrelevant.
   - Based on the category frequencies within the top matching results, we
   pick a few of the most frequent categories by a simple calculation. At
   this point we are very confident that these categories are the ones
   that best suit our query.
   - Finally we fire a second query with the top categories, calculated
   above, in a filter query (fq).
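
The category step above can be sketched roughly as follows. This is not
our exact calculation: the max_categories and min_share cut-offs, the
function names and the sample documents are all assumptions made up for
illustration. It only shows counting category ids over the first-pass
documents and building the fq for the second query.

```python
from collections import Counter


def top_categories(docs, max_categories=3, min_share=0.3):
    """Pick the most frequent category ids among first-pass results.

    max_categories and min_share are illustrative cut-offs, not the
    actual "simple calculation" used in production.
    """
    if not docs:
        return []
    counts = Counter()
    for doc in docs:
        # category_id is multivalued; count each id once per document
        counts.update(set(doc.get("category_id", [])))
    threshold = min_share * len(docs)
    return [cat for cat, n in counts.most_common(max_categories)
            if n >= threshold]


def build_category_fq(category_ids):
    """Build a filter query restricting results to the chosen categories."""
    if not category_ids:
        return None  # no confident category: run the second query unfiltered
    return "category_id:(" + " OR ".join(str(c) for c in category_ids) + ")"


# Hypothetical top-"n" documents returned by the first query
first_pass_docs = [
    {"id": "1", "category_id": [12, 40]},
    {"id": "2", "category_id": [12]},
    {"id": "3", "category_id": [12, 7]},
    {"id": "4", "category_id": [7]},
    {"id": "5", "category_id": [99]},
]

chosen = top_categories(first_pass_docs)
fq = build_category_fq(chosen)
print(chosen, fq)
```

The second query would then be the original query plus this fq value.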


The quality of the results improved considerably, so I would like to
implement this the standard way.
Would I need to write a plugin to move the above logic into Solr?
Which component would I need to modify: QueryComponent?

Or is there a better, or even an equivalent, way of doing this or
something similar in Solr?



Thanks

-- 
Regards,
Samar
