lucene-dev mailing list archives

From "Siegfried Goeschl (JIRA)" <>
Subject [jira] [Created] (SOLR-4056) Contribution of component to gather the most frequent user search request in real-time
Date Fri, 09 Nov 2012 18:06:12 GMT
Siegfried Goeschl created SOLR-4056:

             Summary: Contribution of component to gather the most frequent user search request in real-time
                 Key: SOLR-4056
             Project: Solr
          Issue Type: New Feature
          Components: SearchComponents - other
    Affects Versions: 3.6.1
            Reporter: Siegfried Goeschl
            Priority: Minor
             Fix For: 3.6.2

I'm now finishing a SOLR project for one of my customers (replacing a Microsoft FAST server
with SOLR) and have permission to contribute our improvements.

The most interesting one is a "FrequentSearchTerm" component, which analyzes the
user-supplied search queries in real time:

 * it keeps track of the most recent queries per core in a LIFO buffer (so there is an upper
bound on memory consumption)
 * for each query entry it records the number of invocations, the average number of result
documents and the average execution time
 * custom searches across the frequent search terms are supported through the MVEL expression
language, for example:
 ** find all queries which did not yield any results - 'meanHits==0'
 ** find all "iPhone" queries - 'searchTerm.contains("iphone") || searchTerm.contains("i-phone")'
 ** find all long-running "iPhone" queries - '(searchTerm.contains("iphone") || searchTerm.contains("i-phone"))
&& meanTime>50'
 * GUI: a JSP page gives access to the frequent search terms
 * there is also an XML/CSV export, which we use to display the 50 most frequently used search
queries in real time
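
As a rough illustration of the bookkeeping listed above, here is a minimal Java sketch. It is not the contributed code. The class names QueryStats and FrequentQueryTracker are hypothetical, a plain java.util.function.Predicate stands in for the MVEL expressions, the execution time is assumed to be in milliseconds, and evicting the least recently seen query is only an assumption about how the bounded buffer might behave.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical per-query statistics: invocation count plus running means. */
final class QueryStats {
    final String searchTerm;
    long count;
    double meanHits;
    double meanTime; // assumed to be milliseconds

    QueryStats(String searchTerm) {
        this.searchTerm = searchTerm;
    }

    /** Incremental mean update, so no per-invocation history has to be stored. */
    void record(long hits, long timeMs) {
        count++;
        meanHits += (hits - meanHits) / count;
        meanTime += (timeMs - meanTime) / count;
    }
}

/** Hypothetical bounded tracker (assumption: least recently seen query is evicted). */
final class FrequentQueryTracker {
    private final Map<String, QueryStats> entries;

    FrequentQueryTracker(final int capacity) {
        // accessOrder=true orders entries from least to most recently used;
        // removeEldestEntry caps the map size and therefore the memory use.
        this.entries = new LinkedHashMap<String, QueryStats>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, QueryStats> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Records one executed query with its hit count and execution time. */
    synchronized void record(String searchTerm, long hits, long timeMs) {
        QueryStats stats = entries.get(searchTerm);
        if (stats == null) {
            stats = new QueryStats(searchTerm);
            entries.put(searchTerm, stats);
        }
        stats.record(hits, timeMs);
    }

    /** Returns all tracked queries matching the filter (stand-in for an MVEL expression). */
    synchronized List<QueryStats> find(Predicate<QueryStats> filter) {
        List<QueryStats> matches = new ArrayList<>();
        for (QueryStats stats : entries.values()) {
            if (filter.test(stats)) {
                matches.add(stats);
            }
        }
        return matches;
    }
}
```

In this sketch, tracker.find(s -> s.meanHits == 0) plays the role of the 'meanHits==0' expression, and a filter such as s -> s.searchTerm.contains("iphone") && s.meanTime > 50 corresponds to the long-running "iPhone" search.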

We use this component:

 * to get input for QA regarding frequently used search terms
 * to find strange queries, e.g. queries returning no or too many results, e.g. caused by WordDelimiterFilter
 * to keep our management happy ... :-)
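
A report of the most frequently used queries, like the CSV export mentioned earlier, could be produced along the following lines. This is only a sketch under stated assumptions: QuerySnapshot and FrequentQueryReport are hypothetical names, the column names simply mirror the fields used in the MVEL examples, and the real export format of the component is not described here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical immutable snapshot of one tracked query. */
final class QuerySnapshot {
    final String searchTerm;
    final long count;
    final double meanHits;
    final double meanTime;

    QuerySnapshot(String searchTerm, long count, double meanHits, double meanTime) {
        this.searchTerm = searchTerm;
        this.count = count;
        this.meanHits = meanHits;
        this.meanTime = meanTime;
    }
}

/** Hypothetical helper that ranks snapshots and renders them as CSV. */
final class FrequentQueryReport {

    /** Returns the n most frequently invoked queries, most frequent first. */
    static List<QuerySnapshot> top(List<QuerySnapshot> all, int n) {
        List<QuerySnapshot> sorted = new ArrayList<>(all); // leave the input untouched
        sorted.sort(Comparator.comparingLong((QuerySnapshot q) -> q.count).reversed());
        return sorted.subList(0, Math.min(n, sorted.size()));
    }

    /** Renders rows as CSV; search terms are quoted and embedded quotes doubled. */
    static String toCsv(List<QuerySnapshot> rows) {
        StringBuilder sb = new StringBuilder("searchTerm,count,meanHits,meanTime\n");
        for (QuerySnapshot q : rows) {
            sb.append('"').append(q.searchTerm.replace("\"", "\"\"")).append('"')
              .append(',').append(q.count)
              .append(',').append(q.meanHits)
              .append(',').append(q.meanTime)
              .append('\n');
        }
        return sb.toString();
    }
}
```

Calling FrequentQueryReport.toCsv(FrequentQueryReport.top(snapshots, 50)) would then yield the kind of top-50 list described above.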

I'm not sure the name "Frequent Search Term Component" is a perfect fit, as it was taken
from FAST - suggestions welcome. Maybe "FrequentSearchQueryComponent" would be more suitable?

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators