lucene-dev mailing list archives

From "Neil O. Rouben" <kno...@gmail.com>
Subject Contributing Query Expansion Module
Date Wed, 11 May 2005 11:50:24 GMT
I would like to contribute a module that performs Query Expansion (QE) in 
Lucene. Please let me know how I may go about doing that. For more details 
about the module, please see http://lucene-qe.sourceforge.net 

I implemented the Rocchio query expansion method. Expansion terms can be 
drawn either from the local document repository or from the web, through 
Google's Web API. 
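
As a rough illustration of the Rocchio idea (a minimal sketch, not the module's actual code), the snippet below adds the centroid of a set of feedback documents to the original query vector. It assumes pseudo-relevance feedback, where the top-ranked documents are treated as relevant and the negative (non-relevant) term of the Rocchio formula is dropped. The function name, the dict-based term vectors, and the values alpha=1.0, beta=0.75 and max_new_terms=5 are all assumptions made for this example.

```python
from collections import Counter


def rocchio_expand(query_weights, feedback_docs, alpha=1.0, beta=0.75,
                   max_new_terms=5):
    """Return an expanded query as a {term: weight} dict.

    query_weights -- {term: weight} for the original query
    feedback_docs -- list of {term: weight} dicts, one per feedback document
    alpha, beta   -- hypothetical weights for the query and the centroid
    """
    # Centroid (mean term vector) of the feedback documents.
    centroid = Counter()
    for doc in feedback_docs:
        for term, weight in doc.items():
            centroid[term] += weight / len(feedback_docs)

    # Original terms keep their weight and gain the centroid contribution.
    expanded = {
        term: alpha * weight + beta * centroid.get(term, 0.0)
        for term, weight in query_weights.items()
    }

    # Add only the strongest terms that were not already in the query.
    candidates = sorted(
        (term for term in centroid if term not in query_weights),
        key=lambda term: (-centroid[term], term),
    )
    for term in candidates[:max_new_terms]:
        expanded[term] = beta * centroid[term]
    return expanded


if __name__ == "__main__":
    query = {"car": 1.0}
    docs = [
        {"car": 0.5, "auto": 0.4},
        {"car": 0.3, "auto": 0.2, "automobile": 0.2},
    ]
    print(rocchio_expand(query, docs))
```

The feedback documents could equally come from a local index or from web search results; only the source of the term vectors changes.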

Query Expansion - Adding search terms to a user's search. Query expansion is 
the process of a search engine adding search terms to a user's weighted 
search. The intent is to improve precision and/or recall. The additional 
terms may be taken from a thesaurus. For example, a search for "car" may be 
expanded to: car cars auto autos automobile automobiles 
[foldoc.org <http://foldoc.org>].
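
The thesaurus case in the definition above can be sketched in a few lines. The thesaurus here is a hand-written dict containing only the "car" entry from the example, and the function name is hypothetical.

```python
# Toy thesaurus holding only the entry from the example above.
THESAURUS = {
    "car": ["cars", "auto", "autos", "automobile", "automobiles"],
}


def expand_query(query, thesaurus):
    """Append thesaurus entries after each query term, skipping duplicates."""
    expanded = []
    for term in query.lower().split():
        for candidate in [term] + thesaurus.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return " ".join(expanded)


if __name__ == "__main__":
    print(expand_query("car", THESAURUS))
    # prints: car cars auto autos automobile automobiles
```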

*Performance*

Experiments were conducted on the data from TREC 2004 Robust Track (
trec.nist.gov <http://trec.nist.gov>).

Note: This data is provided for reference purposes only. Better performance 
on this specific data set will not necessarily be repeated on other data 
sets. 

               Combined Topic Set
Tag            MAP      P10      %no
Lucene QE      0.2433   0.3936   18.10%
Lucene gQE     0.2332   0.3984   14%
KB-R-FIS gQE   0.2322   0.4076   14%
Lucene         0.2      0.37     15%

*Tested on data from NIST TREC Robust Retrieval Track 2004 (
trec.nist.gov <http://trec.nist.gov>)*

*MAP* - mean average precision
*P10* - precision at 10 documents retrieved, averaged over topics
*%no* - percentage of topics with no relevant documents in the top 10 retrieved

*Lucene* - unmodified version 1.4.3
*Lucene QE* - Lucene with local query expansion 
*Lucene gQE* - Lucene system that utilized Rocchio's query expansion along 
with Google.
*KB-R-FIS gQE* - My Fuzzy Inference System that utilized Rocchio's query 
expansion along with Google.
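
For readers unfamiliar with the three measures in the table (MAP, P10, %no), the sketch below shows one way they could be computed from ranked result lists and relevance judgments. It is not the official TREC evaluation tool, and the data layout (dicts keyed by topic id) and function names are assumptions made for this example.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked list against a set of relevant ids."""
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant) if relevant else 0.0


def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top k results that are relevant (always divided by k)."""
    return sum(1 for doc_id in ranked[:k] if doc_id in relevant) / k


def evaluate(runs, qrels, k=10):
    """Return MAP, P@k and %no over all topics in qrels.

    runs  -- {topic: [doc_id, ...]} ranked results per topic
    qrels -- {topic: set of relevant doc_ids}
    """
    topics = list(qrels)
    if not topics:
        raise ValueError("qrels contains no topics")
    ap = [average_precision(runs.get(t, []), qrels[t]) for t in topics]
    pk = [precision_at_k(runs.get(t, []), qrels[t], k) for t in topics]
    # %no: share of topics with no relevant document in the top k.
    no_relevant = sum(1 for p in pk if p == 0)
    n = len(topics)
    return {
        "MAP": sum(ap) / n,
        "P10": sum(pk) / n,
        "%no": 100.0 * no_relevant / n,
    }


if __name__ == "__main__":
    qrels = {"t1": {"d1", "d3"}, "t2": {"d9"}}
    runs = {"t1": ["d1", "d2", "d3"], "t2": ["d4", "d5"]}
    print(evaluate(runs, qrels))
```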