lucene-dev mailing list archives

From David Spencer <dave-lucene-...@tropo.com>
Subject URL to compare 2 Similarity's ready-- Re: Scoring benchmark evaluation. Was RE: How to proceed with Bug 31841 - MultiSearcher problems with Similarity.docFreq() ?
Date Mon, 31 Jan 2005 19:35:00 GMT

I worked w/ Chuck to put up a test page that shows search results from 2 
versions of Similarity side by side.

URL here:

	http://www.searchmorph.com/kat/wikipedia-similarity.jsp

Weblog entry here w/ some more details:

	http://www.searchmorph.com/weblog/index.php?id=46


Briefly, the page uses 2 indexes of Wikipedia.
The first index is built with all-default Lucene code, and searches 
against it use the default query parser too.

The second index is built with the alternative Similarity implementation 
Chuck suggested, and searches against it use that same Similarity plus 
the query parser (DistributingMultiFieldQueryParser) he has proposed.

The page lets you tune parameters to his Similarity impl so you can see 
the effect of different weights.
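
To make "tuning weights" concrete, here is a toy sketch in plain Java. 
It is not Chuck's Similarity and it does not use the Lucene API: every 
class and method name in it is made up for illustration. The "classic" 
scorer borrows the tf, idf and lengthNorm formulas from Lucene's 
DefaultSimilarity, the "tunable" one adds a hypothetical length exponent, 
and the scoring is simplified (no coord, queryNorm or boosts). It ranks 
the same tiny corpus with both scorers so the orderings can be compared.

```java
import java.util.*;

// Toy side-by-side ranking comparison. NOT Lucene code: the scorers only
// mimic the shape of a Similarity (tf, idf, lengthNorm). All names here
// are hypothetical.
class SimilarityComparison {

    interface Scorer {
        double tf(int freq);
        double idf(int docFreq, int numDocs);
        double lengthNorm(int numTerms);
    }

    // Formulas modeled on Lucene's DefaultSimilarity.
    static Scorer classic() {
        return tunable(0.5); // 1 / numTerms^0.5 == 1 / sqrt(numTerms)
    }

    // Assumed tunable variant: lengthExponent sets how hard long documents
    // are penalized (0.0 = no penalty, 0.5 = the classic behavior).
    static Scorer tunable(final double lengthExponent) {
        return new Scorer() {
            public double tf(int freq) { return Math.sqrt(freq); }
            public double idf(int docFreq, int numDocs) {
                return Math.log((double) numDocs / (docFreq + 1)) + 1.0;
            }
            public double lengthNorm(int numTerms) {
                return Math.pow(numTerms, -lengthExponent);
            }
        };
    }

    static int count(String[] doc, String term) {
        int n = 0;
        for (String t : doc) if (t.equals(term)) n++;
        return n;
    }

    static int docFreq(List<String[]> docs, String term) {
        int n = 0;
        for (String[] doc : docs) if (count(doc, term) > 0) n++;
        return n;
    }

    // Simplified score: sum of tf * idf over query terms, times lengthNorm.
    static double score(Scorer s, String[] query, String[] doc, List<String[]> docs) {
        double sum = 0.0;
        for (String term : query) {
            int freq = count(doc, term);
            if (freq == 0) continue;
            sum += s.tf(freq) * s.idf(docFreq(docs, term), docs.size());
        }
        return sum * s.lengthNorm(doc.length);
    }

    // Returns document indexes ordered by descending score.
    static List<Integer> rank(Scorer s, String[] query, List<String[]> docs) {
        final double[] scores = new double[docs.size()];
        List<Integer> order = new ArrayList<>();
        for (int d = 0; d < docs.size(); d++) {
            scores[d] = score(s, query, docs.get(d), docs);
            order.add(d);
        }
        order.sort((a, b) -> Double.compare(scores[b], scores[a]));
        return order;
    }

    // Doc 0 is short and matches both terms; doc 1 is long and repeats one term.
    static List<String[]> sampleDocs() {
        String longDoc = String.join(" ", Collections.nCopies(9, "russian"))
                + " history empire wars";
        return Arrays.asList(
                "russian politics".split(" "),
                longDoc.split(" "),
                "politics of france".split(" "));
    }

    public static void main(String[] args) {
        List<String[]> docs = sampleDocs();
        String[] query = {"russian", "politics"};
        System.out.println("classic:      " + rank(classic(), query, docs));
        System.out.println("tunable(0.0): " + rank(tunable(0.0), query, docs));
    }
}
```

With the length penalty switched off, the long document that just repeats 
"russian" moves ahead of the short one that matches both query terms. 
That is the kind of shift the side-by-side page is meant to make visible.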

One test that seems to show how the new code performs better is the 
search for "russian politics" where the results on the right seem more 
relevant:

http://www.searchmorph.com/kat/wikipedia-similarity.jsp?s=russian+politics

Chuck Williams wrote:

> Dave, are you using MultiFieldQueryParser and DefaultSimilarity for the
> vanilla implementation?
> 
> It's important to know what we are comparing...
> 
> Chuck
> 
>   > -----Original Message-----
>   > From: David Spencer [mailto:dave-lucene-dev@tropo.com]
>   > Sent: Friday, January 28, 2005 3:38 PM
>   > To: Lucene Developers List
>   > Subject: Re: Scoring benchmark evaluation. Was RE: How to proceed with
>   > Bug 31841 - MultiSearcher problems with Similarity.docFreq() ?
>   > 
>   > Daniel Naber wrote:
>   > 
>   > > On Friday 28 January 2005 22:45, Chuck Williams wrote:
>   > >
>   > >
>   > >>The fact that it requires all terms in all
>   > >>fields is part of the problem.  Once that is addressed, another problem
>   > >>is that Lucene does not provide a good mechanis
>   > >
>   > >
>   > > That's fixed in CVS, so maybe the CVS version should be used for the
>   > > evaluation. I think it should be robust.
>   > 
>   > Hmmm, is it safe to assume I can build the index w/ lucene-1.4.3.jar but
>   > deploy the webapp for searching w/ lucene-1.5-rc1-dev.jar?
>   > 
>   > And is the current code supposed to build with so many deprecated
>   > warnings?
>   > 
>   > - Dave
>   > 
>   > >
>   > > Regards
>   > >  Daniel
>   > >
>   > 
>   > 
>   > ---------------------------------------------------------------------
>   > To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
>   > For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
> 

