lucene-java-user mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: Lucene's Mean Average Precision
Date Mon, 05 May 2008 00:02:00 GMT

On May 4, 2008, at 7:28 PM, DanaWhite wrote:

>
> I arrived at this MAP by modifying IndexFiles to use a StopAnalyzer
> and work
> in a way that was acceptable for TREC files.  The SearchFiles was
> modified
> to use a StopAnalyzer and output data in a trec_eval suitable format.
> trec_eval reports about 11% at this setting.
>
> I am not competing in TREC; I am just doing an evaluation of
> different search
> engines.
>
> At this point I am not going to add anything to Lucene to get a  
> higher MAP
> because I am trying to get a feel for its "out of the box"  
> performance.
>

It's kind of tough to say what an "out of the box" experience is in
Lucene, so I frankly wouldn't read too much into any numbers you arrive
at on TREC.  For instance, it is curious that you chose the
StopAnalyzer over the more "out of the box" StandardAnalyzer.  If
anything were out of the box, I guess it would be, given the name, the
StandardAnalyzer, but that isn't to say it will do any better; I
haven't tried it.  Most studies have also shown that stemming is
beneficial, but neither of those analyzers offers stemming.  Remember,
Lucene really is just the canvas, the paint, and the brushes; it's up to
you to do the actual painting.

Just my advice: make sure you are comparing apples to apples, or at
least as close as you can reasonably get.  I think you will find that
Lucene stacks up quite well.

Cheers,
Grant
