lucene-java-user mailing list archives

From "Zeynep P." <>
Subject Lucene 4.0 benchmark bug?
Date Wed, 17 Oct 2012 13:48:05 GMT
Hi to all,

I started to use benchmark 4.0 to create submission report files with the
following code:
        BufferedReader br = new BufferedReader(fr);
        QualityQuery qqs[] = qReader.readQueries(br);  
        QualityQueryParser qqParser = new SimpleQQParser("title", "body");  
        QualityBenchmark qrun = new QualityBenchmark(qqs, qqParser, searcher, "docname");
        SubmissionReport submitLog = new SubmissionReport(loggertest,
        QualityStats stats[] = qrun.execute(null, submitLog, null);

My index was created with Lucene 3.6, and I use the LA Times topics 401-450.
With benchmark 3.6 there is no problem. With benchmark 4.0, however, results
are returned only for the first query, 401 ("foreign minorities, Germany").
When I debug the code, the boolean query generated in SimpleQQParser is
"body:foreign", without the other keywords. Stepping further, the problem
seems to arise in QueryParserBase.newFieldQuery, which returns null for the
remaining keywords of that query and for all of the other queries. I patched
the code for my own ad hoc use, but I don't know how to fix it properly. Has
this happened to anyone else?

Second problem: for the same collection I get MAP = 0.17 with the default
similarity and MAP = 0.07 with the Lucene 4.0 BM25 similarity (b=0.75,
k1=1.2). I got MAP = 0.14
with BM25 implemented based on
However, the literature reports a MAP of around 0.25 for this collection
with the BM25 scoring function. Has anyone evaluated the different
similarities and can share the results?
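
For reference when comparing BM25 variants, here is a minimal sketch of the
textbook BM25 weight for a single query term, using the b and k1 values
quoted above. It is not Lucene's code: the class name Bm25Sketch, the method
names, and the sample numbers in main are hypothetical, and the idf variant
(the one with "1 +" inside the logarithm) is an assumption. It also uses the
exact document length, which an index may only store approximately.

```java
// Hypothetical illustration: textbook BM25 weight of one query term in one document.
// Not taken from Lucene; all names and sample values are assumptions.
final class Bm25Sketch {

    // Inverse document frequency, assumed variant: ln(1 + (N - n + 0.5) / (n + 0.5)).
    // docCount = N (documents in the collection), docFreq = n (documents containing the term).
    static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    // BM25 weight of a term occurring tf times in a document of length docLen.
    // b controls how strongly length is normalized; k1 controls tf saturation.
    static double termScore(double tf, double docLen, double avgDocLen,
                            long docCount, long docFreq, double k1, double b) {
        double lengthNorm = 1.0 - b + b * (docLen / avgDocLen);
        double tfPart = tf * (k1 + 1.0) / (tf + k1 * lengthNorm);
        return idf(docCount, docFreq) * tfPart;
    }

    public static void main(String[] args) {
        double k1 = 1.2;   // value quoted in the message
        double b = 0.75;   // value quoted in the message
        // Made-up statistics: same term frequency, average-length vs. long document.
        System.out.println(termScore(2, 300, 300, 100000, 500, k1, b));
        System.out.println(termScore(2, 900, 300, 100000, 500, k1, b));
    }
}
```

Computing this by hand for one (term, document) pair and comparing it with
the score each implementation reports may help locate where two BM25
variants start to diverge, for example in the idf or in the length
normalization.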

Best Regards,
