lucene-dev mailing list archives

From Vic Bancroft <bancr...@america.net>
Subject Re: Include BM25 in Lucene?
Date Tue, 17 Oct 2006 12:44:07 GMT
J.Zhu wrote:

>If I would like to contribute, what should I do? I am not a good Java
>developer myself though. Can I work with someone also interested?
>  
>
In some of my group's usage of Lucene over large document collections,
we have split the documents across several machines.  This has led to a
concern about whether the inverse document frequency is appropriate, since
the score seems to be dependent on the partitioning of documents over
indexing hosts.  We have not formulated an experiment to determine whether
it seriously affects our results, though it has been discussed.
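
To make the concern concrete, here is a minimal sketch, assuming the
classic default formula idf = 1 + ln(numDocs / (docFreq + 1)).  The class
name and the per-host counts are hypothetical, made up purely for
illustration.  It compares the IDF each host would compute from its own
statistics with the IDF computed from the pooled statistics.

```java
public class ShardIdfDemo {

    // Assumed classic IDF: 1 + ln(numDocs / (docFreq + 1)).
    static double idf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (double) (docFreq + 1));
    }

    public static void main(String[] args) {
        // Hypothetical statistics for one term on two indexing hosts.
        long[] numDocs = {1000, 1000};
        long[] docFreq = {9, 99};

        long totalDocs = 0;
        long totalDocFreq = 0;
        for (int i = 0; i < numDocs.length; i++) {
            // IDF as seen by a host that only knows its own partition.
            System.out.printf("host %d local idf = %.4f%n",
                    i, idf(docFreq[i], numDocs[i]));
            totalDocs += numDocs[i];
            totalDocFreq += docFreq[i];
        }

        // IDF computed from statistics pooled over all hosts.
        System.out.printf("global idf       = %.4f%n",
                idf(totalDocFreq, totalDocs));
    }
}
```

If the term is unevenly spread, the same term gets a different weight on
each host, so scores from different hosts are not directly comparable
unless the document frequencies are pooled before scoring.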

If someone could elaborate on how BM25 or some DFR algorithm would differ
from what (TF/IDF) is implemented in Lucene, I would be willing to help
translate that into Java as an indexing/searching option . . .
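
As a starting point for that discussion, here is a rough sketch of the
per-term weights only, not a Lucene implementation.  It assumes the
classic default tf = sqrt(freq) and idf = 1 + ln(numDocs / (docFreq + 1)),
and the textbook Okapi BM25 form with parameters k1 and b.  The class and
method names are hypothetical, and length norms, boosts and the coord
factor are left out.  DFR is not covered.

```java
public class Bm25VsTfIdf {

    // Assumed classic term-frequency factor: grows without bound.
    static double classicTf(double freq) {
        return Math.sqrt(freq);
    }

    // Assumed classic IDF: 1 + ln(numDocs / (docFreq + 1)).
    static double classicIdf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (double) (docFreq + 1));
    }

    // Textbook BM25 IDF: ln((N - n + 0.5) / (n + 0.5)).
    // Note that it goes negative when a term occurs in over half the docs.
    static double bm25Idf(long docFreq, long numDocs) {
        return Math.log((numDocs - docFreq + 0.5) / (docFreq + 0.5));
    }

    // Textbook BM25 term-frequency factor: saturates towards k1 + 1 and
    // normalises by document length relative to the average length.
    static double bm25Tf(double freq, double docLen, double avgDocLen,
                         double k1, double b) {
        double norm = k1 * (1.0 - b + b * docLen / avgDocLen);
        return freq * (k1 + 1.0) / (freq + norm);
    }

    public static void main(String[] args) {
        double k1 = 1.2;   // commonly quoted values, an assumption here
        double b = 0.75;
        double docLen = 100.0;
        double avgDocLen = 100.0;
        for (int freq : new int[] {1, 2, 4, 8, 16, 64}) {
            System.out.printf("freq=%2d  classic tf=%.3f  bm25 tf=%.3f%n",
                    freq, classicTf(freq),
                    bm25Tf(freq, docLen, avgDocLen, k1, b));
        }
    }
}
```

The main practical difference this shows is saturation: under BM25 each
extra occurrence of a term adds less and less, while sqrt(freq) keeps
growing.  BM25 also needs the average document length, which is one more
collection-wide statistic that would have to be shared across hosts.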

more,
l8r,
v


-- 
"The future is here. It's just not evenly distributed yet."
 -- William Gibson, quoted by Whitfield Diffie



