lucene-dev mailing list archives

From Dawid Weiss
Subject Re: TREC Collection, NIST and Lucene
Date Mon, 20 Aug 2007 19:01:43 GMT

I like it too. And I'm wondering what the response to this will be -- it will 
in a way show if TREC really lives up to its mission, won't it?


Grant Ingersoll wrote:
> How does this sound:
> Dear ----,
> My name is Grant Ingersoll and I am a committer on the Lucene Java search 
> library at the Apache Software Foundation 
> (ASF).  I am not, however, writing in any official capacity as a 
> representative of the ASF.  Perhaps at a later date, this will change, 
> but for now I just want to keep things informal.
> I am, however, interested in starting a discussion about how open source 
> projects like Lucene could participate in future TREC evaluations, or at 
> least gain access to TREC data resources.  While the people involved in 
> Lucene feel we have built a top notch search system, one of the things 
> the community as a whole lacks is the ability to do formal evaluations 
> like TREC offers, and thus research and development of new algorithms is 
> hindered.  Granted, individuals may perform TREC evaluations given they 
> have purchased a license to the data, but the community as a whole does 
> not have this ability.
> I am wondering if there is some way in which we can arrange for open 
> source projects to obtain access to the TREC collections.  The biggest 
> barrier for projects like Lucene, obviously, is the fee that needs to be 
> paid.  Furthermore, there are undoubtedly distribution and copyright 
> concerns.  Yet, a part of me feels that we can work something out 
> through creative licensing or some other novel approach that protects 
> the appropriate interests, furthers TREC's mission and supports the 
> vibrant Open Source community around Lucene and other search engines.  
> Perhaps it would be possible to require that any participant who wants 
> the TREC data must prove that they are appropriately affiliated with an 
> official open source project, as defined by the Open Source Initiative.  
> Many tool vendors have similar licenses 
> that allow open source participants to use their tool while working on 
> open source projects[1].  Perhaps we could provide a similar approach to 
> the TREC data.
> I feel this would benefit TREC substantially, by providing an open, 
> baseline system for all the world to see, and I see that it fits very 
> much with the motto of TREC: "encourage research in information 
> retrieval from large text collections."  Naturally, it benefits Lucene 
> by allowing Lucene to undertake more formal evaluation of relevance, etc.
> If you are interested in more background on this on the Lucene Java 
> developers mailing list, please refer to

> I look forward to hearing back from you and I would be more than happy 
> to answer any questions you have.
> Sincerely,
> Grant Ingersoll
> [1] JetBrains, Atlassian, Clover Test Coverage, etc.
> -------
> -Grant
> On Aug 10, 2007, at 4:52 AM, Tom White wrote:
>>> Furthermore, I think it would
>>> encourage Lucene users/developers to think about relevance as much as
>>> we think about speed.
>> +1
>> However I think it would be much better to start by making informal
>> approaches as you suggest - the open letter seems to me to be
>> appropriate only as a last resort.
>> Tom