lucene-dev mailing list archives

From "José Ramón Pérez Agüera"
Subject Re: TREC Collection, NIST and Lucene
Date Mon, 20 Aug 2007 17:10:13 GMT
It is perfect :-) I think it might be worth sending a CC to the LDC as
well, because I believe they hold some kind of rights to the TREC collections.


On 8/20/07, Grant Ingersoll wrote:
> How does this sound:
> Dear ----,
> My name is Grant Ingersoll and I am a committer on the Lucene Java
> search library at the Apache Software
> Foundation (ASF).  I am not, however, writing in any official
> capacity as a representative of the ASF.  Perhaps at a later date,
> this will change, but for now I just want to keep things informal.
> I am, however, interested in starting a discussion about how open
> source projects like Lucene could participate in future TREC
> evaluations, or at least gain access to TREC data resources.  While
> the people involved in Lucene feel we have built a top notch search
> system, one of the things the community as a whole lacks is the
> ability to do formal evaluations like TREC offers, and thus research
> and development of new algorithms is hindered.  Granted, individuals
> may perform TREC evaluations given they have purchased a license to
> the data, but the community as a whole does not have this ability.
> I am wondering if there is some way in which we can arrange for open
> source projects to obtain access to the TREC collections.  The
> biggest barrier for projects like Lucene, obviously, is the fee that
> needs to be paid.  Furthermore, there are undoubtedly distribution
> and copyright concerns.  Yet, a part of me feels that we can work
> something out through creative licensing or some other novel approach
> that protects the appropriate interests, furthers TREC's mission and
> supports the vibrant Open Source community around Lucene and other
> search engines.  Perhaps it would be possible to require that any
> participant who wants the TREC data must prove that they are
> appropriately affiliated with an official open source project, as
> defined by the Open Source Initiative.
> Many tool vendors have similar licenses that allow open source
> participants to use their tool while working on open source projects
> [1].  Perhaps we could provide a similar approach to the TREC data.
> I feel this would benefit TREC substantially by providing an open
> baseline system for all the world to see, and I see that it fits very
> much with the motto of TREC: "encourage research in information
> retrieval from large text collections."  Naturally, it benefits
> Lucene by allowing Lucene to undertake more formal evaluation of
> relevance, etc.
> If you are interested in more background on this on the Lucene Java
> developers mailing list, please refer to
> search_string=TREC;#52022
> I look forward to hearing back from you and I would be more than
> happy to answer any questions you have.
> Sincerely,
> Grant Ingersoll
> [1] JetBrains, Atlassian, Clover Test Coverage, etc.
> -------
> -Grant
> On Aug 10, 2007, at 4:52 AM, Tom White wrote:
> >> Furthermore, I think it would
> >> encourage Lucene users/developers to think about relevance as much as
> >> we think about speed.
> >
> > +1
> >
> > However I think it would be much better to start by making informal
> > approaches as you suggest - the open letter seems to me to be
> > appropriate only as a last resort.
> >
> > Tom
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail:
> > For additional commands, e-mail:
> >

José Ramón Pérez Agüera

Dept. de Ingeniería del Software e Inteligencia Artificial
Despacho 411 tlf. 913947599
Facultad de Informática
Universidad Complutense de Madrid
