lucene-dev mailing list archives

From "José Ramón Pérez Agüera" <jose.agu...@fdi.ucm.es>
Subject Re: TREC Collection, NIST and Lucene
Date Mon, 20 Aug 2007 17:10:13 GMT
It is perfect :-) I think it might be worth sending a CC to the
LDC, because I think they hold some kind of rights on the TREC collections.

http://trec.nist.gov/data/docs_eng.html

http://www.ldc.upenn.edu/

jose

On 8/20/07, Grant Ingersoll <gsingers@apache.org> wrote:
>
> How does this sound:
>
> Dear ----,
>
> My name is Grant Ingersoll and I am a committer on the Lucene Java
> search library (http://lucene.apache.org) at the Apache Software
> Foundation (ASF).  I am not, however, writing in any official
> capacity as a representative of the ASF.  Perhaps at a later date,
> this will change, but for now I just want to keep things informal.
>
> I am, however, interested in starting a discussion about how open
> source projects like Lucene could participate in future TREC
> evaluations, or at least gain access to TREC data resources.  While
> the people involved in Lucene feel we have built a top notch search
> system, one of the things the community as a whole lacks is the
> ability to do formal evaluations like TREC offers, and thus research
> and development of new algorithms is hindered.  Granted, individuals
> may perform TREC evaluations given they have purchased a license to
> the data, but the community as a whole does not have this ability.
>
> I am wondering if there is some way in which we can arrange for open
> source projects to obtain access to the TREC collections.  The
> biggest barrier for projects like Lucene, obviously, is the fee that
> needs to be paid.  Furthermore, there are undoubtedly distribution
> and copyright concerns.  Yet, a part of me feels that we can work
> something out through creative licensing or some other novel approach
> that protects the appropriate interests, furthers TREC's mission and
> supports the vibrant Open Source community around Lucene and other
> search engines.  Perhaps it would be possible to require that any
> participant who wants the TREC data must prove that they are
> appropriately affiliated with an official open source project, as
> defined by the Open Source Initiative (http://www.opensource.org).
> Many tool vendors have similar licenses that allow open source
> participants to use their tool while working on open source projects
> [1].  Perhaps we could provide a similar approach to the TREC data.
>
> I feel this would benefit TREC substantially by providing an open
> baseline system for all the world to see, and I see that it fits very
> much with the motto of TREC: "...to encourage research in information
> retrieval from large text collections."  Naturally, it benefits
> Lucene by allowing Lucene to undertake more formal evaluation of
> relevance, etc.
>
> If you are interested in more background on this on the Lucene Java
> developers mailing list, please refer to
> http://www.gossamer-threads.com/lists/lucene/java-dev/52022?search_string=TREC;#52022
>
> I look forward to hearing back from you and I would be more than
> happy to answer any questions you have.
>
> Sincerely,
> Grant Ingersoll
>
> [1] JetBrains, Atlassian, Clover Test Coverage, etc.
>
> -------
>
> -Grant
>
> On Aug 10, 2007, at 4:52 AM, Tom White wrote:
>
> >> Furthermore, I think it would
> >> encourage Lucene users/developers to think about relevance as much as
> >> we think about speed.
> >
> > +1
> >
> > However I think it would be much better to start by making informal
> > approaches as you suggest - the open letter seems to me to be
> > appropriate only as a last resort.
> >
> > Tom
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-dev-help@lucene.apache.org
> >


-- 
José Ramón Pérez Agüera

Dept. de Ingeniería del Software e Inteligencia Artificial
Despacho 411 tlf. 913947599
Facultad de Informática
Universidad Complutense de Madrid
