From: José Ramón Pérez Agüera
Date: Mon, 20 Aug 2007 19:10:13 +0200
To: java-dev@lucene.apache.org
Subject: Re: TREC Collection, NIST and Lucene

It is perfect :-)

I think it might be worth sending a CC to the LDC, because I think they hold
some kind of rights to the TREC collections.

http://trec.nist.gov/data/docs_eng.html
http://www.ldc.upenn.edu/

jose

On 8/20/07, Grant Ingersoll wrote:
>
> How does this sound:
>
> Dear ----,
>
> My name is Grant Ingersoll and I am a committer on the Lucene Java
> search library (http://lucene.apache.org) at the Apache Software
> Foundation (ASF).  I am not, however, writing in any official
> capacity as a representative of the ASF.  Perhaps at a later date,
> this will change, but for now I just want to keep things informal.
>
> I am, however, interested in starting a discussion about how open
> source projects like Lucene could participate in future TREC
> evaluations, or at least gain access to TREC data resources.  While
> the people involved in Lucene feel we have built a top-notch search
> system, one of the things the community as a whole lacks is the
> ability to do formal evaluations like TREC offers, and thus research
> and development of new algorithms is hindered.  Granted, individuals
> may perform TREC evaluations given they have purchased a license to
> the data, but the community as a whole does not have this ability.
>
> I am wondering if there is some way in which we can arrange for open
> source projects to obtain access to the TREC collections.  The
> biggest barrier for projects like Lucene, obviously, is the fee that
> needs to be paid.  Furthermore, there are undoubtedly distribution
> and copyright concerns.  Yet, a part of me feels that we can work
> something out through creative licensing or some other novel approach
> that protects the appropriate interests, furthers TREC's mission and
> supports the vibrant Open Source community around Lucene and other
> search engines.  Perhaps it would be possible to require that any
> participant who wants the TREC data must prove that they are
> appropriately affiliated with an official open source project, as
> defined by the Open Source Initiative (http://www.opensource.org).
> Many tool vendors have similar licenses that allow open source
> participants to use their tool while working on open source projects
> [1].  Perhaps we could provide a similar approach to the TREC data.
>
> I feel this would benefit TREC substantially, by providing an open,
> baseline system for all the world to see and I see that it fits very
> much with the motto of TREC "...to encourage research in information
> retrieval from large text collections."  Naturally, it benefits
> Lucene by allowing Lucene to undertake more formal evaluation of
> relevance, etc.
>
> If you are interested in more background on this on the Lucene Java
> developers mailing list, please refer to
> http://www.gossamer-threads.com/lists/lucene/java-dev/52022?search_string=TREC;#52022
>
> I look forward to hearing back from you and I would be more than
> happy to answer any questions you have.
>
> Sincerely,
> Grant Ingersoll
>
> [1] JetBrains, Atlassian, Clover Test Coverage, etc.
>
> -------
>
> -Grant
>
> On Aug 10, 2007, at 4:52 AM, Tom White wrote:
>
> >> Furthermore, I think it would
> >> encourage Lucene users/developers to think about relevance as much as
> >> we think about speed.
> >
> > +1
> >
> > However I think it would be much better to start by making informal
> > approaches as you suggest - the open letter seems to me to be
> > appropriate only as a last resort.
> >
> > Tom
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-dev-help@lucene.apache.org
> >
>

-- 
José Ramón Pérez Agüera
Dept. de Ingeniería del Software e Inteligencia Artificial
Despacho 411 tlf. 913947599
Facultad de Informática
Universidad Complutense de Madrid