From: Ahmet Arslan
To: java-user@lucene.apache.org
Subject: Re: Relevancy tests
Date: Thu, 12 Jun 2014 11:01:42 -0700 (PDT)
Message-ID: <1402596102.28045.YahooMailNeo@web124703.mail.ne1.yahoo.com>

Hi,

Relevance judgments are labor-intensive and expensive. Some information retrieval forums (TREC, CLEF, etc.) provide these golden sets, but they are not public.

http://rosenfeldmedia.com/books/search-analytics/ talks about how to create a "golden set" for your top n queries.

Also, there is some work describing how to tune the parameters of a search system using click-through data.

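For anyone building such a golden set by hand, here is a minimal sketch of how a ranked result list could be scored against it. This is not Lucene API: the class GoldenSetEval, its method names, and the document IDs are all made up for illustration, and it assumes each document ID appears at most once in the ranking.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical helper: scores one ranked result list against hand-made relevance judgments. */
final class GoldenSetEval {

    /**
     * Precision at k: fraction of the top k ranks holding a judged-relevant document.
     * Ranks left empty because fewer than k results came back count as misses.
     */
    static double precisionAtK(List<String> ranked, Set<String> relevant, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        int hits = 0;
        for (int i = 0; i < Math.min(k, ranked.size()); i++) {
            if (relevant.contains(ranked.get(i))) {
                hits++;
            }
        }
        return (double) hits / k;
    }

    /**
     * Average precision: sum of the precision values at each rank where a relevant
     * document appears, divided by the total number of relevant documents.
     */
    static double averagePrecision(List<String> ranked, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        double sum = 0.0;
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) {
                hits++;
                sum += (double) hits / (i + 1);
            }
        }
        return sum / relevant.size();
    }

    public static void main(String[] args) {
        // Made-up judgments for one query and a made-up ranking returned for it.
        Set<String> relevant = Set.of("doc1", "doc4", "doc7");
        List<String> ranked = List.of("doc1", "doc2", "doc4", "doc9", "doc3");

        System.out.println("P@5 = " + precisionAtK(ranked, relevant, 5));
        System.out.println("AP  = " + averagePrecision(ranked, relevant));
    }
}
```

In practice the ranked list would come from running each of your top n queries against the index and collecting an ID field from the hits. Averaging averagePrecision over all judged queries gives mean average precision (MAP).
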
On Thursday, June 12, 2014 8:47 PM, Ivan Brusic wrote:

Perhaps more of an NLP question, but are there any tests regarding relevance for Lucene? Given an example corpus of documents, what are the golden sets for specific queries? The Wikipedia dump is used as a benchmarking tool for both indexing and querying in Lucene, but there are no metrics in terms of precision.

The Open Relevance project was closed yesterday (http://lucene.apache.org/openrelevance/), which is what prompted me to ask this question. Was the sub-project closed because others have found alternate solutions?

Relevancy is of course extremely context-dependent and subjective, but my hope is that there is an example catalog somewhere with defined golden sets.

Cheers,

Ivan