Date: Mon, 29 Nov 2010 20:40:32 +0700
Subject: Re: precision and recall in lucene
From: Yakob
To: java-user@lucene.apache.org

On 11/29/10, Erick Erickson wrote:
> Define precision. Define recall. Define measure ....
>
> Sorry to give in to my impulses, but this question is so broad it's
> unanswerable.
> Try looking at the Text REtrieval Conference for instance.
> Lots of very bright people spend significant amounts of their careers
> trying to just define what these mean. Much less how to measure them.
>
> And what "good" precision and recall are varies
> with the search space. And the users. An academic researcher may
> be willing to spend days finding the one paper out there that speaks
> to a very specific question. Your average web user won't click past
> the 2nd page, maybe not the 1st.
>
> So perhaps you can tell us what it is you want these measures
> for and maybe we can come up with some answers that are actually
> helpful...
>
> Best
> Erick

Well, when I read the ebook of "Lucene in Action" I came across this passage:

"Searching is the process of looking up words in an index to find
documents where they appear. The quality of a search is typically
described using precision and recall metrics. Recall measures how well
the search system finds relevant documents, whereas precision measures
how well the system filters out the irrelevant documents."

I am wondering how to measure precision and recall in Lucene. I want to
include an analysis of precision and recall in my thesis, which happens
to use Lucene as its framework. :-)

--
http://jacobian.web.id
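
The definitions quoted from "Lucene in Action" can be turned into a small calculation once you have, for each query, the list of documents the search returned and a hand-made list of documents judged relevant. The sketch below is only an illustration in Python, not Lucene code: the function name precision_recall and the document IDs are hypothetical, and it assumes the relevance judgments already exist (producing them is the hard part Erick describes).

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: document IDs returned by the search (hypothetical IDs)
    relevant:  document IDs judged relevant by a human (assumed to exist)
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)

    # Documents that were both returned and judged relevant.
    hits = len(retrieved_set & relevant_set)

    # Precision: fraction of returned documents that are relevant.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    # Recall: fraction of relevant documents that were returned.
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical example: 4 documents returned, 3 judged relevant, 2 in common.
    returned = ["d1", "d2", "d3", "d4"]
    judged_relevant = {"d1", "d3", "d5"}
    p, r = precision_recall(returned, judged_relevant)
    print("precision = %.2f, recall = %.2f" % (p, r))
```

With these made-up inputs, two of the four returned documents are relevant (precision 0.5) and two of the three relevant documents were found (recall about 0.67). For a thesis you would typically average such values over a set of test queries.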