lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: precision and recall in lucene
Date Mon, 29 Nov 2010 15:35:11 GMT
Well, I guess I can answer your original question with "no". There's
no Lucene method that will give you these because they aren't
defined until someone supplies relevance judgments. If you can answer
the question "given a corpus, a set of queries, and the correct ordering
of the relevant documents for each query, how close does Lucene come to
that ordering?", then you could calculate whether all the docs that
should have been returned were found (recall) and whether the documents
returned contained only the documents that should have been returned
(precision).
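
As a rough illustration of that calculation, here is a minimal sketch in
plain Java. It assumes you already have, for one query, a hand-built set of
relevant document IDs and the list of IDs your search returned. The class
name, method names, and document IDs are hypothetical and nothing here is a
Lucene API; it only shows the set arithmetic.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene: computes set-based precision
// and recall for one query from manually supplied relevance judgments.
public class PrecisionRecall {

    // Count the distinct returned documents that are judged relevant.
    public static int relevantReturned(List<String> returned, Set<String> relevant) {
        Set<String> hits = new HashSet<>(returned); // de-duplicate returned IDs
        hits.retainAll(relevant);                   // keep only the relevant ones
        return hits.size();
    }

    // Precision: fraction of the returned documents that are relevant.
    public static double precision(List<String> returned, Set<String> relevant) {
        int distinctReturned = new HashSet<>(returned).size();
        if (distinctReturned == 0) {
            return 0.0; // convention chosen here for an empty result list
        }
        return (double) relevantReturned(returned, relevant) / distinctReturned;
    }

    // Recall: fraction of the relevant documents that were returned.
    public static double recall(List<String> returned, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0; // convention chosen here when nothing is judged relevant
        }
        return (double) relevantReturned(returned, relevant) / relevant.size();
    }

    public static void main(String[] args) {
        // Made-up judgments and results for a single query.
        Set<String> relevant = Set.of("d1", "d2", "d3", "d4");
        List<String> returned = List.of("d1", "d7", "d3", "d9", "d4");

        // 3 of the 5 returned docs are relevant; 3 of the 4 relevant docs were found.
        System.out.println("precision = " + precision(returned, relevant));
        System.out.println("recall    = " + recall(returned, relevant));
    }
}
```

This ignores ordering entirely. If ranking matters for your problem, you
would evaluate only the top k returned IDs, which is a different measure.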

But a lot of effort in Solr/Lucene tuning is tweaking returned results
to make precision and recall "better", where "better" is understood
relative to a particular problem space and isn't well defined in
the abstract (and perhaps can't be).

Best
Erick

On Mon, Nov 29, 2010 at 8:40 AM, Yakob <jacobian@opensuse-id.org> wrote:

> On 11/29/10, Erick Erickson <erickerickson@gmail.com> wrote:
> > Define precision. Define recall. Define measure <G>....
> >
> > Sorry to give in to my impulses, but this question is so broad it's
> > unanswerable. Try looking at the Text REtrieval Conference for instance.
> > Lots of very bright people spend significant amounts of their careers
> > trying to just define what these mean. Much less how to measure them.
> >
> > And what "good" precision and recall are varies
> > with the search space. And the users. An academic researcher may
> > be willing to spend days finding the one paper out there that speaks
> > to a very specific question. Your average web user won't click past
> > the 2nd page, maybe not the 1st.
> >
> > So perhaps you can tell us what it is you want these measures
> > for and maybe we can come up with some answers that are actually
> > helpful...
> >
> > Best
> > Erick
>
> Well, when I read the "Lucene in Action" ebook I came across this
> passage:
>
> "Searching is the process of looking up words in an index to find
> documents where they appear. The quality of a search is typically
> described using precision and recall metrics. Recall measures how
> well the search system finds relevant documents, whereas precision
> measures how well the system filters out the irrelevant documents."
>
> I am just wondering how to measure the precision and recall metrics
> in Lucene. I want to include an analysis of precision and recall in
> my thesis, which happens to use Lucene as its framework. :-)
>
>
>
> --
> http://jacobian.web.id
