lucene-java-user mailing list archives

From Chris Hostetter
Subject Re: simple (?) question about scoring
Date Thu, 02 Nov 2006 21:20:23 GMT

: > .. Btw, I do not have an index, I have 1 Document, and 1 Query.

: Lucene scoring - - uses
: pre-computed statistics, location info, and the number of documents in the
: index (1 in your case). So some preparation is required before a
: (stand-alone) document can be scored against a query.

Doron's comments really just scratch the surface of a larger issue with
your question: Lucene is not an API for evaluating how similar a
"Document" is to a "Query". It is for finding Documents in a Corpus which
match a Query, and (optionally) using the "Score" to know which Documents
match better than other Documents.

For most of the various types of Queries that exist in Lucene, the score
is very dependent on how common the Terms involved are in the Corpus as a
whole -- if your Corpus consists of only 1 Document, then your scores are
going to be relatively meaningless.
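
To make the corpus dependence concrete, here is a minimal sketch of the
inverse document frequency (idf) factor, assuming the formula used by
Lucene's default TF-IDF similarity: log(numDocs / (docFreq + 1)) + 1. The
class name IdfSketch and the corpus sizes are made up for illustration, and
this is only one factor of the full score, not the whole calculation.

```java
public class IdfSketch {

    // Assumed formula (default TF-IDF similarity):
    //   idf = ln(numDocs / (docFreq + 1)) + 1
    // docFreq = number of documents containing the term
    // numDocs = total number of documents in the index
    static double idf(int docFreq, int numDocs) {
        return Math.log((double) numDocs / (docFreq + 1)) + 1.0;
    }

    public static void main(String[] args) {
        // Hypothetical corpus of one million documents:
        // a rare term gets a much larger weight than a common one.
        System.out.println("rare term:   " + idf(10, 1_000_000));
        System.out.println("common term: " + idf(500_000, 1_000_000));

        // Corpus of exactly one document: every term that matches
        // has docFreq = 1, so every term gets the same weight.
        System.out.println("one-doc corpus, any matching term: " + idf(1, 1));
    }
}
```

With a single document, docFreq is 1 for every term that matches, so idf
cannot distinguish a rare word from a common one.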

Perhaps what you are interested in is more of a substring-matching count,
or an Edit Distance type calculation? Can you give us a concrete
example of what type of "score" you are looking for, and what you mean when
you say "Query"?
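
For reference, the two alternatives mentioned above can be sketched in
plain Java without any index. This is only an illustration under my own
assumptions: the class and method names are hypothetical, the substring
count is non-overlapping and case-sensitive, and the edit distance is the
standard Levenshtein distance (insert, delete, substitute, each costing 1).

```java
public class MatchSketch {

    // Counts non-overlapping, case-sensitive occurrences of needle in text.
    static int countOccurrences(String text, String needle) {
        if (needle.isEmpty()) {
            return 0; // avoid an infinite loop on the empty string
        }
        int count = 0;
        int from = 0;
        while ((from = text.indexOf(needle, from)) != -1) {
            count++;
            from += needle.length(); // skip past this match
        }
        return count;
    }

    // Levenshtein distance using two rolling rows of the DP table.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences("to be or not to be", "be"));
        System.out.println(editDistance("kitten", "sitting"));
    }
}
```

Neither measure depends on any other documents, which is the main
difference from a Lucene score.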

