lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michele Amoretti" <>
Subject Re: simple (?) question about scoring
Date Fri, 03 Nov 2006 06:07:30 GMT
I have a question: is the score for a document different if I have
only that document in my index, or if I have N documents?
If the answer is yes, I will put all N documents together, otherwise I
will evaluate them one by one.

Btw, I will ask the ws develepoer about how queries are interpreted by
the search engine.


On 11/3/06, Chris Hostetter <> wrote:
> : le list is not ordered (I do not know the details of the search
> : angine, I only have its result for a query)
> :
> : then I have this list of documents, which represents a subset of the corpus
> :
> : I have to rank the documents of the list, using your scoring algorithm
> In other words, out of a large copus C, this webservice hase
> told you that the documents comprising subset S is the top N matching
> documents for your query Q (where N << sizeof(C))
> your goal is to sort S as best as possible.
> You could try indexing all the docs in S in a Lucene RAMDirectory and then
> search on them, but my orriginal point about the score being
> fairly meaningless in an index of only 1 document still applies somewhat
> ... if all of the documents you get back allready have a lot in common
> 9they must have something in common or hte webservice wouldn't havre
> returned them in response to your query) it may be hard to get a
> meaningful document frequency of the words in your query.
> you also may run into confusion about what exactly your query "is" and
> wether or not your interpretation matches that of the webservice ... at a
> very simplisitc level, if the query is "Java Lucene" and your webservice
> interprets that as an "OR" query and you interpret that as an "AND" query,
> you might find that the scores you compute for all the docs are 0.
> If i were in your shoes, i'd try to work with whoever runs this webservice
> to make it return more usefull informatoin -- at the very least to return
> results in sorted order.
> -Hoss
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

Michele Amoretti, Ph.D.
Distributed Systems Group
Dipartimento di Ingegneria dell'Informazione
Università degli Studi di Parma

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message