lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] [Commented] (LUCENE-3357) Unit and integration test cases for the new Similarities
Date Wed, 10 Aug 2011 12:18:27 GMT


Robert Muir commented on LUCENE-3357:

Robert: I'm on the NaN/Inf problems. As for the negative score, I'll leave it there for the
time being, since these Similarities should always return positive scores. I don't feel very
confident about this test myself, so I guess I'll remove it (or at least make it optional)
once all tests are successful.

Ahh, ok. I didn't know the sims should always return positive scores! If this is really the
case, then it's good to test for it.
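
A minimal sketch of the kind of check discussed above, in plain Java with no Lucene
dependency. The class and method names are hypothetical, and treating a score of exactly
zero as acceptable is an assumption; the thread only says scores should be positive and
free of NaN/Inf.

```java
public final class ScoreSanityCheck {

    private ScoreSanityCheck() {}

    /**
     * Returns true if the score is neither NaN nor infinite and is not negative.
     * Accepting 0 is an assumption made for this sketch.
     */
    public static boolean isValidScore(float score) {
        return !Float.isNaN(score) && !Float.isInfinite(score) && score >= 0f;
    }

    public static void main(String[] args) {
        float[] samples = {1.5f, 0f, -0.2f, Float.NaN, Float.POSITIVE_INFINITY};
        for (float s : samples) {
            System.out.println(s + " -> " + isValidScore(s));
        }
    }
}
```

A test could call such a helper on every score a Similarity produces, and the negative-score
part could be made optional by splitting it into a separate predicate.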

As for the PreFlex codec, I must admit I am not familiar with it, so I would be grateful for
a few pointers.

PreFlex codec emulates the Lucene 3.x index format, which does not support TotalTermFreq,
SumTotalTermFreq, SumDocFreq, etc. It will return -1 here.
Though I just realized: in some situations any codec can return -1 for these values, for example
if frequencies are omitted by the user (omitTFAP).
So currently, unfortunately, similarities have to deal with this case (and also with the case
where norms == null, because norms were omitted by the user via omitNorms!).
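
As a rough illustration of what "dealing with this case" can look like, here is a sketch in
plain Java. It does not use the Lucene API: the class, the method names, and the idea of
passing norms as an already-decoded float array are all hypothetical, and the fallback
values (an average length of 1, and the average length as the document length) are
assumptions chosen for the example.

```java
public final class MissingStatsSketch {

    private MissingStatsSketch() {}

    /**
     * Average field length. Falls back to 1 when sumTotalTermFreq is
     * unavailable (reported as -1) or when there are no documents.
     */
    public static float avgFieldLength(long sumTotalTermFreq, long maxDoc) {
        if (sumTotalTermFreq <= 0 || maxDoc <= 0) {
            return 1f; // statistic omitted or not supported by the codec
        }
        return (float) ((double) sumTotalTermFreq / maxDoc);
    }

    /**
     * Document length taken from decoded norms, or the average length
     * when norms are omitted (null). Hypothetical representation.
     */
    public static float docLength(float[] decodedNorms, int doc, float avgLength) {
        return decodedNorms == null ? avgLength : decodedNorms[doc];
    }

    public static void main(String[] args) {
        float avg = avgFieldLength(-1L, 10L);          // statistic missing
        System.out.println("avg length with missing stats: " + avg);
        System.out.println("doc length with omitted norms: " + docLength(null, 0, avg));
    }
}
```

The point of the fallbacks is only that the scoring formula never divides by a negative
or zero statistic and never dereferences null norms; which defaults are appropriate
depends on the Similarity.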

I've been working on the BM25 sim with all of this in mind; I'll commit an update to it as an

> Unit and integration test cases for the new Similarities
> --------------------------------------------------------
>                 Key: LUCENE-3357
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Sub-task
>          Components: core/query/scoring
>    Affects Versions: flexscoring branch
>            Reporter: David Mark Nemeskey
>            Assignee: David Mark Nemeskey
>            Priority: Minor
>              Labels: gsoc, gsoc2011, test
>             Fix For: flexscoring branch
>         Attachments: LUCENE-3357.patch, LUCENE-3357.patch, LUCENE-3357.patch, LUCENE-3357.patch,
>                      LUCENE-3357.patch, LUCENE-3357.patch, LUCENE-3357.patch, LUCENE-3357.patch
>
> Write test cases to test the new Similarities added in LUCENE-3220.
> Two types of test cases will be created:
>  * unit tests, in which mock statistics are provided to the Similarities and the score
>    is validated against hand calculations;
>  * integration tests, in which a small collection is indexed and then searched using
>    the Similarities.
> Performance tests will be performed in a separate issue.
