lucene-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3357) Unit and integration test cases for the new Similarities
Date Wed, 10 Aug 2011 12:18:27 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082301#comment-13082301 ]

Robert Muir commented on LUCENE-3357:
-------------------------------------

{quote}
Robert: I'm on the NaN/Inf problems. As for the negative score, I'll leave it there for the
time being, these Similarities should always return positive scores. I don't feel very confident
about this test myself, so I guess I'll remove it (or at least make it optional) once all
tests are successful.
{quote}

Ahh, ok. I didn't know the sims should always return positive scores! If that is really the
case, then it's good to test for it.
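
For illustration, such a check could look roughly like the sketch below. ScoreChecks and
isValidScore are hypothetical names, not part of the patch, and treating a score of exactly
0 as invalid is an assumption (loosen it to >= if zero scores are acceptable).

```java
// Hypothetical helper, not from the patch: a score counts as valid
// when it is neither NaN nor infinite and is strictly positive.
final class ScoreChecks {
    private ScoreChecks() {}

    static boolean isValidScore(float score) {
        if (Float.isNaN(score) || Float.isInfinite(score)) {
            return false; // the NaN/Inf problems mentioned in the quote above
        }
        return score > 0f; // assumption: zero is treated as a failure too
    }
}
```

A test could run this over every score a Similarity produces for the mock statistics and
fail on the first value that does not pass.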

{quote}
As for the PreFlex codec, I must admit I am not familiar with it, so I would be grateful for
a few pointers.
{quote}

The PreFlex codec emulates the Lucene 3.x index format, which does not support TotalTermFreq,
SumTotalTermFreq, SumDocFreq, etc. It returns -1 for these statistics.
Though I just realized: in some situations any codec can return -1 for these values, for example
if frequencies are omitted by the user (omitTFAP).
So currently, unfortunately, similarities have to deal with this case (and also with the case
where norms == null, because norms are omitted by the user (omitNorms)!).
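
To make the two cases concrete, here is a minimal standalone sketch of the kind of guards a
similarity might need. It does not use the real Similarity API: StatsFallback, avgFieldLength
and lengthNorm are hypothetical names, the fallback values (1 in both cases) are assumptions,
and the norm byte is assumed to hold the raw field length, which is not how norms are
actually encoded.

```java
// Hypothetical sketch, not the flexscoring API: guards for unavailable
// statistics (-1) and omitted norms (null) in a BM25-style computation.
final class StatsFallback {
    private StatsFallback() {}

    /** Average field length; falls back to 1 when the statistic is unavailable (-1). */
    static float avgFieldLength(long sumTotalTermFreq, long maxDoc) {
        if (sumTotalTermFreq <= 0 || maxDoc <= 0) {
            return 1f; // assumption: a neutral average when the codec returns -1
        }
        return (float) ((double) sumTotalTermFreq / maxDoc);
    }

    /** BM25 length normalization (1 - b + b * dl / avgdl); 1 when norms are omitted. */
    static float lengthNorm(byte[] norms, int doc, float avgLen, float b) {
        if (norms == null) {
            return 1f; // omitNorms: behave as if there were no length normalization
        }
        float docLen = norms[doc] & 0xFF; // assumption: byte stores the raw length
        return 1f - b + b * (docLen / avgLen);
    }
}
```

With these guards, avgFieldLength(-1, 10) yields 1 instead of a negative average, and a null
norms array yields a neutral factor instead of a NullPointerException.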

I've been working on the BM25 sim with all of this in mind; I'll commit an update to it as an
example.

> Unit and integration test cases for the new Similarities
> --------------------------------------------------------
>
>                 Key: LUCENE-3357
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3357
>             Project: Lucene - Java
>          Issue Type: Sub-task
>          Components: core/query/scoring
>    Affects Versions: flexscoring branch
>            Reporter: David Mark Nemeskey
>            Assignee: David Mark Nemeskey
>            Priority: Minor
>              Labels: gsoc, gsoc2011, test
>             Fix For: flexscoring branch
>
>         Attachments: LUCENE-3357.patch, LUCENE-3357.patch, LUCENE-3357.patch, LUCENE-3357.patch,
>                      LUCENE-3357.patch, LUCENE-3357.patch, LUCENE-3357.patch, LUCENE-3357.patch
>
>
> Write test cases to test the new Similarities added in [LUCENE-3220|https://issues.apache.org/jira/browse/LUCENE-3220].
> Two types of test cases will be created:
>  * unit tests, in which mock statistics are provided to the Similarities and the score
>    is validated against hand calculations;
>  * integration tests, in which a small collection is indexed and then searched using
>    the Similarities.
> Performance tests will be performed in a separate issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        


