lucene-dev mailing list archives

From "Robert Muir (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3878) CheckIndex should check deleted documents too
Date Fri, 16 Mar 2012 16:29:39 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231347#comment-13231347 ]

Robert Muir commented on LUCENE-3878:
-------------------------------------

I'm still nervous about the patch losing coverage for postings.

The problem with the previous second pass was that we only did minimal checks with a null liveDocs.

Ideally, I think we factor these checks into separate methods that take liveDocs and return
stats. If there are deletions, we do the full check with the real liveDocs too, and assert
stats <= rawStats.

It will be heavy, but I think we can do it from this patch.
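
A rough sketch of that shape, for illustration only. Everything here is a simplified, hypothetical
model: Posting, Stats, checkPostings and checkAll are made-up names, and a java.util.BitSet stands
in for liveDocs. None of it is the actual CheckIndex code or the patch.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical, simplified model; none of these types are Lucene classes.
class CheckPostingsSketch {

    /** One document's entry in a term's postings: doc ID plus its positions. */
    static final class Posting {
        final int docId;
        final int[] positions;

        Posting(int docId, int... positions) {
            this.docId = docId;
            this.positions = positions;
        }
    }

    /** Counts gathered by one pass over the postings. */
    static final class Stats {
        final long docCount;
        final long positionCount;

        Stats(long docCount, long positionCount) {
            this.docCount = docCount;
            this.positionCount = positionCount;
        }
    }

    /**
     * Runs the same checks regardless of liveDocs and returns stats.
     * liveDocs == null means "visit every document, deleted or not".
     */
    static Stats checkPostings(List<Posting> postings, BitSet liveDocs) {
        long docCount = 0;
        long positionCount = 0;
        int lastDoc = -1;
        for (Posting p : postings) {
            if (liveDocs != null && !liveDocs.get(p.docId)) {
                continue; // deleted doc: skipped only when liveDocs is supplied
            }
            if (p.docId <= lastDoc) {
                throw new IllegalStateException("doc " + p.docId + " out of order");
            }
            lastDoc = p.docId;
            for (int pos : p.positions) {
                if (pos < 0) {
                    throw new IllegalStateException(
                        "doc " + p.docId + " has negative position " + pos);
                }
                positionCount++;
            }
            docCount++;
        }
        return new Stats(docCount, positionCount);
    }

    /** Raw pass first; if there are deletions, a second full pass with the real liveDocs. */
    static Stats checkAll(List<Posting> postings, BitSet liveDocs) {
        Stats rawStats = checkPostings(postings, null);
        if (liveDocs != null) {
            Stats stats = checkPostings(postings, liveDocs);
            if (stats.docCount > rawStats.docCount
                    || stats.positionCount > rawStats.positionCount) {
                throw new IllegalStateException("live stats exceed raw stats");
            }
        }
        return rawStats;
    }

    public static void main(String[] args) {
        // Doc 1 is deleted and carries a negative position.
        List<Posting> postings = Arrays.asList(new Posting(0, 1, 4), new Posting(1, -1));
        BitSet liveDocs = new BitSet(2);
        liveDocs.set(0);

        Stats live = checkPostings(postings, liveDocs);
        System.out.println("live-only pass saw " + live.docCount + " doc(s), no error");
        try {
            checkAll(postings, liveDocs);
        } catch (IllegalStateException e) {
            System.out.println("raw pass caught: " + e.getMessage());
        }
    }
}
```

In this toy model a pass that only sees live docs walks straight past the bad position in the
deleted document, while the raw pass with liveDocs == null trips over it. The stats comparison is
a cheap cross-check that the two passes agree on what they visited.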
                
> CheckIndex should check deleted documents too
> ---------------------------------------------
>
>                 Key: LUCENE-3878
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3878
>             Project: Lucene - Java
>          Issue Type: Task
>    Affects Versions: 4.0
>            Reporter: Robert Muir
>             Fix For: 4.0
>
>         Attachments: LUCENE-3878.patch, LUCENE-3878.patch
>
>
> In 4.0 liveDocs are passed down to the enums, so deleted docs are not so special.
> So I think CheckIndex should not pass the liveDocs down to the enums when checking;
> it should pass liveDocs=null and check all the postings. It already does this separately
> to collect stats, I think to compare against the term/collection statistics. But we should
> just clean this up and only use one enum.
> For example, LUCENE-3876 is a case where we were actually making a corrupt index
> (a position was negative), but because the document in question was deleted, CheckIndex
> didn't detect this.
> This could have caused problems if someone just passed null for liveDocs (maybe they
> are doing something where it's not so important to take deletions into account).


        


