lucene-dev mailing list archives

From "Shai Erera (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3403) Term vectors missing after addIndexes + optimize
Date Fri, 26 Aug 2011 11:00:29 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091709#comment-13091709 ]

Shai Erera commented on LUCENE-3403:
------------------------------------

You're right, it does not happen on trunk. I still want to commit the test cases to trunk
too, so that we've got that covered there as well. Therefore I think I should keep the 4.0
fix version?

The problem is that SegmentMerger receives its FieldInfos from DocumentsWriter (DW), and it
decides whether to set hasVectors according to what it receives. When you call addDocument(),
DW has FieldInfos, but when you only call addIndexes(), DW doesn't.

In fact, the field infos are read only when the IndexWriter (IW) is opened, so even if I call
addIndexes(), commit(), addIndexes(), the field infos would still be missing. A workaround I
see for now is to call addIndexes(), close(), open a new IW, and then continue with addIndexes()
or optimize(). It is ugly, but it's a workaround until we release a new version. I'll try that.
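
To make the mechanism above concrete, here is a toy model in plain Java. It is only a sketch under stated assumptions: ToyIndex, ToyWriter and AddIndexesSketch are hypothetical names invented for illustration, none of this is Lucene code, and the real IndexWriter is far more involved. The model assumes exactly what the comment describes: the writer snapshots field infos when it is opened, addIndexes() does not refresh that snapshot, and the merge decides hasVectors from the snapshot.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for an index: only tracks which fields store term vectors.
final class ToyIndex {
    final Set<String> vectorFields = new HashSet<>();
}

// Hypothetical stand-in for a writer. This is NOT Lucene's IndexWriter.
final class ToyWriter {
    private final ToyIndex index;
    // Field infos are snapshotted once, when the writer is opened.
    private final Set<String> knownVectorFields;

    ToyWriter(ToyIndex index) {
        this.index = index;
        this.knownVectorFields = new HashSet<>(index.vectorFields);
    }

    // Adding a document updates the index and the in-memory field infos.
    void addDocument(String vectorField) {
        index.vectorFields.add(vectorField);
        knownVectorFields.add(vectorField);
    }

    // Adding another index copies its data but leaves the snapshot stale
    // (the behaviour being modelled).
    void addIndexes(ToyIndex other) {
        index.vectorFields.addAll(other.vectorFields);
    }

    // The merge derives hasVectors from the writer's snapshot, not from the index.
    boolean optimize() {
        return !knownVectorFields.isEmpty();
    }
}

final class AddIndexesSketch {
    // addIndexes() followed directly by optimize(): the snapshot is still empty.
    static boolean withoutReopen(ToyIndex source) {
        ToyIndex target = new ToyIndex();
        ToyWriter writer = new ToyWriter(target);
        writer.addIndexes(source);
        return writer.optimize();
    }

    // Workaround: open a fresh writer after addIndexes() so the snapshot is re-read.
    static boolean withReopen(ToyIndex source) {
        ToyIndex target = new ToyIndex();
        ToyWriter writer = new ToyWriter(target);
        writer.addIndexes(source);
        ToyWriter reopened = new ToyWriter(target); // models close() + new IW()
        return reopened.optimize();
    }
}
```

In this model, withoutReopen() reports no vectors for a source index that has a vector field, while withReopen() reports them, which mirrors the workaround of closing and reopening the writer before optimize().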

If it's ok, I'll commit the fix to 3x and only the tests to trunk.

> Term vectors missing after addIndexes + optimize
> ------------------------------------------------
>
>                 Key: LUCENE-3403
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3403
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 3.3
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Blocker
>             Fix For: 3.4, 4.0
>
>         Attachments: LUCENE-3403.patch
>
>
> I encountered a problem with addIndexes where term vectors disappeared following optimize().
> I wrote a simple test case which demonstrates the problem. The bug appears with both addIndexes()
> versions, but does not appear if addDocument is called twice, committing changes in between.
> I think I tracked the problem down to IndexWriter.mergeMiddle() -- it sets term vectors
> before merger.merge() is called. In the addDocs case, merger.fieldInfos is already populated,
> while in the addIndexes case it is empty, hence fieldInfos.hasVectors returns false.
> Will post a patch shortly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

