lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Doron Cohen (Commented) (JIRA)" <>
Subject [jira] [Commented] (LUCENE-3596) DirectoryTaxonomyWriter extensions should be able to set internal index writer config attributes such as info stream
Date Sun, 27 Nov 2011 09:54:40 GMT


Doron Cohen commented on LUCENE-3596:

Also, there seems to be a bug in current taxonomy writer test - TestIndexClose - where the
IndexWriterConfig's merge policy might allow to merge segments out-of-order. That test calls
LTC.newIndexWriterConfig() and it is just by luck that this test have not failed so far.

This is a bad type of failure for an application (is there ever a good type?;)), because by
the time the bug is exposed it would show as a wrong facet returned in faceted search, and
go figure that late that this is because at an earlier time an index writer was created which
allowed out-of-order merging...

Therefore, it would have been useful if, in addition to the javadocs about requiring type
of merge policy, we would also throw an exception (IllegalArgument or IO) if the IWC's merge
policy allows merging out-of-order. This should be checked in two locations: 
- createIWC() returns
- openIndex() returns, by examining the IWC of the index

The second check is more involved as it is done after the index was already opened, so it
must be closed prior to throwing that exception.

However, merge-policy does not have in its "contract" anything like Collector.acceptsDocsOutOfOrder(),
so it is not possible to verify this at all.

Adding such a method to MergePolicy seems to me an over-kill, for this particular case, unless
there is additional interest in such a declaration?

Otherwise, it is possible to require that the merge policy must be a descendant of LogMergePolicy.
This on the other hand would not allow to test this class with other order-preserving policies,
such as NoMerge.

So I am not sure what is the best  way to proceed in this regard.

I think there are two options actually:
# just javadoc that fact, and fix the test to always create an order preserving MP.
# add that declaration to MP.

Unless there are opinions favoring the second option I'll go with the first one.

In addition, (this is true for both options) I will move the call to createIWC into the constructor
and modify openIndex signature to accept an IWC instead of the open mode, as it seems wrong
- API wise - that one extension point (createIWC) is invoked by another extension point (openIndex)
- better have them both be invoked from the constructor, making it harder for someone to,
by mistake, totally ignore in createIndex() the value returned by createIWC().
> DirectoryTaxonomyWriter extensions should be able to set internal index writer config
attributes such as info stream
> --------------------------------------------------------------------------------------------------------------------
>                 Key: LUCENE-3596
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Doron Cohen
>            Assignee: Doron Cohen
>            Priority: Minor
>         Attachments: LUCENE-3596.patch
> Current protected openIndexWriter(Directory directory, OpenMode openMode) does not provide
access to the IWC it creates.
> So extensions must reimplement this method completely in order to set e.f. info stream
for the internal index writer.
> This came up in [user question: Taxonomy indexer debug |]

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:!default.jspa
For more information on JIRA, see:


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message