lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-847) Factor merge policy out of IndexWriter
Date Sun, 25 Mar 2007 12:48:32 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12483931
] 

Michael McCandless commented on LUCENE-847:
-------------------------------------------

OK some specific comments, only on the refactoring (ie I haven't
really looked at the new merge policies yet):

  * I think maxBufferedDocs should not be exposed in any *MergePolicy
    classes or interfaces?  I'm planning on deprecating this param
    with LUCENE-843 when we switch by default to "buffering by RAM
    usage" and it really relates to "how/when should writer flush its
    RAM buffer".

  * I also think "minMergeDocs" (which today is one and the same as
    maxBufferedDocs in IndexWriter but conceptually could be a
    different configuration) also should not appear in the MergePolicy
    interface.  I think it should only appear in
    LogarithmicMergePolicy?

    If we remove these from the MergePolicy interface then maybe we
    don't need MergePolicyBase?  (Just to makes things simpler).

  * I think we should not create a LegacyMergePolicy interface.  But I
    realize you need this so the deprecated methods in IndexWriter
    (setMergeFactor, setMaxBufferedDocs, setMaxMergeDocs, etc.) can be
    implemented.  How about instead these methods will only work if
    the current merge policy is the LogarithmicMergePolicy?  You can
    check if the current mergePolicy is an instanceof
    LogarithmicMergePolicy and then throw eg an IllegalStateException
    if it's not?

    Ie, going forward, with new and interesting merge policies,
    developers should interact with their merge policy to configure
    it.

  * I was a little spooked by this change to TestAddIndexesNoOptimize:

      -    assertEquals(2, writer.getSegmentCount());
      +    assertEquals(3, writer.getSegmentCount());

    I think with just the refactoring, there should not need to be any
    changes to unit tests right?

  * It's interesting that you've pulled "useCompoundFile" into the
    LegacyMergePolicy.  I'm torn on whether it belongs in MergePolicy
    at all, since this is really a file format issue?

    For example, newly written segments (no longer a "merge" with
    LUCENE-843) must also know whether to write in compound file
    format.  If we make interesting file format changes in the future
    that are configurable by the developer we wouldn't want to change
    all MergePolicy classes to propogate that.  It feels like using
    compound file or not should remain only in IndexWriter?


> Factor merge policy out of IndexWriter
> --------------------------------------
>
>                 Key: LUCENE-847
>                 URL: https://issues.apache.org/jira/browse/LUCENE-847
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Steven Parkes
>         Assigned To: Steven Parkes
>         Attachments: LUCENE-847.txt
>
>
> If we factor the merge policy out of IndexWriter, we can make it pluggable, making it
possible for apps to choose a custom merge policy and for easier experimenting with merge
policy variants.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Mime
View raw message