lucene-dev mailing list archives

From "Shai Erera (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3454) rename optimize to a less cool-sounding name
Date Sun, 06 Nov 2011 19:03:51 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145078#comment-13145078 ]

Shai Erera commented on LUCENE-3454:
------------------------------------

{quote}
How about the name "forceMerge(int)" instead?

Fundamentally, this is a different operation from maybeMerge() because
that method only does "natural" merges, ie ones that the MP has
selected on its own.

Whereas forceMerge means you are forcing the MP to do merging that it
otherwise would not have naturally chosen to do.
{quote}

I'm not sure that I agree ... I could configure the MP in such a way that forceMerge(1) would
still do nothing. That's very simple in fact, and I do this today: I set LogMP's maxMergeMB(ForOptimize)
to 4GB, which means that I never end up merging segments larger than that. I call optimize()
whenever I can, but at some point optimize will do nothing (if the indexing process stops).
Or, after my index has grown a lot, many segments won't be merged, and optimize/forceMerge(1)
will actually end up with X+1 segments, where X is the number of segments that are too large
for me to merge.
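
To make the X+1 arithmetic concrete, here is a toy model in plain Java. It is only a sketch
under stated assumptions: it is not Lucene code and does not reproduce LogMP's real selection
logic. The class name, the method forceMergeToOne and the sample sizes are all made up for
illustration. It assumes that segments above the cap are simply skipped and that everything
else is merged into one segment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: NOT Lucene code and NOT LogMergePolicy's real algorithm.
public class CappedForceMergeSketch {

    /**
     * Returns the segment sizes (in MB) left after a hypothetical
     * "force merge down to one segment" that refuses to touch any
     * segment larger than maxMergeMB.
     */
    static List<Long> forceMergeToOne(List<Long> segmentSizesMB, long maxMergeMB) {
        List<Long> result = new ArrayList<>();
        long mergedSize = 0;
        int mergedCount = 0;
        for (long size : segmentSizesMB) {
            if (size > maxMergeMB) {
                result.add(size);   // too large for the cap: left untouched
            } else {
                mergedSize += size; // eligible: folded into the single merged segment
                mergedCount++;
            }
        }
        if (mergedCount > 0) {
            result.add(mergedSize); // the "+1" segment
        }
        return result;
    }

    public static void main(String[] args) {
        // Two segments exceed the 4GB (4096MB) cap, three do not.
        List<Long> sizes = Arrays.asList(6000L, 5000L, 300L, 200L, 50L);
        List<Long> after = forceMergeToOne(sizes, 4096L);
        // Prints: 3 segments remain: [6000, 5000, 550]
        System.out.println(after.size() + " segments remain: " + after);
    }
}
```

With X = 2 oversized segments, the call that asked for one segment leaves X+1 = 3.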

Therefore I'm not sure that trading optimize for forceMerge is much better. Sure, it has a
less cool name, but now I think it will be even more confusing, because I'll call forceMerge(1)
and that won't do what I asked.

I think that the problem is that we try to come up with names that reflect what API IndexWriter
should call on MP. That's why we try to distinguish between maybeMerge() and optimize(int).
So maybe we should go for a more extreme change: how about having one method merge() which
takes a MergePolicy with a single method findSegmentsForMerge()? We will provide MPs that
are good for 'regular' merges and for 'optimize', and the user can pass whichever one does
what he wishes. The user can also pass the same MP instance to IWC, and that will control
the regular merges IW does from time to time (we default to a 'regular' merging MP).
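
A rough sketch of what that shape could look like, in plain Java. Everything here is
hypothetical: the MergePolicy interface, Writer, RegularPolicy and MergeAllPolicy are
stand-ins invented for illustration and are not Lucene's real classes or signatures. Segments
are modelled as bare sizes in MB.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the proposal; none of these types are Lucene's real classes.
public class SingleMergeMethodSketch {

    /** Stand-in merge policy with the single proposed method. */
    interface MergePolicy {
        /** Returns groups of segment sizes (MB); each group is merged into one segment. */
        List<List<Long>> findSegmentsForMerge(List<Long> segmentSizesMB);
    }

    /** 'Regular' policy: merges only segments at or below a size threshold. */
    static class RegularPolicy implements MergePolicy {
        private final long smallSegmentMB;
        RegularPolicy(long smallSegmentMB) { this.smallSegmentMB = smallSegmentMB; }

        @Override
        public List<List<Long>> findSegmentsForMerge(List<Long> segmentSizesMB) {
            List<Long> small = new ArrayList<>();
            for (Long size : segmentSizesMB) {
                if (size <= smallSegmentMB) small.add(size);
            }
            if (small.size() < 2) return Collections.emptyList();
            return Collections.singletonList(small);
        }
    }

    /** 'Optimize'-style policy: merges every segment into one. */
    static class MergeAllPolicy implements MergePolicy {
        @Override
        public List<List<Long>> findSegmentsForMerge(List<Long> segmentSizesMB) {
            if (segmentSizesMB.size() < 2) return Collections.emptyList();
            return Collections.singletonList(new ArrayList<>(segmentSizesMB));
        }
    }

    /** Stand-in for IndexWriter with a single merge(MergePolicy) entry point. */
    static class Writer {
        private final List<Long> segments;
        Writer(List<Long> initial) { this.segments = new ArrayList<>(initial); }

        void merge(MergePolicy policy) {
            // Hand the policy a copy so it cannot modify the writer's state directly.
            for (List<Long> group : policy.findSegmentsForMerge(new ArrayList<>(segments))) {
                if (group.size() < 2) continue; // nothing to merge
                long total = 0;
                for (Long size : group) {
                    segments.remove(size);      // remove(Object): drops one matching segment
                    total += size;
                }
                segments.add(total);
            }
        }

        List<Long> segments() { return segments; }
    }

    public static void main(String[] args) {
        Writer writer = new Writer(Arrays.asList(500L, 40L, 30L, 20L));
        writer.merge(new RegularPolicy(100L));
        System.out.println(writer.segments()); // prints [500, 90]
        writer.merge(new MergeAllPolicy());
        System.out.println(writer.segments()); // prints [590]
    }
}
```

The point of the sketch is only that the caller picks the behaviour by picking the policy,
so the method name no longer has to promise a particular outcome.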

Just a thought.
                
> rename optimize to a less cool-sounding name
> --------------------------------------------
>
>                 Key: LUCENE-3454
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3454
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 3.4, 4.0
>            Reporter: Robert Muir
>            Assignee: Michael McCandless
>         Attachments: LUCENE-3454.patch
>
>
> I think users see the name optimize and feel they must do this, because who wants a suboptimal
> system? but this probably just results in wasted time and resources.
> maybe rename to collapseSegments or something?


        
