Return-Path: Delivered-To: apmail-lucene-dev-archive@www.apache.org Received: (qmail 85797 invoked from network); 24 May 2010 10:01:03 -0000 Received: from unknown (HELO mail.apache.org) (140.211.11.3) by 140.211.11.9 with SMTP; 24 May 2010 10:01:03 -0000 Received: (qmail 35647 invoked by uid 500); 24 May 2010 10:01:02 -0000 Delivered-To: apmail-lucene-dev-archive@lucene.apache.org Received: (qmail 35424 invoked by uid 500); 24 May 2010 10:01:02 -0000 Mailing-List: contact dev-help@lucene.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@lucene.apache.org Delivered-To: mailing list dev@lucene.apache.org Received: (qmail 35417 invoked by uid 99); 24 May 2010 10:01:01 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 24 May 2010 10:01:01 +0000 X-ASF-Spam-Status: No, hits=-1457.3 required=10.0 tests=ALL_TRUSTED,AWL X-Spam-Check-By: apache.org Received: from [140.211.11.22] (HELO thor.apache.org) (140.211.11.22) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 24 May 2010 10:01:01 +0000 Received: from thor (localhost [127.0.0.1]) by thor.apache.org (8.13.8+Sun/8.13.8) with ESMTP id o4OA0e1p006141 for ; Mon, 24 May 2010 10:00:41 GMT Message-ID: <17421242.6601274695240603.JavaMail.jira@thor> Date: Mon, 24 May 2010 06:00:40 -0400 (EDT) From: "Michael McCandless (JIRA)" To: dev@lucene.apache.org Subject: [jira] Commented: (LUCENE-2455) Some house cleaning in addIndexes* In-Reply-To: <10732753.13841273550429349.JavaMail.jira@thor> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870549#action_12870549 ] Michael McCandless commented on LUCENE-2455: -------------------------------------------- bq. Backwards support should be much easier there, because we will provide an index migration tool anyway, and so CFW/CFR can always assume they're reading the latest version (at least in 4.0). Hmm I think we should do live migration for this (ie don't require a migration tool to fix your index)? This is trivial to do on the fly right (ie as you've done in 3.x). bq. CFW should probably use CodecUtils in trunk - it cannot be used in 3x because of how CFW works today - writing a VInt first, while CodecUtils assumes an Int. And I don't think it's healthy to do so much changes on 3x. Hmm yeah because of the live migration I think CodecUtils is not actually a fit here (trunk or 3x). > Some house cleaning in addIndexes* > ---------------------------------- > > Key: LUCENE-2455 > URL: https://issues.apache.org/jira/browse/LUCENE-2455 > Project: Lucene - Java > Issue Type: Improvement > Components: Index > Reporter: Shai Erera > Assignee: Shai Erera > Priority: Trivial > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, LUCENE-2455_3x.patch > > > Today, the use of addIndexes and addIndexesNoOptimize is confusing - > especially on when to invoke each. Also, addIndexes calls optimize() in > the beginning, but only on the target index. It also includes the > following jdoc statement, which from how I understand the code, is > wrong: _After this completes, the index is optimized._ -- optimize() is > called in the beginning and not in the end. > On the other hand, addIndexesNoOptimize does not call optimize(), and > relies on the MergeScheduler and MergePolicy to handle the merges. > After a short discussion about that on the list (Thanks Mike for the > clarifications!) I understand that there are really two core differences > between the two: > * addIndexes supports IndexReader extensions > * addIndexesNoOptimize performs better > This issue proposes the following: > # Clear up the documentation of each, spelling out the pros/cons of > calling them clearly in the javadocs. > # Rename addIndexesNoOptimize to addIndexes > # Remove optimize() call from addIndexes(IndexReader...) > # Document that clearly in both, w/ a recommendation to call optimize() > before on any of the Directories/Indexes if it's a concern. > That way, we maintain all the flexibility in the API - > addIndexes(IndexReader...) allows for using IR extensions, > addIndexes(Directory...) is considered more efficient, by allowing the > merges to happen concurrently (depending on MS) and also factors in the > MP. So unless you have an IR extension, addDirectories is really the one > you should be using. And you have the freedom to call optimize() before > each if you care about it, or don't if you don't care. Either way, > incurring the cost of optimize() is entirely in the user's hands. > BTW, addIndexes(IndexReader...) does not use neither the MergeScheduler > nor MergePolicy, but rather call SegmentMerger directly. This might be > another place for improvement. I'll look into it, and if it's not too > complicated, I may cover it by this issue as well. If you have any hints > that can give me a good head start on that, please don't be shy :). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional commands, e-mail: dev-help@lucene.apache.org