Return-Path:
Delivered-To: apmail-lucene-dev-archive@www.apache.org
Received: (qmail 10982 invoked from network); 26 May 2010 12:10:55 -0000
Received: from unknown (HELO mail.apache.org) (140.211.11.3) by 140.211.11.9 with SMTP; 26 May 2010 12:10:55 -0000
Received: (qmail 65969 invoked by uid 500); 26 May 2010 12:10:54 -0000
Delivered-To: apmail-lucene-dev-archive@lucene.apache.org
Received: (qmail 65858 invoked by uid 500); 26 May 2010 12:10:53 -0000
Mailing-List: contact dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@lucene.apache.org
Delivered-To: mailing list dev@lucene.apache.org
Received: (qmail 65851 invoked by uid 99); 26 May 2010 12:10:53 -0000
Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 26 May 2010 12:10:53 +0000
X-ASF-Spam-Status: No, hits=-1466.7 required=10.0 tests=ALL_TRUSTED,AWL
X-Spam-Check-By: apache.org
Received: from [140.211.11.22] (HELO thor.apache.org) (140.211.11.22) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 26 May 2010 12:10:52 +0000
Received: from thor (localhost [127.0.0.1]) by thor.apache.org (8.13.8+Sun/8.13.8) with ESMTP id o4QCAWCI008061 for ; Wed, 26 May 2010 12:10:32 GMT
Message-ID: <10150908.58441274875832037.JavaMail.jira@thor>
Date: Wed, 26 May 2010 08:10:32 -0400 (EDT)
From: "Shai Erera (JIRA)"
To: dev@lucene.apache.org
Subject: [jira] Commented: (LUCENE-2455) Some house cleaning in addIndexes*
In-Reply-To: <10732753.13841273550429349.JavaMail.jira@thor>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

[ https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12871639#action_12871639 ]

Shai Erera commented on LUCENE-2455:
------------------------------------

Ok I added the indexes from trunk (didn't know
they were there). I've changed CFS to write a version header in the file, which is why I added a 3.0 index: to make sure 3.1 can read it properly. I also added tests to TestBackwardsCompatibility to ensure that addIndexes works on old indexes (which was good, because after the changes it didn't!).

bq. Maybe simple delete, they are not used.

The testAddIndexes tests were just added, and they use the 30 indexes, so I cannot delete them (see my comment above).

bq. By the way the 3.0 index zip file generation code is in the 3.0 branch, have you edited it there?

Nope, it exists in TestBackwardsCompatibility, commented out, with instructions to uncomment. I used that code.

> Some house cleaning in addIndexes*
> ----------------------------------
>
> Key: LUCENE-2455
> URL: https://issues.apache.org/jira/browse/LUCENE-2455
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Shai Erera
> Assignee: Shai Erera
> Priority: Trivial
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, LUCENE-2455_3x.patch
>
>
> Today, the use of addIndexes and addIndexesNoOptimize is confusing -
> especially regarding when to invoke each. Also, addIndexes calls
> optimize() at the beginning, but only on the target index. It also
> includes the following javadoc statement, which, from how I understand
> the code, is wrong: _After this completes, the index is optimized._ --
> optimize() is called at the beginning and not at the end.
> On the other hand, addIndexesNoOptimize does not call optimize(), and
> relies on the MergeScheduler and MergePolicy to handle the merges.
> After a short discussion about that on the list (Thanks Mike for the
> clarifications!)
> I understand that there are really two core differences
> between the two:
> * addIndexes supports IndexReader extensions
> * addIndexesNoOptimize performs better
> This issue proposes the following:
> # Clear up the documentation of each, clearly spelling out the
> pros/cons of calling them in the javadocs.
> # Rename addIndexesNoOptimize to addIndexes
> # Remove the optimize() call from addIndexes(IndexReader...)
> # Document that clearly in both, with a recommendation to call
> optimize() beforehand on any of the Directories/Indexes if it's a concern.
> That way, we maintain all the flexibility in the API -
> addIndexes(IndexReader...) allows for using IndexReader extensions, while
> addIndexes(Directory...) is considered more efficient, because it allows
> the merges to happen concurrently (depending on the MergeScheduler) and
> also factors in the MergePolicy. So unless you have an IndexReader
> extension, addIndexes(Directory...) is really the one you should be
> using. And you have the freedom to call optimize() before each if you
> care about it, or not if you don't. Either way, incurring the cost of
> optimize() is entirely in the user's hands.
> BTW, addIndexes(IndexReader...) uses neither the MergeScheduler nor the
> MergePolicy, but rather calls SegmentMerger directly. This might be
> another place for improvement. I'll look into it, and if it's not too
> complicated, I may cover it in this issue as well. If you have any hints
> that can give me a good head start on that, please don't be shy :).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org