Date: Sun, 8 May 2011 18:31:03 +0000 (UTC)
From: "Uwe Schindler (JIRA)"
To: dev@lucene.apache.org
Message-ID: <893972025.31167.1304879463142.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1376505001.30999.1304860203092.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Issue Comment Edited] (LUCENE-3082) Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing

[ 
https://issues.apache.org/jira/browse/LUCENE-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030519#comment-13030519 ]

Uwe Schindler edited comment on LUCENE-3082 at 5/8/11 6:29 PM:
---------------------------------------------------------------

Patch that implements this with a merge policy. It does not yet contain the command-line updater; if you want to upgrade an old index, the API code to do this is very simple:

{code:java}
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_XX, new KeywordAnalyzer());
// Wrap the configured merge policy in the special upgrade policy from this patch.
iwc = iwc.setMergePolicy(new UpgradeIndexMergePolicy(iwc.getMergePolicy()));
IndexWriter w = new IndexWriter(dir, iwc);
// With the special policy, optimize() is not a no-op even on a one-segment index.
w.optimize();
w.close();
{code}

The patch contains new tests in TestBackwards that verify the upgrade process:
- It tries to upgrade all old indexes from the well-known list in TestBackwards. When this is done, each of them should contain exactly one segment (because all segments previously in the index are of an older version, so they are merged/optimized together into the new format). It also verifies that all segment versions are Constants.LUCENE_MAIN_VERSION.
- It tries to upgrade two old, already optimized indexes (with the previous version; I changed TestBackwards in my 3.1 checkout to generate those). It verifies the segment versions after the upgrade. This special case is needed because optimizing a one-segment index is a no-op without the special merge policy.
- It uses the old optimized indexes, opens them using the standard merge policy, and adds some documents to them. After that it upgrades the index with a new IndexWriter using the special merge policy. In that case (as some segments are already in the new version), only the old segments should be merged together; the newly added ones are left untouched. So the segment count is verified to be > 1.
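
To make the expected outcome of the three test cases concrete, here is a minimal, self-contained sketch of the segment selection they exercise. It is a toy model, not the patch: `UpgradeSketch`, `Segment`, `selectOldSegments` and `upgrade` are hypothetical names, the version strings are only illustrative, and nothing in it is Lucene API. It assumes the policy merges every segment that is not in the current format into one new segment and leaves up-to-date segments alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: these types are hypothetical and are NOT Lucene classes.
class UpgradeSketch {

    // A segment reduced to its name and the format version that wrote it.
    static final class Segment {
        final String name;
        final String version;

        Segment(String name, String version) {
            this.name = name;
            this.version = version;
        }
    }

    // Returns the segments that are not yet in the current format.
    static List<Segment> selectOldSegments(List<Segment> index, String currentVersion) {
        List<Segment> old = new ArrayList<>();
        for (Segment s : index) {
            if (!s.version.equals(currentVersion)) {
                old.add(s);
            }
        }
        return old;
    }

    // Simulates the upgrade: all old segments (even a single one) are rewritten
    // into one new segment; segments already in the current format are kept.
    static List<Segment> upgrade(List<Segment> index, String currentVersion) {
        List<Segment> old = selectOldSegments(index, currentVersion);
        if (old.isEmpty()) {
            return new ArrayList<>(index); // nothing to upgrade
        }
        List<Segment> result = new ArrayList<>();
        for (Segment s : index) {
            if (s.version.equals(currentVersion)) {
                result.add(s); // untouched
            }
        }
        result.add(new Segment("merged", currentVersion));
        return result;
    }

    public static void main(String[] args) {
        String current = "3.2"; // assumed current format version
        List<Segment> allOld = Arrays.asList(
                new Segment("_0", "3.0"), new Segment("_1", "3.0"));
        List<Segment> optimizedOld = Arrays.asList(new Segment("_0", "3.1"));
        List<Segment> mixed = Arrays.asList(
                new Segment("_0", "3.1"), new Segment("_1", current), new Segment("_2", current));

        System.out.println(upgrade(allOld, current).size());
        System.out.println(upgrade(optimizedOld, current).size());
        System.out.println(upgrade(mixed, current).size());
    }
}
```

In this model the first two cases end with a single current-format segment, while the mixed case keeps the two newly added segments next to the merged one, which is why the third test checks for a segment count > 1.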

was (Author: thetaphi):
Path that implements this with a merge policy: It does not yet contain the command line updater, if you want to upgrade an old index, the API code to do this is very simple:

{code:java}
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_XX, new KeywordAnalyzer());
iwc = iwc.setMergePolicy(new UpgradeIndexMergePolicy(iwc.getMergePolicy()));
IndexWriter w = new IndexWriter(dir, iwc);
w.optimize();
w.close();
{code}

The patch contains new tests in TestBackwards that verify the upgrade process:
- It tries to upgrade all old segments in the well-known list. When this is done, all of them should contain exactly one segment (because all segments previously in index are older version, so they are merged/optimized together in new format). It also verifies all segment versions to be Constants.LUCENE_MAIN_VERSION.
- It tries to upgrade two old, already optimized indexes (with prev version, I changed TestBackwards in my 3.1 checkout to generate those). It verifies the segment versions after the upgrade. This special case is needed, as optimizing a one-segment index is a no-op without the special merge-policy
- It uses the old optimized indexes, opens them using standard merge policy and adds some documents to them. After that it upgrades the index, in that case (as some segments are already in new version), the index should only have the old-segments merged together, the newly added ones are untouched.
So segment count > 1

> Add index upgrade method to IndexWriter to force an upgrade of all segments to last recent supported index format without optimizing
> ------------------------------------------------------------------------------------------------------------------------------------
>
> Key: LUCENE-3082
> URL: https://issues.apache.org/jira/browse/LUCENE-3082
> Project: Lucene - Java
> Issue Type: New Feature
> Components: Index
> Reporter: Uwe Schindler
> Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3082.patch, index.31.optimized.cfs.zip, index.31.optimized.nocfs.zip
>
>
> Currently, if you want to upgrade an old index to the format of your current Lucene version, you have to optimize your index or use addIndexes(IndexReader...) [see LUCENE-2893] to copy to a new directory. The optimize() approach fails if your index is already optimized.
> I propose to add a method to IndexWriter that is similar to optimize() and uses a custom MergePolicy to upgrade all segments to the latest format. This MergePolicy could also simply ignore all segments that are already up-to-date. All segments in prior formats would be merged into a new segment. The tool could optionally also optimize the index.
> This issue is different from LUCENE-2893, as it would only support upgrading indexes from previous Lucene versions in-place using the official path. It is a tool for the end user, not a developer tool.
> This addition should also go into Lucene 3.x, as we need to make users with pre-3.0 indexes take the step through 3.x; otherwise they would not be able to open their index with 4.0. With this tool in 3.x, users could safely upgrade their index without relying on optimize to work on already-optimized indexes.

--
This message is automatically generated by JIRA.