Date: Mon, 23 May 2011 15:07:47 +0000 (UTC)
From: "Shai Erera (JIRA)"
To: dev@lucene.apache.org
Reply-To: dev@lucene.apache.org
Message-ID: <934163933.36158.1306163267597.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1927795489.28239.1305839927505.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (LUCENE-3126) IndexWriter.addIndexes can make any incoming segment into CFS if it isn't already
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8

[
https://issues.apache.org/jira/browse/LUCENE-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037983#comment-13037983 ]

Shai Erera commented on LUCENE-3126:
------------------------------------

The patch does not handle all files well (a few tests fail). Apparently, the .del file should not be rolled into the .cfs. SegmentMerger.createCompoundFile does this by default; however, it's only called from code that ensures no deletions exist. It would have been nice if this method documented that :).

Also, I think *.s should not be rolled into the .cfs (those are the separate norms files). I don't know how to create such files in the first place (I thought they were an old format, but 3.1 indexes have them too), and TestBackCompat fails. Is there a way to identify those files? Is it safe to check whether the file extension starts w/ IndexFileNames.SEPARATE_NORMS_EXTENSION? Feels hacky to me.

Another thing: to avoid shared doc stores (and whatever other old-format stuff), and since this is only an optimization, I think the code should copy into CFS only if the segment version is 3.1 or later (that is, StringHelper.getVersionComparator().compare(info.getVersion(), "3.1") >= 0).

I think I'm close to finishing it; I just need to figure out the separate norms thing.

> IndexWriter.addIndexes can make any incoming segment into CFS if it isn't already
> ---------------------------------------------------------------------------------
>
> Key: LUCENE-3126
> URL: https://issues.apache.org/jira/browse/LUCENE-3126
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/index
> Reporter: Shai Erera
> Assignee: Shai Erera
> Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3126.patch
>
>
> Today, IW.addIndexes(Directory) does not modify the CFS-mode of the incoming segments. However, if IndexWriter's MP wants to create CFS (in general), there's no reason not to turn the incoming non-CFS segments into CFS. We copy them anyway, and if MP is not against CFS, we should create a CFS out of them.
> Will need to use CFW, not sure it's ready for that w/ current API (I'll need to check), but luckily we're allowed to change it (@lucene.internal).
> This should be done, IMO, even if the incoming segment is large (i.e., passes MP.noCFSRatio) b/c, like I wrote above, we copy it anyway. However, if you think otherwise, speak up :).
> I'll take a look at this in the next few days.
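
For readers following the thread, the two checks discussed in the comment above (skip .del and separate norms files, and only build a CFS for segments at version 3.1 or later) can be sketched roughly as follows. This is a self-contained illustration, not the attached patch: the class `CfsCopyFilter` and all of its methods are hypothetical, the version comparison is a simple stand-in for StringHelper.getVersionComparator(), and the pattern assumes separate norms files carry an extension of "s" followed by digits (for example "_0_1.s0").

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper; none of these names come from Lucene or from the patch. */
public class CfsCopyFilter {
    // Extension of deletion files, per the ".del" mentioned in the comment.
    static final String DELETES_EXTENSION = "del";
    // Assumption: separate norms extensions look like "s0", "s1", ...
    static final Pattern SEPARATE_NORMS = Pattern.compile("s\\d+");

    /** Returns the text after the last '.', or "" if there is no dot. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }

    /** True if the file may be rolled into the .cfs (not .del, not separate norms). */
    static boolean belongsInCfs(String fileName) {
        String ext = extensionOf(fileName);
        return !ext.equals(DELETES_EXTENSION) && !SEPARATE_NORMS.matcher(ext).matches();
    }

    /** Compares dotted numeric versions part by part; missing parts count as 0. */
    static int compareVersions(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    /** True only for segments whose recorded version is 3.1 or later. */
    static boolean eligibleForCfsCopy(String segmentVersion) {
        if (segmentVersion == null) {
            return false; // no recorded version: treat as old format
        }
        try {
            return compareVersions(segmentVersion, "3.1") >= 0;
        } catch (NumberFormatException e) {
            return false; // unparseable version: be conservative
        }
    }

    /** Keeps only the files that should go into the compound file. */
    static List<String> filesForCfs(List<String> segmentFiles) {
        List<String> result = new ArrayList<>();
        for (String f : segmentFiles) {
            if (belongsInCfs(f)) {
                result.add(f);
            }
        }
        return result;
    }
}
```

Matching the whole extension against a pattern is stricter than a plain "starts with" test, which may address the "feels hacky" concern, but whether that pattern covers every separate norms file in practice is an assumption that would need checking against IndexFileNames.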