Subject: Re: Lucene 3.6.0 Index Size
From: Vitaly Funstein
To: java-user@lucene.apache.org
Date: Fri, 26 Oct 2012 12:13:29 -0700
In-Reply-To: <1351273804.28138.YahooMailNeo@web120702.mail.ne1.yahoo.com>

One thing to keep in mind is that the default merge policy changed between 2.3.2 and 3.6 (I'm almost certain of that). So it's just a hunch, but you may have some unmerged segments left over at the end. Try calling IndexWriter.close(true) after you're done indexing.

On Fri, Oct 26, 2012 at 10:50 AM, kiwi clive wrote:

> Hello.
>
> We have an index that, when created using Lucene 2.3.2, has a size of about
> 4G.
>
> Creating the same index (with the same parameters) with Lucene 3.6.0
> results in an 11G index.
>
> Could someone shed some light on why the index is so much larger, given
> the same data and the same parameters?
>
> I realize this is a large version jump, but a doubling in index size does
> not seem a step in the right direction to me ;-)
>
> I am using cfs format.
>
> Thanks,
> Clive
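
One rough way to check the unmerged-segments hunch from the reply, before and after calling IndexWriter.close(true), is to look at how many segments sit in the index directory and how big each one is. The sketch below uses only the JDK, not Lucene itself. The class name SegmentSizes and its methods are made up for illustration, and it assumes the usual on-disk naming where per-segment files start with an underscore and the segment name (for example _0.cfs or _0_1.del), while segments_N and segments.gen carry no underscore prefix.

```java
import java.io.File;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of Lucene): sums on-disk bytes per segment.
class SegmentSizes {

    // Assumes names like "_0.cfs" or "_0_1.del": the segment name is the
    // leading "_" plus everything up to the next '.' or '_'.
    static String segmentName(String fileName) {
        int end = fileName.length();
        for (int i = 1; i < fileName.length(); i++) {
            char c = fileName.charAt(i);
            if (c == '.' || c == '_') {
                end = i;
                break;
            }
        }
        return fileName.substring(0, end);
    }

    // Groups the files in indexDir by segment name and sums their sizes.
    // Files without a leading underscore (segments_N, segments.gen) are skipped.
    static Map<String, Long> bySegment(File indexDir) {
        Map<String, Long> sizes = new TreeMap<String, Long>();
        File[] files = indexDir.listFiles();
        if (files == null) {
            return sizes; // not a directory, or not readable
        }
        for (File f : files) {
            String name = f.getName();
            if (!f.isFile() || !name.startsWith("_")) {
                continue;
            }
            String segment = segmentName(name);
            Long previous = sizes.get(segment);
            sizes.put(segment, (previous == null ? 0L : previous) + f.length());
        }
        return sizes;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        Map<String, Long> sizes = bySegment(dir);
        long total = 0;
        for (Map.Entry<String, Long> e : sizes.entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue() + " bytes");
            total += e.getValue();
        }
        System.out.println(sizes.size() + " segment(s), " + total + " bytes in total");
    }
}
```

Run it with the index directory as the only argument. Many similarly sized segments after indexing would be consistent with merges that never ran or never finished; it does not by itself explain the size difference between the two versions.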