Message-ID: <421D8FC5.4030100@newsmonster.org>
Date: Thu, 24 Feb 2005 00:26:45 -0800
From: "Kevin A. Burton"
To: Lucene Users List
Subject: Re: Possible to mix/match indexes with diff TermInfosWriter.INDEX_INTERVAL ??
In-Reply-To: <421D89B6.20804@newsmonster.org>

Kevin A.
Burton wrote:

> I finally had some time to take Doug's advice and reburn our indexes
> with a larger TermInfosWriter.INDEX_INTERVAL value.

You know... it looks like the problem is that TermInfosReader uses INDEX_INTERVAL during seeks and is probably just jumping RIGHT past the offsets that I need.

If this is going to be a practical way of reducing Lucene's memory footprint for HUGE indexes, then it's going to need a way to change this value based on the index that's currently being opened.

Is there any way to determine the INDEX_INTERVAL from the file? According to:

http://jakarta.apache.org/lucene/docs/fileformats.html

the .tis file should have this data (and, per the docs, the .tii file "is very similar to the .tis file"):

TermInfoFile (.tis)--> TIVersion, TermCount, IndexInterval, SkipInterval, TermInfos

The only problem is that the .tii and .tis files I have on disk don't have a constant preamble, and it doesn't look like there's an index interval in there...

Kevin
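
For reference, here is a rough sketch of how one might peek at the start of a .tis or .tii file to check for the header layout quoted above (TIVersion, TermCount, IndexInterval, SkipInterval). Everything in it is an assumption rather than Lucene's own code: the class name TisHeaderSketch is hypothetical, the field widths (32-bit version, 64-bit term count, 32-bit intervals, all big-endian) are guessed from the file-formats page, and so is the idea that a file written in an older layout starts directly with a non-negative term count and stores no interval at all. If that last guess holds, it might explain files with no constant preamble.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Lucene: inspects the first bytes of a .tis/.tii file.
class TisHeaderSketch {

    /** Parsed header fields; the intervals are -1 when the file does not appear to record them. */
    static final class Header {
        final int version;
        final long termCount;
        final int indexInterval;
        final int skipInterval;

        Header(int version, long termCount, int indexInterval, int skipInterval) {
            this.version = version;
            this.termCount = termCount;
            this.indexInterval = indexInterval;
            this.skipInterval = skipInterval;
        }
    }

    /** Reads TIVersion, TermCount, IndexInterval, SkipInterval under the assumed widths. */
    static Header parse(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in); // big-endian reads
        int first = data.readInt();
        if (first >= 0) {
            // Assumption: no negative version marker means an older layout that
            // begins with the term count and carries no interval fields.
            return new Header(0, first, -1, -1);
        }
        long termCount = data.readLong();
        int indexInterval = data.readInt();
        int skipInterval = data.readInt();
        return new Header(first, termCount, indexInterval, skipInterval);
    }

    /** Convenience overload for in-memory bytes; wraps the checked exception. */
    static Header parse(byte[] bytes) {
        try {
            return parse(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: TisHeaderSketch <path to .tis or .tii file>");
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            Header h = parse(in);
            System.out.println("version=" + h.version
                    + " termCount=" + h.termCount
                    + " indexInterval=" + h.indexInterval
                    + " skipInterval=" + h.skipInterval);
        }
    }
}
```

If the first four bytes come back as a non-negative number, the file probably does not carry an IndexInterval field, and a reader would have to fall back on whatever default it was compiled with.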