Date: Sat, 4 Apr 2009 12:55:10 +0100 (BST)
Subject:
Re: Term Limit?
From: "Murat Yakici"
To: java-user@lucene.apache.org

I assume the total number of documents that you can index is also
limited by Java's maximum int value. Is this correct? Is there any way
to index more documents than that in a single index?

Murat

> I tentatively think you are correct: the file format itself does not
> impose this limitation.
>
> But in at least a couple of places internally, Lucene uses a Java int to
> hold the term number, which is actually a limit of 2,147,483,648
> terms. I'll update fileformats.html for 2.9.
>
> Mike
>
> On Sat, Apr 4, 2009 at 2:56 AM, deminix wrote:
>> http://lucene.apache.org/java/2_4_1/fileformats.html
>>
>> The file format page at the bottom cites that there is a 32 bit limit to
>> term numbers. I fail to see where in the file formats documentation that
>> is actually true. Is the bottom of the page simply out of date? I'm also
>> wondering whether the code may be a limiting factor even if the file
>> formats are ok.
>>
>> Thanks.
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>

Murat Yakici
Department of Computer & Information Sciences
University of Strathclyde
Glasgow, UK
-------------------------------------------
The University of Strathclyde is a charitable body, registered in
Scotland, with registration number SC015263.
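
For reference, the arithmetic behind the 2,147,483,648 figure quoted above can be sketched in plain Java. This only illustrates the range of a signed 32-bit int; the class and method names (IntLimitSketch, nonNegativeIntCount, fitsInInt) are made up for this note, and nothing here calls Lucene or describes how Lucene stores its counters.

```java
public class IntLimitSketch {
    // Largest value a signed 32-bit Java int can hold: 2^31 - 1.
    public static final int MAX_INT = Integer.MAX_VALUE;

    // Count of distinct non-negative int values (0 through MAX_INT).
    // Computed as a long, because the count itself does not fit in an int.
    public static long nonNegativeIntCount() {
        return (long) MAX_INT + 1L;
    }

    // True if a zero-based number can be stored in a non-negative int.
    public static boolean fitsInInt(long number) {
        return number >= 0 && number <= MAX_INT;
    }

    public static void main(String[] args) {
        System.out.println(MAX_INT);                 // 2147483647
        System.out.println(nonNegativeIntCount());   // 2147483648
        System.out.println(fitsInInt(2147483647L));  // true
        System.out.println(fitsInInt(2147483648L));  // false
        // int arithmetic silently wraps around instead of failing:
        System.out.println(MAX_INT + 1);             // -2147483648
    }
}
```

So numbers 0 through 2,147,483,647 give 2,147,483,648 distinct values, and any counter or identifier held in an int cannot go past that without overflowing.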