From: Michael McCandless
To: dev@lucene.apache.org
Date: Tue, 18 May 2010 10:22:57 -0400
Subject: Re: (LUCENE-2257) relax the per-segment max unique term limit
In-Reply-To: <4BF2A032.3080309mkm@r.email.ne.jp>
References: <1367215.97021274150081968.JavaMail.jira@thor> <4BF2A032.3080309mkm@r.email.ne.jp>

Duh, sorry, that should have been "but on stable (3x) the limit is across
all fields". On trunk (= flex) the limit is per-field.

Mike

On Tue, May 18, 2010 at 10:12 AM, Koji Sekiguchi wrote:
> > but in trunk, the limit is across all fields
>
> Got it. Thanks, Mike!
>
> Koji
>
> --
> http://www.rondhuit.com/en/
>
>
> (10/05/18 18:21), Michael McCandless (JIRA) wrote:
>>     [ https://issues.apache.org/jira/browse/LUCENE-2257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868569#action_12868569 ]
>>
>> Michael McCandless commented on LUCENE-2257:
>> --------------------------------------------
>>
>> Yes, the limit is number of unique terms per-segment.
>>
>> Flex actually increases the limit (the limit is per-field, per-segment; but in trunk, the limit is across all fields).
>>
>>
>>> relax the per-segment max unique term limit
>>> -------------------------------------------
>>>
>>>                 Key: LUCENE-2257
>>>                 URL: https://issues.apache.org/jira/browse/LUCENE-2257
>>>             Project: Lucene - Java
>>>          Issue Type: Improvement
>>>            Reporter: Michael McCandless
>>>            Assignee: Michael McCandless
>>>            Priority: Minor
>>>             Fix For: 2.9.2, 3.0.1, 4.0
>>>
>>>         Attachments: LUCENE-2257.patch, LUCENE-2257.patch
>>>
>>>
>>> Lucene can't handle more than 2.1B (limit of signed 32 bit int) unique terms in a single segment.
>>> But I think we can improve this to termIndexInterval (default 128) * 2.1B.  There is one place (internal API only) where Lucene uses an int but should use a long.
>>>
>>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
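
The arithmetic in the quoted issue description can be checked with a short Java sketch. It only illustrates the numbers mentioned there (2.1B as the signed 32-bit maximum, and termIndexInterval * 2.1B as the proposed ceiling) and why an int has to become a long to hold that product. The class and method names are hypothetical and are not Lucene APIs; the patch itself is not reproduced here.

```java
// Illustrative only: TermLimitSketch and its methods are made-up names,
// not part of Lucene. They just reproduce the arithmetic from the issue text.
public class TermLimitSketch {

    // Proposed ceiling from the issue text: termIndexInterval * (max signed 32-bit int).
    // The cast to long happens before the multiplication, so the product does not wrap.
    static long relaxedLimit(int termIndexInterval) {
        return (long) termIndexInterval * Integer.MAX_VALUE;
    }

    // The same product computed in 32-bit int arithmetic, which silently wraps around.
    static int wrappedLimit(int termIndexInterval) {
        return termIndexInterval * Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        int interval = 128; // default termIndexInterval quoted in the issue

        System.out.println("int max:       " + Integer.MAX_VALUE);      // 2147483647, the "2.1B"
        System.out.println("long product:  " + relaxedLimit(interval)); // 274877906816
        System.out.println("int product:   " + wrappedLimit(interval)); // -128 after overflow
    }
}
```

With the default interval of 128 the product is about 275 billion, which fits comfortably in a long but wraps to a negative value in an int.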