From: "John Wang"
To: java-user@lucene.apache.org
Subject: Re: Does Lucene Supports Billions of data
Date: Thu, 1 May 2008 13:10:39 -0700
Message-ID: <8837fb770805011310p5fb7b1e7wcc75ab9992fed6fe@mail.gmail.com>
In-Reply-To: <200805010910.04480.daniel@nuix.com>

I am not sure why this is the case: the docid is internal to each sub index.
As long as each sub index holds fewer than 2 billion docs, there is no need
for the docid to be a long.

With multiple indexes, I was thinking of having an aggregator that merges
maybe only a page of search results.

Example:

sub index 1: 1 billion docs
sub index 2: 1 billion docs
sub index 3: 1 billion docs

By federating the search across these sub indexes, you represent an index of
3 billion docs, and all internal doc ids are of type int.

Maybe I am not understanding something.

-John

On Wed, Apr 30, 2008 at 4:10 PM, Daniel Noll wrote:

> On Thursday 01 May 2008 00:01:48 John Wang wrote:
> > I am not sure how well lucene would perform with > 2 Billion docs in a
> > single index anyway.
>
> Even if they're in multiple indexes, the doc IDs being ints will still
> prevent it going past 2Gi unless you wrap your own framework around it.
>
> Daniel
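
To make the aggregator idea in the message above concrete, here is a minimal sketch in plain Java. It assumes each sub index has already returned its own hits sorted by descending score. The names FederatedSearchSketch, Hit, and mergeTopPage are hypothetical and are not Lucene classes or methods. A hit is addressed by the pair (sub index number, local int doc id), so no single doc id ever has to exceed the int range.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: none of these types come from Lucene.
final class FederatedSearchSketch {

    // A hit addressed by (sub index number, doc id local to that sub index).
    static final class Hit {
        final int subIndex;
        final int docId;   // stays an int because it is only unique within its sub index
        final float score;

        Hit(int subIndex, int docId, float score) {
            this.subIndex = subIndex;
            this.docId = docId;
            this.score = score;
        }
    }

    // Merges per-sub-index hit lists (each assumed sorted by descending score)
    // and returns the best pageSize hits overall, best first.
    static List<Hit> mergeTopPage(List<List<Hit>> perSubIndexHits, int pageSize) {
        Comparator<Hit> byScore = Comparator.comparingDouble((Hit h) -> h.score);
        // Min-heap on score: the head is the weakest hit kept so far.
        PriorityQueue<Hit> heap = new PriorityQueue<>(byScore);
        for (List<Hit> hits : perSubIndexHits) {
            // Only the first pageSize hits of a sub index can reach the merged page.
            int limit = Math.min(pageSize, hits.size());
            for (int i = 0; i < limit; i++) {
                heap.offer(hits.get(i));
                if (heap.size() > pageSize) {
                    heap.poll(); // drop the weakest hit
                }
            }
        }
        List<Hit> page = new ArrayList<>(heap);
        page.sort(byScore.reversed());
        return page;
    }

    public static void main(String[] args) {
        // Made-up scores; doc id 7 appears in two sub indexes without conflict.
        List<Hit> first = Arrays.asList(new Hit(0, 7, 0.9f), new Hit(0, 3, 0.2f));
        List<Hit> second = Arrays.asList(new Hit(1, 7, 0.8f), new Hit(1, 5, 0.1f));
        List<Hit> third = Arrays.asList(new Hit(2, 1, 0.95f));
        for (Hit h : mergeTopPage(Arrays.asList(first, second, third), 3)) {
            System.out.println("sub index " + h.subIndex + ", doc " + h.docId + ", score " + h.score);
        }
    }
}
```

Only one page of results is held in memory per query, which matches the "merges maybe only a page" suggestion. If a caller needs a single global identifier, the pair could be packed into a long by the wrapping layer; that would be one form of the "wrap your own framework" approach mentioned in the quoted reply, and it is an assumption of this sketch rather than anything Lucene provides.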