From: Shawn Heisey
Date: Tue, 22 Apr 2014 10:43:01 -0600
To: solr-user@lucene.apache.org
Subject: Re: maximum number of shards per SolrCloud
Message-ID: <53569C15.5090501@elyograg.org>
In-Reply-To: <89280C92-3782-450C-B8DF-B35D5ED0DA80@gmail.com>

On 4/22/2014 10:02 AM, yypvsxf19870706 wrote:
> I am curious about the effects of having more than 2G docs in a core. We plan to have 5G docs/core.
>
> Please give me some suggestions on how to plan the number of docs in a core.

One Solr core contains one Lucene index. It can't be divided further than that without a significant redesign.

Quick note: Although SolrCloud can handle five billion documents with no problem, you can't have five billion documents in a single shard/core.

The only hard limitation in the entire system is that you can't have more than approximately 2 billion documents in a single Lucene index. This is because a Java integer (which is a signed 32-bit number) is what gets used for internal Lucene document identifiers. Deleted documents count against that limit.
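
As a rough illustration of the arithmetic above, here is a minimal Python sketch. It is not Lucene or Solr code: the names `JAVA_INT_MAX`, `fits_in_one_index`, and `min_shards` are hypothetical, the ceiling used is simply the largest signed 32-bit value (the exact usable maximum in Lucene may be slightly lower), and the shard count assumes documents are spread evenly across shards.

```python
# Largest value a signed 32-bit Java int can hold. Lucene's internal
# document identifiers are ints, so this is the approximate ceiling
# for one index (assumption: the exact usable maximum may be lower).
JAVA_INT_MAX = 2**31 - 1  # 2,147,483,647


def fits_in_one_index(live_docs: int, deleted_docs: int = 0) -> bool:
    """Return True if one index could hold this many documents.

    Deleted documents are added in because they count against the limit.
    """
    return live_docs + deleted_docs <= JAVA_INT_MAX


def min_shards(total_docs: int, max_docs_per_core: int = JAVA_INT_MAX) -> int:
    """Smallest shard count that keeps every core at or under the cap.

    Assumes an even split of documents across shards (hypothetical helper).
    """
    if total_docs < 0 or max_docs_per_core <= 0:
        raise ValueError("total_docs must be >= 0 and max_docs_per_core > 0")
    # Integer ceiling division avoids floating-point rounding.
    return max(1, -(-total_docs // max_docs_per_core))


if __name__ == "__main__":
    print(fits_in_one_index(5_000_000_000))  # False
    print(min_shards(5_000_000_000))         # 3
```

Five billion documents therefore need at least three shards just to stay under the hard ceiling. Passing a lower `max_docs_per_core` gives the count for a more conservative per-core target.
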
It is theoretically possible to overcome this limitation, but it would be a MAJOR change to Lucene, requiring major changes in Solr as well.

The other limitations you can run into with a large SolrCloud are mostly a matter of configuration, system resources, and scaling to multiple servers. They are not hard limitations in the software.

I would never put more than about 1 billion documents in a single core. For performance reasons, it would be a good idea to never exceed a few hundred million. When a high query rate is required, loading only one Solr core per server may be a requirement.

Thanks,
Shawn