Date: Wed, 6 May 2015 07:55:12 -0700 (MST)
From: adfel70
To: solr-user@lucene.apache.org
Message-ID: <1430924112131-4204148.post@n3.nabble.com>
In-Reply-To: <554A1785.5040905@elyograg.org>
References: <1430899124657-4204068.post@n3.nabble.com> <554A1785.5040905@elyograg.org>
Subject: Re: severe problems with soft and hard commits in a large index

Thank you for the detailed answer.

How can I decrease the impact of opening a searcher on such a large index,
especially the heap usage that leads to OOM?

Regarding GC tuning - I am doing that. Here are the params I use (written
out as full JVM options in the sketch below):

AggressiveOpts
UseLargePages
ParallelRefProcEnabled
CMSParallelRemarkEnabled
CMSMaxAbortablePrecleanTime=6000
CMSTriggerPermRatio=80
CMSInitiatingOccupancyFraction=70
UseCMSInitiatingOccupancyOnly
CMSFullGCsBeforeCompaction=1
PretenureSizeThreshold=64m
CMSScavengeBeforeRemark
UseConcMarkSweepGC
MaxTenuringThreshold=8
TargetSurvivorRatio=90
SurvivorRatio=4
NewRatio=2
Xms16gb
Xmn28gb

Any input on this?

How many documents per shard are recommended? Note that I use nested
documents: the total collection size is 3 billion docs, of which 600
million are parent docs and the rest are children.
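For clarity, here is how those options would look as actual JVM arguments
if set via GC_TUNE in bin/solr.in.sh (a sketch, assuming the bundled 5.x
start scripts; flag names are spelled out with their -XX: prefixes, and I
am assuming the "Xmn28gb" above should be -Xmx28g, since each process runs
a 28GB heap):

  GC_TUNE="-XX:+UseConcMarkSweepGC \
    -XX:+UseLargePages \
    -XX:+AggressiveOpts \
    -XX:+ParallelRefProcEnabled \
    -XX:+CMSParallelRemarkEnabled \
    -XX:CMSMaxAbortablePrecleanTime=6000 \
    -XX:CMSTriggerPermRatio=80 \
    -XX:CMSInitiatingOccupancyFraction=70 \
    -XX:+UseCMSInitiatingOccupancyOnly \
    -XX:CMSFullGCsBeforeCompaction=1 \
    -XX:PretenureSizeThreshold=64m \
    -XX:+CMSScavengeBeforeRemark \
    -XX:MaxTenuringThreshold=8 \
    -XX:TargetSurvivorRatio=90 \
    -XX:SurvivorRatio=4 \
    -XX:NewRatio=2"

  # heap size, set separately from the GC flags
  SOLR_JAVA_MEM="-Xms16g -Xmx28g"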
Shawn Heisey-2 wrote
> On 5/6/2015 1:58 AM, adfel70 wrote:
>> I have a cluster of 16 shards, 3 replicas. The cluster indexes nested
>> documents. It currently has 3 billion documents overall (parent and
>> children). Each shard has around 200 million docs; the size of each
>> shard is 250GB. This runs on 12 machines. Each machine has 4 SSD disks
>> and 4 Solr processes. Each process has a 28GB heap. Each machine has
>> 196GB RAM.
>>
>> I perform periodic indexing throughout the day. Each indexing cycle
>> adds around 1.5 million docs. I keep the indexing load light - 2
>> processes with bulks of 20 docs.
>>
>> My use case demands that each indexing cycle will be visible only when
>> the whole cycle finishes.
>>
>> I tried various methods of using soft and hard commits:
>
> I personally would configure autoCommit on a five-minute (maxTime of
> 300000) interval with openSearcher=false. The use case you have
> outlined (not seeing changes until the indexing is done) demands that
> you do NOT turn on autoSoftCommit, and that you do one manual commit at
> the end of indexing, which could be either a soft commit or a hard
> commit. I would recommend a soft commit.
>
> Because it is the openSearcher part of a commit that's very expensive,
> you can successfully do autoCommit with openSearcher=false on an
> interval like 10 or 15 seconds and not see much in the way of immediate
> performance loss. That commit is still not free, not only in terms of
> resources, but in terms of java heap garbage generated.
>
> The general advice with commits is to do them as infrequently as you
> can, which applies to ANY commit, not just those that make changes
> visible.
>
>> With all methods I encounter pretty much the same problem:
>> 1. Heavy GCs when a soft commit is performed (methods 1, 2) or when a
>> hard commit with openSearcher=true is performed. These GCs cause heavy
>> latency (average latency is 3 secs; latency during the problem is 80
>> secs).
>> 2. If indexing cycles come too often, so that soft commits or hard
>> commits (openSearcher=true) occur at a small interval one after another
>> (around 5-10 minutes), I start getting many OOM exceptions.
>
> If you're getting OOM, then either you need to change things so Solr
> requires less heap memory, or you need to increase the heap size.
> Changing things might mean either the config or how you use Solr.
>
> Are you tuning your garbage collection? With a 28GB heap, tuning is not
> optional. It's so important that the startup scripts in 5.0 and 5.1
> include it, even though the default max heap is 512MB.
>
> Let's do some quick math on your memory. You have four instances of
> Solr on each machine, each with a 28GB heap. That's 112GB of memory
> allocated to Java. With 196GB total, you have approximately 84GB of RAM
> left over for caching your index.
>
> A 16-shard index with three replicas means 48 cores. Divide that by 12
> machines and that's 4 replicas on each server, presumably one in each
> Solr instance. You say that the size of each shard is 250GB, so you've
> got about a terabyte of index on each server, but only 84GB of RAM for
> caching.
>
> Even with SSD, that's not going to be anywhere near enough cache memory
> for good Solr performance.
>
> All these memory issues, including GC tuning, are discussed on this
> wiki page:
>
> http://wiki.apache.org/solr/SolrPerformanceProblems
>
> One additional note: By my calculations, each filterCache entry will be
> at least 23MB in size (a filter is one bit per document, and each of
> your shards holds roughly 200 million documents). This means that if
> you are using the filterCache and the G1 collector, you will not be
> able to avoid humongous allocations, which is any allocation larger
> than half the G1 region size. The max configurable G1 region size is
> 32MB. You should use the CMS collector for your GC tuning, not G1. If
> you can reduce the number of documents in each shard, G1 might work
> well.
>
> Thanks,
> Shawn
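Just so I am sure I understand the commit policy you describe, this is
roughly what it would look like in solrconfig.xml (a sketch only):

  <updateHandler class="solr.DirectUpdateHandler2">
    <!-- hard commit every 5 minutes, without opening a new searcher -->
    <autoCommit>
      <maxTime>300000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>
    <!-- no autoSoftCommit section at all -->
  </updateHandler>

and then, once a full indexing cycle has finished, one explicit soft commit
(collection name and host below are placeholders):

  curl "http://localhost:8983/solr/mycollection/update?softCommit=true"

With openSearcher=false on the autoCommit, only that final soft commit
makes the new documents visible to searches.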
--
View this message in context: http://lucene.472066.n3.nabble.com/severe-problems-with-soft-and-hard-commits-in-a-large-index-tp4204068p4204148.html
Sent from the Solr - User mailing list archive at Nabble.com.