Date: Wed, 6 May 2015 07:55:12 -0700 (MST)
From: adfel70
To: solr-user@lucene.apache.org
Message-ID: <1430924112131-4204148.post@n3.nabble.com>
In-Reply-To: <554A1785.5040905@elyograg.org>
References: <1430899124657-4204068.post@n3.nabble.com> <554A1785.5040905@elyograg.org>
Subject: Re: severe problems with soft and hard commits in a large index

Thank you for the detailed answer.

How can I decrease the impact of opening a searcher on such a large index,
especially the heap usage that leads to OOM?

Regarding GC tuning - I am doing that. Here are the params I use (written
out as full JVM options in the sketch below):

AggressiveOpts
UseLargePages
ParallelRefProcEnabled
CMSParallelRemarkEnabled
CMSMaxAbortablePrecleanTime=6000
CMSTriggerPermRatio=80
CMSInitiatingOccupancyFraction=70
UseCMSInitiatingOccupancyOnly
CMSFullGCsBeforeCompaction=1
PretenureSizeThreshold=64m
CMSScavengeBeforeRemark
UseConcMarkSweepGC
MaxTenuringThreshold=8
TargetSurvivorRatio=90
SurvivorRatio=4
NewRatio=2
Xms16gb
Xmn28gb

Any input on this?

How many documents per shard are recommended? Note that I use nested
documents: the total collection size is 3 billion docs, of which 600
million are parent docs and the rest are children.
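For clarity, here is how those options would look as actual JVM arguments
if set via GC_TUNE in bin/solr.in.sh (a sketch, assuming the bundled 5.x
start scripts; flag names are spelled out with their -XX: prefixes, and I
am assuming the "Xmn28gb" above should be -Xmx28g, since each process runs
a 28GB heap):

  GC_TUNE="-XX:+UseConcMarkSweepGC \
    -XX:+UseLargePages \
    -XX:+AggressiveOpts \
    -XX:+ParallelRefProcEnabled \
    -XX:+CMSParallelRemarkEnabled \
    -XX:CMSMaxAbortablePrecleanTime=6000 \
    -XX:CMSTriggerPermRatio=80 \
    -XX:CMSInitiatingOccupancyFraction=70 \
    -XX:+UseCMSInitiatingOccupancyOnly \
    -XX:CMSFullGCsBeforeCompaction=1 \
    -XX:PretenureSizeThreshold=64m \
    -XX:+CMSScavengeBeforeRemark \
    -XX:MaxTenuringThreshold=8 \
    -XX:TargetSurvivorRatio=90 \
    -XX:SurvivorRatio=4 \
    -XX:NewRatio=2"

  # heap size, set separately from the GC flags
  SOLR_JAVA_MEM="-Xms16g -Xmx28g"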
Shawn Heisey-2 wrote
> On 5/6/2015 1:58 AM, adfel70 wrote:
>> I have a cluster of 16 shards, 3 replicas. The cluster indexes nested
>> documents. It currently has 3 billion documents overall (parent and
>> children). Each shard has around 200 million docs; the size of each
>> shard is 250GB. This runs on 12 machines. Each machine has 4 SSD disks
>> and 4 Solr processes. Each process has a 28GB heap. Each machine has
>> 196GB RAM.
>>
>> I perform periodic indexing throughout the day. Each indexing cycle
>> adds around 1.5 million docs. I keep the indexing load light - 2
>> processes with bulks of 20 docs.
>>
>> My use case demands that each indexing cycle will be visible only when
>> the whole cycle finishes.
>>
>> I tried various methods of using soft and hard commits:
>
> I personally would configure autoCommit on a five-minute (maxTime of
> 300000) interval with openSearcher=false. The use case you have
> outlined (not seeing changes until the indexing is done) demands that
> you do NOT turn on autoSoftCommit, and that you do one manual commit at
> the end of indexing, which could be either a soft commit or a hard
> commit. I would recommend a soft commit.
>
> Because it is the openSearcher part of a commit that's very expensive,
> you can successfully do autoCommit with openSearcher=false on an
> interval like 10 or 15 seconds and not see much in the way of immediate
> performance loss. That commit is still not free, not only in terms of
> resources, but in terms of java heap garbage generated.
>
> The general advice with commits is to do them as infrequently as you
> can, which applies to ANY commit, not just those that make changes
> visible.
>
>> With all methods I encounter pretty much the same problem:
>> 1. Heavy GCs when a soft commit is performed (methods 1, 2) or when a
>> hard commit with openSearcher=true is performed. These GCs cause heavy
>> latency (average latency is 3 secs; latency during the problem is 80
>> secs).
>> 2. If indexing cycles come too often, so that soft commits or hard
>> commits (openSearcher=true) occur at a small interval one after another
>> (around 5-10 minutes), I start getting many OOM exceptions.
>
> If you're getting OOM, then either you need to change things so Solr
> requires less heap memory, or you need to increase the heap size.
> Changing things might mean either the config or how you use Solr.
>
> Are you tuning your garbage collection? With a 28GB heap, tuning is not
> optional. It's so important that the startup scripts in 5.0 and 5.1
> include it, even though the default max heap is 512MB.
>
> Let's do some quick math on your memory. You have four instances of
> Solr on each machine, each with a 28GB heap. That's 112GB of memory
> allocated to Java. With 196GB total, you have approximately 84GB of RAM
> left over for caching your index.
>
> A 16-shard index with three replicas means 48 cores. Divide that by 12
> machines and that's 4 replicas on each server, presumably one in each
> Solr instance. You say that the size of each shard is 250GB, so you've
> got about a terabyte of index on each server, but only 84GB of RAM for
> caching.
>
> Even with SSD, that's not going to be anywhere near enough cache memory
> for good Solr performance.
>
> All these memory issues, including GC tuning, are discussed on this
> wiki page:
>
> http://wiki.apache.org/solr/SolrPerformanceProblems
>
> One additional note: By my calculations, each filterCache entry will be
> at least 23MB in size (a filter is one bit per document, and each of
> your shards holds roughly 200 million documents). This means that if
> you are using the filterCache and the G1 collector, you will not be
> able to avoid humongous allocations, which is any allocation larger
> than half the G1 region size. The max configurable G1 region size is
> 32MB. You should use the CMS collector for your GC tuning, not G1. If
> you can reduce the number of documents in each shard, G1 might work
> well.
>
> Thanks,
> Shawn
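Just so I am sure I understand the commit policy you describe, this is
roughly what it would look like in solrconfig.xml (a sketch only):

  <updateHandler class="solr.DirectUpdateHandler2">
    <!-- hard commit every 5 minutes, without opening a new searcher -->
    <autoCommit>
      <maxTime>300000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>
    <!-- no autoSoftCommit section at all -->
  </updateHandler>

and then, once a full indexing cycle has finished, one explicit soft commit
(collection name and host below are placeholders):

  curl "http://localhost:8983/solr/mycollection/update?softCommit=true"

With openSearcher=false on the autoCommit, only that final soft commit
makes the new documents visible to searches.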
--
View this message in context: http://lucene.472066.n3.nabble.com/severe-problems-with-soft-and-hard-commits-in-a-large-index-tp4204068p4204148.html
Sent from the Solr - User mailing list archive at Nabble.com.