lucene-solr-user mailing list archives

From Erick Erickson <>
Subject Re: Large index recommendation
Date Fri, 13 Jan 2017 19:28:24 GMT
In any case, this is really "the sizing question" and generic answers
are not reliable. Here's a long blog about why, but the net-net is
"prototype and measure". Fortunately you can prototype with just a few
nodes (I usually want at least 2 shards) and extrapolate reasonably


On Fri, Jan 13, 2017 at 10:29 AM, Susheel Kumar <> wrote:
> As per Scott@FullStory, you should see benefits with many smaller shards rather than
> a few bigger ones. Also, upgrading to Solr 6.2 would be better, as there are many
> improvements in handling multiple shards. See the presentation below.
> Thanks,
> Susheel
> On Fri, Jan 13, 2017 at 12:56 PM, Joe Obernberger <> wrote:
>> Hi All - we've been experimenting with Solr Cloud 5.5.0 with a 27 shard
>> (no replication - each shard runs on a physical host) cluster on top of
>> HDFS.  It currently just crossed 3 billion documents indexed with an index
>> size of 16.1TBytes.  In HDFS with 3x replication this takes up 48.2TBytes.
>> Each shard is then hosting about 610GBytes of index.  The HDFS cache size
>> is very low at about 8GBytes.  Suffice it to say, performance isn't very
>> good, but again, this is for experimentation.
>> If we were to redo this, would it be better to create many shards - maybe
>> 200 with 3 replicas each (600 in all) with the goal being to withstand a
>> server going out, and future expansion as more hardware is added?  I know
>> this is a very general question.  Thanks very much in advance!
>> -Joe
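
The numbers in Joe's message can be checked with a little arithmetic. The following is a minimal sketch, not a sizing tool. The function names are made up for illustration, and it rests on three assumptions that the thread does not confirm: the index is split evenly across shards, sizes use binary units (1 TB = 1024 GB), and each Solr replica stores its own full copy of its shard in HDFS.

```python
# Figures quoted in the thread; the helper names below are hypothetical.
TOTAL_DOCS = 3_000_000_000   # "just crossed 3 billion documents"
INDEX_TB = 16.1              # total index size before HDFS replication
HDFS_REPLICATION = 3         # HDFS block replication factor
HDFS_CACHE_GB = 8            # assumed here to be the cache available to each shard


def per_shard(total_docs, index_tb, shards):
    """Return (documents, index size in GB) per shard, assuming an even split."""
    return total_docs / shards, index_tb * 1024 / shards


def raw_hdfs_tb(index_tb, solr_replicas, hdfs_replication):
    """Raw HDFS usage, assuming every Solr replica writes its own index copy."""
    return index_tb * solr_replicas * hdfs_replication


def report(label, shards, solr_replicas):
    docs, size_gb = per_shard(TOTAL_DOCS, INDEX_TB, shards)
    raw_tb = raw_hdfs_tb(INDEX_TB, solr_replicas, HDFS_REPLICATION)
    print(f"{label}: {shards} shards x {solr_replicas} replica(s) "
          f"= {shards * solr_replicas} cores")
    print(f"  per shard: {docs / 1e6:.0f}M docs, {size_gb:.0f} GB of index")
    print(f"  cache covers {HDFS_CACHE_GB / size_gb:.1%} of a shard's index")
    print(f"  raw HDFS usage: {raw_tb:.1f} TB")


if __name__ == "__main__":
    report("current", shards=27, solr_replicas=1)
    report("proposed", shards=200, solr_replicas=3)
```

Under these assumptions the current layout works out to roughly 610 GB of index per shard, in line with the figure in the message, while the proposed 200-shard layout would hold about 82 GB and 15 million documents per shard. The raw storage line shows the cost of stacking Solr replicas on top of HDFS replication. None of this replaces the "prototype and measure" advice. It only frames what a prototype shard would need to hold.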
