lucene-solr-user mailing list archives

From Timothy Potter <thelabd...@gmail.com>
Subject Re: Planning ahead for Solr Cloud and Scaling
Date Wed, 09 Jul 2014 14:58:54 GMT
Hi Zane,

re 1: as an alternative to shard splitting, you can overshard the
collection from the start and then migrate existing shards to new
hardware as needed. The migration can happen online; see the
Collections API ADDREPLICA action. Once the new replica is active on
the new hardware, you can unload the older replica on your original
hardware. There are other benefits to oversharding, such as increased
parallelism during indexing and query execution (provided you have
the CPU capacity, which is typically the case on modern hardware).
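
A rough sketch of the requests involved, assuming a collection named
"products" with 8 shards, an original host at 10.0.0.1 and a new host at
10.0.0.2 (the collection name, hosts, shard count and core name are all
made up for illustration). It only builds the URLs; send them with
whatever HTTP client you prefer, and check the parameters against the
docs for your Solr version.

```python
from urllib.parse import urlencode

# Hypothetical hosts; substitute your own.
OLD_HOST = "http://10.0.0.1:8983/solr"
NEW_HOST = "http://10.0.0.2:8983/solr"


def collections_api_url(base, action, **params):
    """Build a Collections API URL for the given action."""
    return f"{base}/admin/collections?{urlencode({'action': action, **params})}"


def core_admin_url(base, action, **params):
    """Build a CoreAdmin API URL for the given action."""
    return f"{base}/admin/cores?{urlencode({'action': action, **params})}"


# 1. Overshard up front: 8 shards on one node (shard count is an assumption).
create_url = collections_api_url(
    OLD_HOST, "CREATE",
    name="products", numShards=8, replicationFactor=1, maxShardsPerNode=8,
)

# 2. Later, add a replica of one shard on the new hardware.
add_replica_url = collections_api_url(
    OLD_HOST, "ADDREPLICA",
    collection="products", shard="shard5", node="10.0.0.2:8983_solr",
)

# 3. Once the new replica is active, unload the old core on the original host.
#    The core name here is hypothetical; look up the real one in the admin UI.
unload_url = core_admin_url(OLD_HOST, "UNLOAD", core="products_shard5_replica1")

for url in (create_url, add_replica_url, unload_url):
    print(url)
```

Wait for the new replica to show as active in the cluster state before
issuing step 3, otherwise the shard is briefly left without a live copy.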

re 2: that mainly depends on how the Java GC and heap are affected by
colocating the cores in the same JVM ... if the heap is stable, the GC
is keeping up, and QPS and latency are acceptable, I wouldn't
change it.

re 3: read Trey's chapter 14 in Solr in Action ;-)

Cheers,
Tim

On Tue, Jul 8, 2014 at 10:09 PM, Zane Rockenbaugh <zane@navigo.com> wrote:
> I'm working on a product hosted with AWS that uses Elastic Beanstalk
> auto-scaling to good effect and we are trying to set up similar (more or
> less) runtime scaling support with Solr. I think I understand how to set
> this up, and wanted to check I was on the right track.
>
> We currently run 3 cores on a single host / Solr server / shard. This is
> just fine for now, and we have overhead for the near future. However, I
> need to have a plan, and then test, for a higher capacity future.
>
> 1) I gather that if I set up SolrCloud, and then later load increases, I
> can spin up a second host / Solr server, create a new shard, and then split
> the first shard:
>
> https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api3
>
> And doing this, we no longer have to commit to shards out of the gate.
>
> 2) I'm not clear whether there's a big advantage splitting up the cores or
> not. Two of the three cores will have about the same number of documents,
> though only one contains large amounts of text. The third core is much
> smaller in both bytes and documents (2 orders of magnitude).
>
> 3) We are also looking at moving multi-lingual. The current plan is to
> store the localized text in fields within the same core. The languages will
> be added over time. We can update the schema (as each will be optional).
> This seems easier than adding a core for each language. Is there a downside?
>
> Thanks for any pointers.
