lucene-solr-user mailing list archives

From Upayavira ...@odoko.co.uk>
Subject Re: SolrCloud Shard + Replica on Multiple servers with SolrCloud
Date Tue, 01 Dec 2015 15:11:55 GMT
Answers inline

On Tue, Dec 1, 2015, at 06:03 AM, Adrian Liew wrote:
> Hi all,
> 
> I would really like to hear anyone's opinion on my query below. I am
> desperate to know if this is possible, or if someone is keen to share
> their thoughts or experience.
> 
> Best regards,
> Adrian
> 
> 
> -----Original Message-----
> From: Adrian Liew [mailto:adrian.liew@avanade.com] 
> Sent: Saturday, November 28, 2015 10:38 AM
> To: solr-user@lucene.apache.org
> Subject: RE: SolrCloud Shard + Replica on Multiple servers with SolrCloud
> 
> Hi Upaya,
> 
> I am trying to set up a 3-shard, 3-server cluster with a replication
> factor of 2 with SolrCloud 5.3.0.
> 
> In particular, I am trying to follow the setup described in this blog:
> http://lucidworks.com/blog/2014/06/03/introducing-the-solr-scale-toolkit/
> 
> Correction to description below:
> 
> EC2 Instance 1
> 
> Shard 1 - Leader (port 8984, separate drive with 50 GB SSD)
> Shard 2 - Leader (port 8985, separate drive with 50 GB SSD)
> Shard 3 - Leader (port 8986, separate drive with 50 GB SSD)
> 
> EC2 Instance 2
> 
> Shard 1 - Replica (port 8984, separate drive with 50 GB SSD)
> Shard 2 - Replica (port 8985, separate drive with 50 GB SSD)
> Shard 3 - Replica (port 8986, separate drive with 50 GB SSD)
> 
> EC2 Instance 3
> 
> Shard 1 - Replica (port 8984, separate drive with 50 GB SSD)
> Shard 2 - Replica (port 8985, separate drive with 50 GB SSD)
> Shard 3 - Replica (port 8986, separate drive with 50 GB SSD)
> 
> To your questions
> 
> >>  Why are you running multiple instances on the same host? 
> This was the architecture best practice provided by Lucidworks. For more
> info, you can visit this site,
> http://lucidworks.com/blog/2014/06/03/introducing-the-solr-scale-toolkit/

It seems they use two instances on a node because nodes get two free
40 GB SSD drives. Beyond that, they don't describe the reasoning.

> >> You can host your two replicas inside the same Solr instance.
> I reckon this is because it reduces the chance of a single shard (its
> leader and replicas) going down in one hit. What happens if the one node
> that holds a shard goes down altogether? You will lose a chunk of your
> index. The architecture I mentioned above prevents that from happening. I
> want my shards to be spread out for HA.

There is no real point hosting two replicas of the same shard on the
same node. Other than that, I'm not sure I see a huge benefit (beyond
the SSD one) of having multiple instances per node.
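
To illustrate the point about co-locating replicas, here is a minimal Python sketch. It is not from the thread: the host names, shard names, and both helper functions are hypothetical. It lists the shards that would have no copy left after one host fails. A second replica on the same host does not add a distinct host, so it gives no extra protection against losing that host.

```python
from collections import defaultdict

def hosts_per_shard(placement):
    """Map each shard to the set of distinct hosts holding a copy of it.

    `placement` is a list of (host, shard) pairs, one pair per replica.
    """
    hosts = defaultdict(set)
    for host, shard in placement:
        hosts[shard].add(host)
    return dict(hosts)

def shards_lost_if_down(placement, failed_host):
    """Return the shards that have no copy left once `failed_host` is gone."""
    return sorted(
        shard
        for shard, hosts in hosts_per_shard(placement).items()
        if hosts <= {failed_host}
    )

# Hypothetical layout: both copies of shard1 live on host-a.
colocated = [
    ("host-a", "shard1"), ("host-a", "shard1"),
    ("host-b", "shard2"), ("host-c", "shard2"),
]
# Same number of replicas, but spread across hosts.
spread = [
    ("host-a", "shard1"), ("host-b", "shard1"),
    ("host-b", "shard2"), ("host-c", "shard2"),
]

print(shards_lost_if_down(colocated, "host-a"))
print(shards_lost_if_down(spread, "host-a"))
```

In the co-located layout, losing host-a takes shard1 offline entirely; in the spread layout, every shard still has a copy on another host.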

> >> Also, you should not concern yourself (too much) with which node is the
> >> leader as that can change through time.
> I am not concerned, as I know this setup will guarantee a leader is in
> place for each shard for fault tolerance.

Okay.

> >> How have you come to the conclusion that you need to shard?
> I am preparing a use case for my customer. I haven't yet worked out when
> to shard, but I need to set up a demo to show to my customer. I am
> proposing this to them as a long-term architecture.

Okay.
 
> > As far as I know, there are two approaches to sharding: "Custom
> > Sharding" and "Automatic Sharding". Which approach suits the use case
> > described above?
> Do you know this answer?

Generally, I would use the inbuilt sharding functionality, unless you
come up with a good reason why it doesn't work for you.
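
As a rough sketch of what using the inbuilt sharding looks like in practice, the Python snippet below builds a Collections API CREATE request that uses the default `compositeId` router, which lets Solr hash document IDs to shards. The base URL, the collection name `demo`, and the `create_collection_url` helper are assumptions for illustration only, and it assumes a config set is already available to the cluster. The shard and replica counts follow the numbers stated earlier in the thread.

```python
from urllib.parse import urlencode

def create_collection_url(base_url, name, num_shards, replication_factor,
                          max_shards_per_node):
    """Build a Collections API CREATE URL that relies on automatic sharding."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        # Allow more than one core per node so all replicas fit on 3 nodes.
        "maxShardsPerNode": max_shards_per_node,
        # compositeId is the default router; it is spelled out here for clarity.
        "router.name": "compositeId",
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"

# Hypothetical host and collection name: 3 shards x 2 replicas = 6 cores,
# which needs at least 2 cores per node on a 3-node cluster.
url = create_collection_url("http://localhost:8983/solr", "demo", 3, 2, 2)
print(url)
```

Custom sharding would instead use `router.name=implicit` with an explicit list of shard names, which leaves routing decisions to the indexing client.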
 
> Do you also have your own opinion on setting up a 3 shard 3 server
> cluster? 

I guess, if you have three shards, then you obviously want one shard per
server. You could just have a replica of each shard on each of your
servers; that way you have 9 cores in total, three per node. But that
wouldn't make straightforward use of your two SSDs per instance.
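
The arithmetic behind that layout can be sketched in a few lines of Python. The node and shard names and the `full_replica_layout` helper are hypothetical; the sketch simply places one replica of every shard on every node and counts the resulting cores.

```python
from collections import Counter

def full_replica_layout(nodes, shards):
    """Place one replica of every shard on every node."""
    return [(node, shard) for node in nodes for shard in shards]

nodes = ["node1", "node2", "node3"]
shards = ["shard1", "shard2", "shard3"]

layout = full_replica_layout(nodes, shards)
cores_per_node = Counter(node for node, _ in layout)

print(len(layout))           # total cores in the cluster
print(dict(cores_per_node))  # cores hosted by each node
```

With three nodes and three shards this gives 9 cores, three per node, and any single node can fail while every shard still has two copies elsewhere.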

Upayavira
