lucene-solr-user mailing list archives

From Monica Skidmore <Monica.Skidm...@careerbuilder.com>
Subject Re: Load Balancing between Two Cloud Clusters
Date Mon, 30 Apr 2018 18:03:51 GMT
Thank you, Erick.  That confirms our understanding for a single cluster, or once we select
a node from one of the two clusters to query.

As we try to set up an external load balancer to go between two clusters, though, we still
have some questions.  We need a way to determine that a node is still 'alive' and should be
in the load balancer, and we need a way to know that a new node is now available and fully
ready with its replicas to add to the load balancer.
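
For the 'alive' half of that question, a minimal sketch of a probe an external load balancer could run against each node. It assumes the collection exposes Solr's ping handler at /solr/<collection>/admin/ping and that a healthy JSON response carries "status":"OK". The function names, the two-second timeout, and the decision to treat every error as "down" are illustrative assumptions, not settings taken from this thread.

```python
import json
import urllib.error
import urllib.request


def ping_says_ok(body: str) -> bool:
    """Return True if a ping response body (JSON) reports status OK."""
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "OK"


def node_is_alive(base_url: str, collection: str, timeout: float = 2.0) -> bool:
    """Probe one node; any network error or non-OK answer counts as down.

    base_url is a hypothetical value such as "http://solr-node-1:8983".
    """
    url = f"{base_url}/solr/{collection}/admin/ping?wt=json"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8")
            return resp.status == 200 and ping_says_ok(body)
    except (urllib.error.URLError, OSError):
        return False
```

A successful ping only shows that the node answered the request. It may not by itself show that every replica on a newly added node has finished recovering, so replica state probably needs a separate check before the node is added.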

How does ZooKeeper make this determination?  Does it do something different if multiple collections
are on a single cluster?  And, even with just one cluster, what is best practice for keeping
a current list of active nodes in the cluster, especially for extremely high query rates?
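
One possible approach to keeping that list, sketched under assumptions: poll the Collections API CLUSTERSTATUS action on any node and keep only the nodes that appear in live_nodes and whose replicas for the collection are all in the "active" state. As far as I understand, live_nodes reflects the ephemeral entries each Solr node registers in ZooKeeper. The response layout used here (cluster, live_nodes, collections, shards, replicas, node_name, state) is what CLUSTERSTATUS is expected to return; the function names and timeout are hypothetical.

```python
import json
import urllib.request


def ready_nodes(cluster_status: dict, collection: str) -> set:
    """Nodes that are live and whose replicas of `collection` are all active."""
    cluster = cluster_status.get("cluster", {})
    live = set(cluster.get("live_nodes", []))
    shards = cluster.get("collections", {}).get(collection, {}).get("shards", {})

    hosts_replica = set()
    not_ready = set()
    for shard in shards.values():
        for replica in shard.get("replicas", {}).values():
            node = replica.get("node_name")
            hosts_replica.add(node)
            # "recovering", "down", etc. mean the node should not take traffic yet
            if replica.get("state") != "active":
                not_ready.add(node)
    return (hosts_replica & live) - not_ready


def fetch_cluster_status(base_url: str, timeout: float = 5.0) -> dict:
    """Fetch CLUSTERSTATUS as JSON; base_url is a hypothetical node address."""
    url = f"{base_url}/solr/admin/collections?action=CLUSTERSTATUS&wt=json"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

This sketch deliberately leaves out live nodes that host no replica of the collection, even though such nodes can forward queries. Polling on an interval, rather than on every query, keeps the check off the hot path at high query rates.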

Again, if there's some good documentation on this, I'd love a pointer...

Monica Skidmore
Senior Software Engineer
 

 
On 4/30/18, 1:09 PM, "Erick Erickson" <erickerickson@gmail.com> wrote:

    Multiple clusters with the same dataset aren't load-balanced by Solr,
    you'll have to accomplish that from "outside", e.g. something that sends
    queries to each cluster.
    
    _Within_ a cluster (collection), as long as a request gets to any Solr
    node, sub-requests are distributed with an internal software LB. For a
    single collection, you're fine just sending any query to any node. Even
    if you send a query to a node that hosts no replicas for a collection, Solr
    will "do the right thing" and forward it appropriately.
    
    HTH,
    Erick
    
    On Mon, Apr 30, 2018 at 9:46 AM, Monica Skidmore <
    Monica.Skidmore@careerbuilder.com> wrote:
    
    > We are migrating from a master-slave configuration to Solr cloud (7.3) and
    > have questions about the preferred way to load balance between the two
    > clusters.  It looks like we want to use a load balancer that directs
    > queries to any of the server nodes in either cluster, trusting that node to
    > handle the query correctly – true?  If we auto-scale nodes into the
    > cluster, are there considerations about when a node becomes ‘ready’ to
    > query from a Solr perspective and when it is added to the load balancer?
    > Also, what is the preferred method of doing a health-check for the load
    > balancer – would it be “bin/solr healthcheck -c myCollection”?
    >
    >
    >
    > Pointers in the right direction – especially to any documentation on
    > running multiple clusters with the same dataset – would be appreciated.
    >
    >
    >
    > *Monica Skidmore*
    > *Senior Software Engineer*
    
