cassandra-user mailing list archives

From Jon Haddad <...@jonhaddad.com>
Subject Re: Cassandra Rack - Datacenter Load Balancing relations
Date Wed, 23 Oct 2019 21:12:15 GMT
Oh, my bad.  There was a flood of information there, I didn't realize you
had switched to two DCs.  It's been a long day.

I'll be honest, it's really hard to read your various options as you've
intermixed terminology from AWS and Cassandra in a weird way and there's
several pages of information here to go through.  I don't have time to
decipher it, sorry.

Spread a DC across 3 AZs if you want to be fault tolerant and will use
RF=3. Use a single AZ if you don't care about losing the full DC when an
AZ fails, or if you're not using RF=3.
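
As a rough illustration of the advice above, here is a minimal sketch of rack-aware replica placement. It is a simplified model and not Cassandra's actual NetworkTopologyStrategy code: each ring slot is a single node (no vnodes or tokens), the six-node rings are hypothetical, and the function names `place_replicas` and `surviving_replicas` are made up for this example.

```python
def place_replicas(ring, start, rf):
    """Pick `rf` replicas walking the ring clockwise from `start`.

    Simplified model: prefer one node per unused rack first, then fill
    with the next unused nodes if there are fewer racks than `rf`.
    """
    n = len(ring)
    order = [ring[(start + i) % n] for i in range(n)]
    chosen, racks_used = [], set()
    # Pass 1: take the first node found in each rack not used yet.
    for node, rack in order:
        if len(chosen) < rf and rack not in racks_used:
            chosen.append((node, rack))
            racks_used.add(rack)
    # Pass 2: fewer racks than rf, so fill with the next unused nodes.
    for node, rack in order:
        if len(chosen) < rf and (node, rack) not in chosen:
            chosen.append((node, rack))
    return chosen


def surviving_replicas(ring, rf, failed_az):
    """Worst case over all ring positions: replicas left when one AZ is lost."""
    return min(
        sum(1 for _, rack in place_replicas(ring, start, rf) if rack != failed_az)
        for start in range(len(ring))
    )


# Hypothetical rings: (node id, rack), with the rack named after its AZ.
three_az = [(1, "us-east-1a"), (2, "us-east-1b"), (3, "us-east-1c"),
            (4, "us-east-1a"), (5, "us-east-1b"), (6, "us-east-1c")]
one_az = [(i, "us-east-1a") for i in range(1, 7)]

print(surviving_replicas(three_az, 3, "us-east-1a"))  # 2
print(surviving_replicas(one_az, 3, "us-east-1a"))    # 0
```

With RF=3 over three AZs, every range keeps two of its three replicas when one AZ is lost, which is still a quorum. With a single AZ, losing that AZ takes the whole DC with it.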


On Wed, Oct 23, 2019 at 4:56 PM Sergio <lapostadisergio@gmail.com> wrote:

> OPTION C or OPTION A?
>
> Which one are you referring to?
>
> Both have separate DCs to keep the workload separate.
>
>    OPTION A)
>    Node  DC     RACK  AZ
>    1     read   ONE   us-east-1a
>    2     read   ONE   us-east-1a
>    3     read   ONE   us-east-1a
>    4     write  TWO   us-east-1b
>    5     write  TWO   us-east-1b
>    6     write  TWO   us-east-1b
>
>
> Here we have 2 DCs (read and write), with one rack per DC and one
> Availability Zone per DC.
>
> Thanks,
>
> Sergio
>
>
> On Wed, Oct 23, 2019, 1:11 PM Jon Haddad <jon@jonhaddad.com> wrote:
>
>> Personally, I wouldn't ever do this.  I recommend separate DCs if you
>> want to keep workloads separate.
>>
>> On Wed, Oct 23, 2019 at 4:06 PM Sergio <lapostadisergio@gmail.com> wrote:
>>
>>> I forgot to comment on OPTION C:
>>>
>>>    OPTION C)
>>>    Node  DC     RACK  AZ
>>>    1     read   ONE   us-east-1a
>>>    2     read   ONE   us-east-1b
>>>    3     read   ONE   us-east-1c
>>>    4     write  TWO   us-east-1a
>>>    5     write  TWO   us-east-1b
>>>    6     write  TWO   us-east-1c
>>>
>>> I would expect that I need to decrease the Consistency Level of the
>>> reads if one of the AZs goes down.
>>>
>>> Please consider the one below as the real OPTION A. The previous one
>>> looks to be wrong because the same rack is assigned to 2 different DCs.
>>>
>>>    OPTION A)
>>>    Node  DC     RACK  AZ
>>>    1     read   ONE   us-east-1a
>>>    2     read   ONE   us-east-1a
>>>    3     read   ONE   us-east-1a
>>>    4     write  TWO   us-east-1b
>>>    5     write  TWO   us-east-1b
>>>    6     write  TWO   us-east-1b
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Sergio
>>>
>>> On Wed, Oct 23, 2019 at 12:33 PM Sergio <lapostadisergio@gmail.com>
>>> wrote:
>>>
>>>> Hi Reid,
>>>>
>>>> Thank you very much for clearing up these concepts for me.
>>>> https://community.datastax.com/comments/1133/view.html I posted this
>>>> question on the DataStax forum about our cluster being unbalanced, and
>>>> the reply was that the *number of racks should be a multiple of the
>>>> replication factor*, or 1, in order to be balanced. I then thought that
>>>> if I have 3 availability zones I should have 3 racks for each
>>>> datacenter and not 2 (us-east-1b, us-east-1a) as I have right now or,
>>>> in the easiest case, a single rack for each datacenter.
>>>>
>>>>
>>>>
>>>>    Datacenter: live
>>>>    ================
>>>>    Status=Up/Down
>>>>    |/ State=Normal/Leaving/Joining/Moving
>>>>    --  Address      Load        Tokens  Owns  Host ID                               Rack
>>>>    UN  10.1.20.49   289.75 GiB  256     ?     be5a0193-56e7-4d42-8cc8-5d2141ab4872  us-east-1a
>>>>    UN  10.1.30.112  103.03 GiB  256     ?     e5108a8e-cc2f-4914-a86e-fccf770e3f0f  us-east-1b
>>>>    UN  10.1.19.163  129.61 GiB  256     ?     3c2efdda-8dd4-4f08-b991-9aff062a5388  us-east-1a
>>>>    UN  10.1.26.181  145.28 GiB  256     ?     0a8f07ba-a129-42b0-b73a-df649bd076ef  us-east-1b
>>>>    UN  10.1.17.213  149.04 GiB  256     ?     71563e86-b2ae-4d2c-91c5-49aa08386f67  us-east-1a
>>>>    DN  10.1.19.198  52.41 GiB   256     ?     613b43c0-0688-4b86-994c-dc772b6fb8d2  us-east-1b
>>>>    UN  10.1.31.60   195.17 GiB  256     ?     3647fcca-688a-4851-ab15-df36819910f4  us-east-1b
>>>>    UN  10.1.25.206  100.67 GiB  256     ?     f43532ad-7d2e-4480-a9ce-2529b47f823d  us-east-1b
>>>>
>>>>    So each rack label right now matches the availability zone, and we
>>>>    have 3 datacenters and 2 availability zones with 2 racks per DC, but
>>>>    the above is clearly unbalanced.
>>>>
>>>>    If I have a keyspace with a replication factor = 3 and I want to
>>>>    minimize the number of nodes I have to add or remove when scaling the
>>>>    cluster up and down while keeping it balanced, should I consider an
>>>>    approach like one of these?
>>>>
>>>>    OPTION A)
>>>>    Node  DC     RACK  AZ
>>>>    1     read   ONE   us-east-1a
>>>>    2     read   ONE   us-east-1a
>>>>    3     read   ONE   us-east-1a
>>>>    4     write  ONE   us-east-1b
>>>>    5     write  ONE   us-east-1b
>>>>    6     write  ONE   us-east-1b
>>>>
>>>>    OPTION B)
>>>>    Node  DC     RACK  AZ
>>>>    1     read   ONE   us-east-1a
>>>>    2     read   ONE   us-east-1a
>>>>    3     read   ONE   us-east-1a
>>>>    4     write  TWO   us-east-1b
>>>>    5     write  TWO   us-east-1b
>>>>    6     write  TWO   us-east-1b
>>>>    *7    read   ONE   us-east-1c*
>>>>    *8    write  TWO   us-east-1c*
>>>>    *9    read   ONE   us-east-1c*
>>>>
>>>>    Option B looks to be unbalanced and I would exclude it.
>>>>
>>>>    OPTION C)
>>>>    Node  DC     RACK  AZ
>>>>    1     read   ONE   us-east-1a
>>>>    2     read   ONE   us-east-1b
>>>>    3     read   ONE   us-east-1c
>>>>    4     write  TWO   us-east-1a
>>>>    5     write  TWO   us-east-1b
>>>>    6     write  TWO   us-east-1c
>>>>
>>>>
>>>>    So I am thinking of A if I have the restriction of 2 AZs, but I guess
>>>>    that option C would be the best. If I have to add another DC for
>>>>    reads, because we want to assign a new DC to each new microservice,
>>>>    it would look like:
>>>>
>>>>    OPTION EXTRA DC For Reads
>>>>    Node  DC          RACK   AZ
>>>>    1     read        ONE    us-east-1a
>>>>    2     read        ONE    us-east-1b
>>>>    3     read        ONE    us-east-1c
>>>>    4     write       TWO    us-east-1a
>>>>    5     write       TWO    us-east-1b
>>>>    6     write       TWO    us-east-1c
>>>>    7     extra-read  THREE  us-east-1a
>>>>    8     extra-read  THREE  us-east-1b
>>>>    9     extra-read  THREE  us-east-1c
>>>>
>>>>    The DC for *write* will replicate the data to the other
>>>>    datacenters. My goal is to keep the *read* machines dedicated to
>>>>    serving reads and the *write* machines dedicated to serving writes.
>>>>    Cassandra will handle the replication for me. Is there any other
>>>>    option that I am missing, or any wrong assumption? I am thinking that
>>>>    I will write a blog post about all my learnings so far. Thank you
>>>>    very much for the replies.
>>>>
>>>>    Best,
>>>>    Sergio
>>>>
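
To put numbers on the imbalance in the `nodetool status` output quoted in the message above, the following sketch groups the reported loads by rack. The figures are copied from that output; the helper name `summarize_by_rack` is made up for this example, and the node shown as DN may be reporting a stale load.

```python
from collections import defaultdict

# (load in GiB, rack) copied from the nodetool status output in the thread.
nodes = [
    (289.75, "us-east-1a"),
    (103.03, "us-east-1b"),
    (129.61, "us-east-1a"),
    (145.28, "us-east-1b"),
    (149.04, "us-east-1a"),
    (52.41, "us-east-1b"),   # this node was reported as DN (down)
    (195.17, "us-east-1b"),
    (100.67, "us-east-1b"),
]


def summarize_by_rack(nodes):
    """Return node count, total load and mean load per rack."""
    loads = defaultdict(list)
    for load, rack in nodes:
        loads[rack].append(load)
    return {
        rack: {
            "nodes": len(values),
            "total_gib": sum(values),
            "mean_gib": sum(values) / len(values),
        }
        for rack, values in loads.items()
    }


summary = summarize_by_rack(nodes)
for rack, stats in sorted(summary.items()):
    print(f"{rack}: {stats['nodes']} nodes, "
          f"total {stats['total_gib']:.2f} GiB, "
          f"mean {stats['mean_gib']:.2f} GiB per node")
```

The two racks hold similar totals, but us-east-1a spreads its total over 3 nodes while us-east-1b spreads it over 5, so the per-node load differs noticeably. Load also depends on compaction state and other factors, so this is only a coarse view.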
>>>>
>>>> On Wed, Oct 23, 2019 at 10:57 AM Reid Pinchback <
>>>> rpinchback@tripadvisor.com> wrote:
>>>>
>>>>> No, that’s not correct.  The point of racks is to help you distribute
>>>>> the replicas, not further-replicate the replicas.  Data centers are what
>>>>> do the latter.  So for example, if you wanted to be able to ensure that
>>>>> you always had quorum if an AZ went down, then you could have two DCs
>>>>> where one was in each AZ, and use one rack in each DC.  In your situation
>>>>> I think I’d be more tempted to consider that.  Then if an AZ went away,
>>>>> you could fail over your traffic to the remaining DC and still be
>>>>> perfectly fine.
>>>>>
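
The quorum reasoning in the paragraph above can be sketched with a little arithmetic. This is a minimal sketch under assumptions: RF=3 in every DC (as elsewhere in the thread), the dictionaries are hypothetical per-range replica counts, and `live_after_az_loss` is a made-up helper name. The quorum size itself, floor(RF / 2) + 1, is the standard definition.

```python
def quorum(rf: int) -> int:
    """Replicas that must answer at QUORUM or LOCAL_QUORUM for a given RF."""
    return rf // 2 + 1


def live_after_az_loss(replicas_per_az: dict, failed_az: str) -> int:
    """Replicas of one token range still reachable after `failed_az` is lost."""
    return sum(n for az, n in replicas_per_az.items() if az != failed_az)


RF = 3

# Hypothetical layout 1: two DCs, each confined to one AZ with one rack.
dc_in_1a = {"us-east-1a": RF}
dc_in_1b = {"us-east-1b": RF}

# Hypothetical layout 2: one DC over two AZs used as two racks. With RF=3
# and only two racks, some range has 2 replicas in one rack (worst case).
dc_split_worst = {"us-east-1a": 2, "us-east-1b": 1}

for name, layout in [("DC in 1a", dc_in_1a),
                     ("DC in 1b", dc_in_1b),
                     ("DC split over 1a/1b", dc_split_worst)]:
    live = live_after_az_loss(layout, "us-east-1a")
    print(f"{name}: {live} live replicas, "
          f"local quorum possible: {live >= quorum(RF)}")
```

In the two-DC layout, the DC in the surviving AZ keeps all of its replicas, so traffic can fail over to it. In the split layout, the worst-case range is left with a single replica, which is below quorum.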
>>>>>
>>>>>
>>>>> For background on replicas vs racks, I believe the information you
>>>>> want is under the heading ‘NetworkTopologyStrategy’ at:
>>>>>
>>>>> http://cassandra.apache.org/doc/latest/architecture/dynamo.html
>>>>>
>>>>>
>>>>>
>>>>> That should help you better understand how replicas distribute.
>>>>>
>>>>>
>>>>>
>>>>> As mentioned before, while you can choose to do the reads in one DC,
>>>>> except for concerns about contention related to network traffic and
>>>>> connection handling, you can’t isolate reads from writes.  You can
>>>>> _*mostly*_ insulate the write DC from the activity within the read DC,
>>>>> and even that isn’t an absolute because of repairs.  However, your
>>>>> mileage may vary, so do what makes sense for your usage pattern.
>>>>>
>>>>>
>>>>>
>>>>> R
>>>>>
>>>>>
>>>>>
>>>>> *From: *Sergio <lapostadisergio@gmail.com>
>>>>> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
>>>>> *Date: *Wednesday, October 23, 2019 at 12:50 PM
>>>>> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
>>>>> *Subject: *Re: Cassandra Rack - Datacenter Load Balancing relations
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Hi Reid,
>>>>>
>>>>> Thanks for your reply. I really appreciate your explanation.
>>>>>
>>>>> We are in AWS and right now we are using 2 Availability Zones, not 3.
>>>>> We found our cluster really unbalanced because the keyspace has a
>>>>> replication factor = 3 and the number of racks is 2 with 2 datacenters.
>>>>> We want the writes spread across all the nodes, but we wanted the reads
>>>>> isolated from the writes to keep the load on those nodes low and to be
>>>>> able to identify problems in the consumer (read) or producer (write)
>>>>> applications.
>>>>> It looks like each rack contains an entire copy of the data, so the
>>>>> data would be replicated for each rack and then for each node. If I am
>>>>> correct, a keyspace with 100GB, Replication Factor = 3 and RACKS = 3
>>>>> would take 100 * 3 * 3 = 900GB.
>>>>> If I had only one rack across 2 or even 3 availability zones, I would
>>>>> save space and need only 300GB. Please correct me if I am wrong.
>>>>>
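
Reid's reply, quoted earlier in this thread, says racks distribute replicas rather than replicate them further. A minimal sketch of the corrected arithmetic follows. It ignores compression, compaction overhead and snapshots, and the function name `cluster_footprint_gb` and the DC names are assumptions made for this example.

```python
def cluster_footprint_gb(data_gb: float, rf_per_dc: dict) -> float:
    """Raw storage needed: data size times the sum of per-DC replication factors.

    The number of racks does not appear in the formula: racks only decide
    which nodes hold the replicas, not how many replicas exist.
    """
    return data_gb * sum(rf_per_dc.values())


# 100 GB keyspace, one DC with RF=3: 300 GB whether it has 1 rack or 3.
print(cluster_footprint_gb(100, {"dc1": 3}))

# Two DCs, each with RF=3 (for example a read DC and a write DC): 600 GB.
print(cluster_footprint_gb(100, {"read": 3, "write": 3}))
```

Under these assumptions the 100 GB keyspace needs about 300 GB in a single DC, not 900 GB, and adding a second DC with RF=3 adds another 300 GB.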
>>>>> Best,
>>>>>
>>>>> Sergio
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Oct 23, 2019 at 9:21 AM Reid Pinchback <
>>>>> rpinchback@tripadvisor.com> wrote:
>>>>>
>>>>> Datacenters and racks are different concepts.  While they don't have
>>>>> to be associated with their historical meanings, the historical meanings
>>>>> probably provide a helpful model for understanding what you want from
>>>>> them.
>>>>> When companies own their own physical servers and have them housed
>>>>> somewhere, the questions arise on where you want to locate any particular
>>>>> server.  It's a balancing act on things like network speed of related
>>>>> servers being able to talk to each other, versus fault-tolerance of having
>>>>> many servers not all exposed to the same risks.
>>>>>
>>>>> "Same rack" in that physical world tended to mean something like "all
>>>>> behind the same network switch and all sharing the same power bus".
>>>>> The morning after an electrical glitch fries a power bus and thus
>>>>> everything in that rack, you realize you wished you didn't have so many
>>>>> of the same type of server together.  Well, they were servers.  Now they
>>>>> are door stops.  Badness and sadness.
>>>>>
>>>>> That's kind of the mindset to have in mind with racks in Cassandra.
>>>>> It's an artifact for you to separate servers into pools so that the
>>>>> disparate pools have hopefully somewhat independent infrastructure risks.
>>>>> However, all those servers are still doing the same kind of work, are
>>>>> the same version, etc.
>>>>>
>>>>> Datacenters are amalgams of those racks, and how similar or different
>>>>> they are from each other depends on what you want to do with them.  What
>>>>> is true is that if you have N datacenters, each one of them must have
>>>>> enough disk storage to house all the data.  The actual physical footprint
>>>>> of that data in each DC depends on the replication factors in play.
>>>>>
>>>>> Note that you sorta can't have "one datacenter for writes" because the
>>>>> writes will replicate across the data centers.  You could definitely
>>>>> choose to have only one that takes read queries, but best to think of
>>>>> writing as being universal.  One scenario you can have is where the DC
>>>>> not taking live traffic read queries is the one you use for maintenance
>>>>> or performance testing or version upgrades.
>>>>>
>>>>> One rack makes your life easier if you don't have a reason for
>>>>> multiple racks. It depends on the environment you deploy into and your
>>>>> fault tolerance goals.  If you were in AWS and wanting to spread risk
>>>>> across availability zones, then you would likely have as many racks as
>>>>> AZs you choose to be in, because that's really the point of using
>>>>> multiple AZs.
>>>>>
>>>>> R
>>>>>
>>>>>
>>>>> On 10/23/19, 4:06 AM, "Sergio Bilello" <lapostadisergio@gmail.com>
>>>>> wrote:
>>>>>
>>>>>
>>>>>     Hello guys!
>>>>>
>>>>>     I was reading about
>>>>> https://cassandra.apache.org/doc/latest/architecture/dynamo.html#networktopologystrategy
>>>>>
>>>>>     I would like to understand a concept related to the node load
>>>>> balancing.
>>>>>
>>>>>     I know that Jon recommends vnodes = 4, but right now I have found a
>>>>> cluster with vnodes = 256, replication factor = 3 and 2 racks. This is
>>>>> unbalanced because the number of racks is not a multiple of the
>>>>> replication factor.
>>>>>
>>>>>     However, my plan is to move all the nodes into a single rack so that
>>>>> I can eventually scale the cluster up and down one node at a time.
>>>>>
>>>>>     If I had 3 racks and wanted to keep things balanced, I would have to
>>>>> scale up 3 nodes at a time, one for each rack.
>>>>>
>>>>>     If I had 3 racks, should I also have 3 different datacenters, so
>>>>> one datacenter for each rack?
>>>>>
>>>>>     Can I have 2 datacenters and 3 racks? If this is possible, would one
>>>>> datacenter have more nodes than the other? Could it be a problem?
>>>>>
>>>>>     I am thinking of splitting my cluster into one datacenter for reads
>>>>> and one for writes, and keeping all the nodes in the same rack so I can
>>>>> scale up one node at a time.
>>>>>
>>>>>
>>>>>
>>>>>     Please correct me if I am wrong
>>>>>
>>>>>
>>>>>
>>>>>     Thanks,
>>>>>
>>>>>
>>>>>
>>>>>     Sergio
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
