Ok, thanks Richard. That's good to hear.
However, I still contend that as the node count grows without bound, the
probability of at least two nodes being down somewhere in the cluster at any
given time approaches 100%.
I think of this as somewhat analogous to RAID -- I would not be comfortable
with a 144+ disk RAID 6 array, no matter the rebuild speed :)
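To put rough numbers on that claim, here is a minimal sketch. It assumes each
node is down independently with the same probability p at any instant (the
independence assumption already questioned in this thread), and the value
p = 0.001 is purely illustrative, not a measurement from any real cluster.

```python
def prob_at_least_two_down(n: int, p: float) -> float:
    """P(at least 2 of n nodes are down), assuming each node is down
    independently with probability p (an assumption, not measured data)."""
    none_down = (1 - p) ** n
    exactly_one_down = n * p * (1 - p) ** (n - 1)
    return 1 - none_down - exactly_one_down


if __name__ == "__main__":
    p = 0.001  # hypothetical per-node unavailability
    for n in (12, 144, 1000, 10000):
        print(f"{n:>6} nodes: {prob_at_least_two_down(n, p):.6f}")
```

For any fixed p > 0 the result rises toward 1 as n grows, which is the trend
I am worried about.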
Is it possible to configure or write a snitch that would create separate
distribution zones within the cluster? (e.g. 144 nodes in cluster, split
into 12 zones. Data stored to node 1 could only be replicated to one of 11
other nodes in the same distribution zone).
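A rough sketch of what such zones might buy, under the same independence
assumption as the rest of this thread. It assumes replicas of any piece of
data never leave their zone, so only two nodes down in the *same* zone can
overlap on a replica set. The function names and p = 0.001 are hypothetical
illustrations; nothing here is a Cassandra API.

```python
def prob_at_least_two_down(n: int, p: float) -> float:
    """P(at least 2 of n nodes down), each down independently with prob. p."""
    return 1 - (1 - p) ** n - n * p * (1 - p) ** (n - 1)


def prob_some_zone_has_two_down(nodes: int, zones: int, p: float) -> float:
    """P(at least one zone has 2+ nodes down), for equal-sized zones."""
    if nodes % zones != 0:
        raise ValueError("nodes must divide evenly into zones")
    per_zone = prob_at_least_two_down(nodes // zones, p)
    # Zones fail independently of each other under the stated assumption.
    return 1 - (1 - per_zone) ** zones


if __name__ == "__main__":
    p = 0.001  # hypothetical per-node unavailability
    print("no zones:", prob_at_least_two_down(144, p))
    print("12 zones:", prob_some_zone_has_two_down(144, 12, p))
```

This only counts how often two failures could overlap, not how much data is
affected each time they do.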
On Tue, Dec 11, 2012 at 3:24 AM, Richard Low wrote:
> Hi Eric,
> The time to recover one node is limited by that node, but the recovery time
> that matters most is the time to re-replicate the data that is missing from
> that node. This is the removetoken operation (called removenode in 1.2), and
> it gets faster the more nodes you have.
> On 11 December 2012 08:39, Eric Parusel wrote:
>
>> Thanks for your thoughts guys.
>> I agree that with vnodes total downtime is lessened. Although it also
>> seems that the total number of outages (however small) would be greater.
>>
>> But I think downtime is only lessened up to a certain cluster size.
>>
>> I'm thinking that as the cluster continues to grow:
>> - node rebuild time will max out (a single node only has so much write
>> bandwidth)
>> - the probability of 2 nodes being down at any given time will continue
>> to increase -- even if you consider only non-correlated failures.
>>
>> Therefore, when adding nodes beyond the point where node rebuild time
>> maxes out, both the total number of outages *and* overall downtime would
>> increase?
>>
>> On Mon, Dec 10, 2012 at 7:00 AM, Edward Capriolo wrote:
>>
>>> Assume you need to work with quorum in a non-vnode scenario. If 2 adjacent
>>> nodes in the ring are down, some number of quorum operations will fail
>>> with UnavailableException (TimeoutException right after the failures).
>>> This is because quorum will be impossible for a given range of tokens,
>>> but still possible for others.
>>> In a vnode world, if any two nodes are down, then the intersection of
>>> the vnode token ranges they hold is unavailable.
>>>
>>> I think it is two sides of the same coin.
>>>
>>> On Mon, Dec 10, 2012 at 7:41 AM, Richard Low wrote:
>>>
>>>> Hi Tyler,
>>>>
>>>> You're right, the math does assume independence, which is unlikely to be
>>>> accurate. But if you do have correlated failure modes (e.g. same power,
>>>> racks, DC), you can still use Cassandra's rack-aware or DC-aware
>>>> features to ensure replicas are spread around so your cluster can survive
>>>> the correlated failure mode. So I would expect vnodes to improve uptime in
>>>> all scenarios, but I haven't done the math to prove it.
>>>> Richard.
>>>>
