cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-6127) vnodes don't scale to hundreds of nodes
Date Tue, 12 Nov 2013 23:45:19 GMT


Jonathan Ellis commented on CASSANDRA-6127:

bq. ISTM that FD processing Gossip updates synchronously is a fundamental problem. Any hiccup
in processing will cause FD false positives.

I've pulled a fix for this out to CASSANDRA-6338.
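
To illustrate the quoted concern, here is a minimal sketch of an accrual-style failure detector. It is not Cassandra's code: the class `ToyFailureDetector`, the helper `simulate`, the exponential arrival model, and the threshold of 8 are all assumptions made for illustration. It shows how a stall in heartbeat processing, with the peer itself perfectly healthy, can raise the suspicion level past the conviction threshold.

```python
import math


class ToyFailureDetector:
    """Hypothetical, simplified accrual detector (not Cassandra's implementation)."""

    def __init__(self, threshold=8.0):
        self.threshold = threshold  # assumed conviction threshold
        self.intervals = []         # observed gaps between processed heartbeats
        self.last_seen = None       # time the last heartbeat was *processed*

    def report(self, now):
        # Called when a heartbeat is processed, not when it arrives on the wire.
        if self.last_seen is not None:
            self.intervals.append(now - self.last_seen)
        self.last_seen = now

    def phi(self, now):
        # Exponential model: P(silence >= t) = exp(-t / mean), phi = -log10(P).
        mean = sum(self.intervals) / len(self.intervals)
        return (now - self.last_seen) / mean / math.log(10)

    def is_alive(self, now):
        return self.phi(now) < self.threshold


def simulate(stall_seconds):
    """Process heartbeats once per second for 10 s, then stall before the next check."""
    fd = ToyFailureDetector()
    for t in range(0, 11):
        fd.report(float(t))
    check_time = 10.0 + stall_seconds
    return fd.phi(check_time), fd.is_alive(check_time)


if __name__ == "__main__":
    for stall in (1.0, 30.0):
        phi, alive = simulate(stall)
        print(f"stall={stall:>4}s phi={phi:.2f} alive={alive}")
```

In this toy model the detector cannot tell a peer that stopped sending heartbeats from a local thread that stopped processing them, which is the false-positive mechanism the quoted comment describes.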

> vnodes don't scale to hundreds of nodes
> ---------------------------------------
>                 Key: CASSANDRA-6127
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Any cluster that has vnodes and consists of hundreds of physical
>            Reporter: Tupshin Harper
>            Assignee: Jonathan Ellis
>         Attachments: 2013-11-05_18-04-03_no_compression_cpu_time.png, 2013-11-05_18-09-38_compression_on_cpu_time.png,
6000vnodes.patch, AdjustableGossipPeriod.patch, cpu-vs-token-graph.png, delayEstimatorUntilStatisticallyValid.patch,
> There are a lot of gossip-related issues affecting very wide clusters that also have
vnodes enabled. Let's use this ticket as a master in case there are sub-tickets.
> The most obvious symptom I've seen is with 1000 nodes in EC2 with m1.xlarge instances.
Each node was configured with 32 vnodes.
> Without vnodes, the cluster spins up fine and is ready to handle requests within 30
minutes.
> With vnodes, nodes report constant up/down flapping messages with no external
load on the cluster. After a couple of hours, they were still flapping and had very high CPU
load, and the cluster never looked like it was going to stabilize or be useful for traffic.

