cassandra-user mailing list archives

From Andrew Bialecki <andrew.biale...@klaviyo.com>
Subject Re: determining the cause of a high CPU / disk util node
Date Sun, 03 Sep 2017 15:15:02 GMT
Fay, what do you mean by "partition key data is on one node"? Shouldn't a
write request with RF=3 be fulfillable by any of three nodes?
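
For context on the RF=3 question, here is a minimal sketch of how a partition key maps to replicas on a token ring. It is a simplification under stated assumptions: MD5 stands in for Cassandra's default Murmur3 partitioner, each node gets a single token, placement ignores racks and AZs, and the node names and key are made up. The point it illustrates is that every write for a given partition key lands on the same three nodes, so a very hot key concentrates load on that replica set instead of spreading it across the cluster.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a 64-bit ring position (MD5 stand-in, not Murmur3)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def build_ring(nodes):
    """Give each node one token and return the ring sorted by token."""
    return sorted((token(n), n) for n in nodes)


def replicas(key, ring, rf=3):
    """Walk clockwise from the key's token and collect rf distinct nodes."""
    tokens = [t for t, _ in ring]
    # First node whose token is >= the key's token; wraps around via modulo.
    start = bisect.bisect_left(tokens, token(key))
    distinct = len({n for _, n in ring})
    owners = []
    i = start
    while len(owners) < min(rf, distinct):
        node = ring[i % len(ring)][1]
        if node not in owners:
            owners.append(node)
        i += 1
    return owners


# Hypothetical six-node cluster and key, for illustration only.
ring = build_ring([f"node-{i}" for i in range(6)])
print(replicas("hot-counter-key", ring))
```

Calling `replicas` repeatedly with the same key always returns the same three nodes, which is why a skewed key distribution shows up as a few overloaded nodes.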

I do think we have a "hot key"; we're working on tracking that down.
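
One way to look for a hot key, sketched under the assumption that partition keys can be sampled at the application layer before writes are sent (the `hot_keys` helper and the sample data are hypothetical, not part of any Cassandra or driver API):

```python
from collections import Counter


def hot_keys(sampled_keys, top_n=5):
    """Return (key, count, share_of_total) for the most frequent keys."""
    counts = Counter(sampled_keys)
    total = sum(counts.values())
    return [(k, c, c / total) for k, c in counts.most_common(top_n)]


# Made-up sample in which one key dominates the write stream.
sample = ["acct:42"] * 90 + ["acct:7"] * 6 + ["acct:9"] * 4
for key, count, share in hot_keys(sample, top_n=3):
    print(f"{key}\t{count}\t{share:.0%}")
```

A key that accounts for a large share of sampled writes is a candidate hot partition worth checking against the replicas of the overloaded node.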

On Sat, Sep 2, 2017 at 11:30 PM, Fay Hou [Storage Service] <
fayhou@coupang.com> wrote:

> Most likely related to poor data modeling. The partition key data is on
> one node. Worth checking the queries and table design.
>
> On Sep 2, 2017 5:48 PM, Andrew Bialecki <andrew.bialecki@klaviyo.com>
> wrote:
>
> We're running Cassandra 3.7 on AWS, different AZs, same region. The
> columns are counters and the workload is 95% writes, but of course those
> involve a local read and write because they're counters.
>
> We have a node with much higher CPU load than others under heavy write
> volume. That node is at 100% disk utilization / high iowait. According to
> iostat, the IO load is primarily reads (95%) rather than writes, in terms
> of both requests and bytes. Below is a graph of the CPU.
>
> Any ideas on how we could diagnose what is causing so much IO vs. other
> nodes?
>
> Also, we're not sure why this node in particular is hot and not the other
> two "replica" nodes (we use RF = 3). We're using the DataStax driver and are
> looking into the load balancing policy to see if that's an issue.
>
> [image: Inline image 1]
>
> --
> Andrew Bialecki
> Klaviyo
>
>
>


-- 
Andrew Bialecki

<https://www.klaviyo.com/>
