cassandra-commits mailing list archives

From "Ariel Weisberg (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-8457) nio MessagingService
Date Fri, 02 Jan 2015 18:09:35 GMT


Ariel Weisberg commented on CASSANDRA-8457:

I can't get performance counters for cache behavior on EC2, as far as I can tell, and I don't
have a good answer for why I get the performance numbers I am seeing.

I ran the measurements with CL.QUORUM, CL.ONE, and CL.ALL against trunk and against my branch,
with and without rpc_max_threads increased to 1024.
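For reference, the rpc_max_threads setting mentioned above lives in cassandra.yaml. A sketch of the relevant fragment, assuming 2.1-era settings (defaults and comments vary by version):

```yaml
# cassandra.yaml fragment (sketch; exact defaults vary by version).
# rpc_min_threads / rpc_max_threads bound the thread pool serving client RPC
# connections when rpc_server_type is "sync" (one thread per connection).
rpc_server_type: sync
rpc_min_threads: 16
rpc_max_threads: 1024
```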

This was prompted by measurements on a 15-node cluster where CL.ONE was 10x faster than CL.ALL.
I measured the full matrix on a 9-node cluster and CL.ONE was 5x faster than CL.ALL, which
with RF=5 is the expected performance delta (CL.ALL waits on all five replicas, CL.ONE on
just one). I definitely see underutilization: with CL.ONE the nodes run right at 1600% CPU,
and with CL.ALL they don't make it up that high, although trunk does better in that respect.

The underutilization is worse with the modified code that uses SEPExecutor. I may have
to run with 15 nodes again to see whether the jump from 9 to 15 nodes is what causes CL.ALL
to perform worse, or whether the difference is that I was using a placement group and 14.04
in the 9-node cluster.

The change to use SEPExecutor for writes was slightly to a lot slower in the QUORUM and
ALL cases at 9 nodes. I think that is a dead end, but I do wonder whether that is because
SEPExecutor might not have the same cache-friendly behavior that running dedicated threads
does. Dedicated threads require signaling and context switching, but thread scheduling
policies could result in the threads servicing each socket always running on the same core.
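The contrast between the two threading models can be sketched with plain java.util.concurrent executors (a hypothetical illustration, not Cassandra's actual SEPExecutor): a dedicated single-threaded executor per socket keeps every task for that socket on one thread, while a shared pool spreads the same tasks across its workers.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ThreadAffinityDemo {
    // Run 1000 small tasks on the given executor and return how many
    // distinct threads ended up servicing them.
    static int distinctThreads(ExecutorService pool) throws InterruptedException {
        Set<String> names = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < 1000; i++) {
            pool.execute(() -> names.add(Thread.currentThread().getName()));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return names.size();
    }

    public static void main(String[] args) throws InterruptedException {
        // Dedicated model: one thread per socket, so every task for that
        // socket runs on the same thread and keeps its working set warm
        // in one core's cache.
        int dedicated = distinctThreads(Executors.newSingleThreadExecutor());

        // Shared-pool model (roughly what an SEPExecutor-style pool looks
        // like): tasks for one socket can land on any of the pool's threads.
        int shared = distinctThreads(Executors.newFixedThreadPool(4));

        System.out.println("dedicated: " + dedicated
                + " thread(s), shared: " + shared + " thread(s)");
    }
}
```

The dedicated executor always reports exactly one thread; the shared pool typically reports several, which is the scheduling difference that could cost cache locality.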

I am going to try again with Netty. I should at least be able to match the performance of
trunk with a non-blocking approach, so I think it is still worth digging.
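For anyone unfamiliar with the non-blocking approach under discussion, here is a minimal sketch using the raw java.nio primitives that Netty builds on (an illustration only, unrelated to the actual patch): one selector thread multiplexes every connection instead of dedicating two threads per peer.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

public class NioEchoSketch {
    // Echo a short message through a single-threaded, selector-driven server
    // and return what the client reads back.
    public static String roundTrip(String msg) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        // Plain blocking client on the side, just to drive the loop.
        SocketChannel client = SocketChannel.open(
                new InetSocketAddress("127.0.0.1", server.socket().getLocalPort()));
        client.write(ByteBuffer.wrap(msg.getBytes(StandardCharsets.UTF_8)));

        ByteBuffer echoed = ByteBuffer.allocate(msg.getBytes(StandardCharsets.UTF_8).length);
        boolean done = false;
        while (!done) {
            selector.select();
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    // New peer: register it with the same selector; no new thread.
                    SocketChannel peer = server.accept();
                    peer.configureBlocking(false);
                    peer.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    // Data ready: read it and echo it back without blocking.
                    SocketChannel peer = (SocketChannel) key.channel();
                    ByteBuffer buf = ByteBuffer.allocate(64);
                    peer.read(buf);
                    buf.flip();
                    peer.write(buf);
                    while (echoed.hasRemaining()) {
                        client.read(echoed); // blocking client read of the echo
                    }
                    done = true;
                }
            }
        }
        selector.close();
        server.close();
        client.close();
        echoed.flip();
        return StandardCharsets.UTF_8.decode(echoed).toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("ping"));
    }
}
```

Netty wraps this same selector machinery in event-loop groups and channel pipelines, which is why it is the natural candidate here.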

> nio MessagingService
> --------------------
>                 Key: CASSANDRA-8457
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Ariel Weisberg
>              Labels: performance
>             Fix For: 3.0
> Thread-per-peer (actually two per peer: one incoming, one outbound) is a big contributor to
> context switching, especially for larger clusters. Let's look at switching to nio, possibly via Netty.

This message was sent by Atlassian JIRA
