cassandra-commits mailing list archives

From "Robert Stupp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9619) Read performance regression in tables with many columns on trunk and 2.2 vs. 2.1
Date Sat, 27 Jun 2015 09:27:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604072#comment-14604072
] 

Robert Stupp commented on CASSANDRA-9619:
-----------------------------------------

The regression for this workload is caused by {{sstable_preemptive_open_interval_in_mb}} being
ignored (effectively hard-coded to {{-1}}) in 2.1.3 and 2.1.4. The setting is honored again in
the releases before and after those two.

cstar runs:
* [last "bisect" run|http://cstar.datastax.com/tests/id/8ed4f4c0-1c48-11e5-b36d-42010af0688f]
that [identifies|http://cstar.datastax.com/graph?stats=8ed4f4c0-1c48-11e5-b36d-42010af0688f&metric=op_rate&operation=3_read&smoothing=1&show_aggregates=true&xmin=0&xmax=50.6&ymin=0&ymax=196958.3]
this [commit|https://github.com/apache/cassandra/commit/cf3e748cbf1faaed68870f22a45edc603eb1b4e8].
* [cross check|http://cstar.datastax.com/tests/id/1eee9132-1c4f-11e5-bcd7-42010af0688f] with
[latest 2.1 and 2.1 with that commit reversed|http://cstar.datastax.com/graph?stats=1eee9132-1c4f-11e5-bcd7-42010af0688f&metric=op_rate&operation=3_read&smoothing=1&show_aggregates=true&xmin=0&xmax=50.82&ymin=0&ymax=206758.2]
* [cross check|http://cstar.datastax.com/tests/id/53f35062-1c53-11e5-bcd7-42010af0688f] with
[latest 2.2 and 2.2 with that commit reversed|http://cstar.datastax.com/graph?stats=53f35062-1c53-11e5-bcd7-42010af0688f&metric=op_rate&operation=3_read&smoothing=1&show_aggregates=true&xmin=0&xmax=50.27&ymin=0&ymax=195163.1]

That's the good news. More good news: a simple {{cassandra.yaml}} change can "solve"
this regression.
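
For reference, the change would look roughly like this (a sketch only - it simply disables preemptive opening entirely, which is what 2.1.3/2.1.4 effectively did; the shipped default is {{50}}):

{code}
# cassandra.yaml - disable preemptive opening of compacted sstables,
# i.e. the behavior that was effectively hard-coded in 2.1.3/2.1.4
sstable_preemptive_open_interval_in_mb: -1
{code}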

The bad news IMO is that {{sstable_preemptive_open_interval_in_mb}} is there for a reason and AFAIK
should give some improvement for "matching" workloads. Frankly I don't really know what
to do next - whether to let it default to {{-1}}, stick with the current default of {{50}}, or change
it to something else. IMO some extensive perf testing should be done (again??) to give better
advice for this parameter.
I think this is also the reason why blade_11 and bdplab gave different results - one has
SSDs and one has spindles - just a guess. For reference, I've started the [2.1-cross-check
on bdplab|http://cstar.datastax.com/tests/id/32532262-1cac-11e5-8031-42010af0688f].

More bad news: there seems to be another, smaller regression of approx. 1.5-4% for both
reads and writes when comparing 2.1.4 against the current 2.1/2.2 branches with
{{sstable_preemptive_open_interval_in_mb=-1}}. This one is much harder to track down - but it
is likely caused by "pure" code change(s).

Finally I have to admit that we should have at least a daily cstar performance test with some
"standard" workloads (90% writes, 90% reads, 50/50) against the current dev branches (2.1,
2.2, trunk), linked in cassci (since that's where we usually look). These tests don't need
to run for a long time - 2M or 3M keys should be enough to catch obvious regressions. For more
"detailed" results we already have extensive tests in place. Besides that, we should run
perf tests before commit for everything that is likely to affect performance.
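
Something along these lines per branch would probably do (a sketch only, borrowing the stress syntax from the jobs below; key counts, thread counts, and column spec are just placeholders):

{code}
# populate, then run the three "standard" mixes with ~2M keys each
stress write n=2000000 -rate threads=300 -col n=FIXED\(50\)
stress mixed ratio\(write=9,read=1\) n=2000000 -rate threads=300
stress mixed ratio\(write=1,read=9\) n=2000000 -rate threads=300
stress mixed ratio\(write=1,read=1\) n=2000000 -rate threads=300
{code}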

> Read performance regression in tables with many columns on trunk and 2.2 vs. 2.1
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-9619
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9619
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jim Witschey
>              Labels: perfomance
>             Fix For: 2.2.0 rc2
>
>
> There seems to be a read performance regression in 2.2 and trunk, as compared to 2.1 and 2.0.
I found it running cstar_perf jobs with 50-column tables. 2.2 may be worse than trunk, though
my results on that aren't consistent. The relevant cstar_perf jobs are here:
> http://cstar.datastax.com/tests/id/273e2ea8-0fc8-11e5-816c-42010af0688f
> http://cstar.datastax.com/tests/id/3a8002d6-1480-11e5-97ff-42010af0688f
> http://cstar.datastax.com/tests/id/40ff2766-1248-11e5-bac8-42010af0688f
> The sequence of commands for these jobs is
> {code}
> stress write n=65000000 -rate threads=300 -col n=FIXED\(50\)
> stress read n=65000000 -rate threads=300
> stress read n=65000000 -rate threads=300
> {code}
> Have a look at the operations per second going from [the first read operation|http://cstar.datastax.com/graph?stats=273e2ea8-0fc8-11e5-816c-42010af0688f&metric=op_rate&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=729.08&ymin=0&ymax=174379.7]
to [the second read operation|http://cstar.datastax.com/graph?stats=273e2ea8-0fc8-11e5-816c-42010af0688f&metric=op_rate&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=729.08&ymin=0&ymax=174379.7].
They've fallen from ~135K to ~100K comparing trunk to 2.1 and 2.0. It's slightly worse for
2.2, and 2.2 operations per second fall continuously from the first to the second read operation.
> There's a corresponding increase in read latency -- it's noticeable on trunk and pretty
bad on 2.2. Again, the latency gets higher and higher on 2.2 as the read operations progress
(see the graphs [here|http://cstar.datastax.com/graph?stats=273e2ea8-0fc8-11e5-816c-42010af0688f&metric=95th_latency&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=729.08&ymin=0&ymax=17.27]
and [here|http://cstar.datastax.com/graph?stats=273e2ea8-0fc8-11e5-816c-42010af0688f&metric=95th_latency&operation=3_read&smoothing=1&show_aggregates=true&xmin=0&xmax=928.62&ymin=0&ymax=14.52]).
> I see a similar regression in a [more recent test|http://cstar.datastax.com/graph?stats=40ff2766-1248-11e5-bac8-42010af0688f&metric=op_rate&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=752.62&ymin=0&ymax=171799.1],
though in this one trunk performed worse than 2.2. This run also didn't display the increasing
latency in 2.2.
> This regression may show for smaller numbers of columns, but not as prominently, as shown
[in the results of this test with the stress default of 5 columns|http://cstar.datastax.com/graph?stats=227cb89e-0fc8-11e5-9f14-42010af0688f&metric=99.9th_latency&operation=3_read&smoothing=1&show_aggregates=true&xmin=0&xmax=498.19&ymin=0&ymax=334.29].
There's an increase in latency variability on trunk and 2.2, but I don't see a regression
in summary statistics.
> My measurements aren't confounded by [the recent regression in cassandra-stress|https://issues.apache.org/jira/browse/CASSANDRA-9558];
cstar_perf uses the same stress program (from trunk) on all versions on the cluster.
> I'm currently working to
> - reproduce with a smaller workload so this is easier to bisect and debug.
> - get results with larger numbers of columns, since we've seen the regression on 50 columns
but not the stress default of 5.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
