cassandra-commits mailing list archives

From "T Jake Luciani (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-9619) Read performance regression in tables with many columns on trunk and 2.2 vs. 2.1
Date Fri, 26 Jun 2015 13:58:05 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

T Jake Luciani updated CASSANDRA-9619:
--------------------------------------
    Assignee:     (was: T Jake Luciani)

> Read performance regression in tables with many columns on trunk and 2.2 vs. 2.1
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-9619
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9619
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jim Witschey
>              Labels: perfomance
>             Fix For: 2.2.0 rc2
>
>
> There seems to be a read performance regression in 2.2 and trunk, as compared to 2.1 and 2.0.
I found it running cstar_perf jobs with 50-column tables. 2.2 may be worse than trunk, though
my results on that aren't consistent. The relevant cstar_perf jobs are here:
> http://cstar.datastax.com/tests/id/273e2ea8-0fc8-11e5-816c-42010af0688f
> http://cstar.datastax.com/tests/id/3a8002d6-1480-11e5-97ff-42010af0688f
> http://cstar.datastax.com/tests/id/40ff2766-1248-11e5-bac8-42010af0688f
> The sequence of commands for these jobs is
> {code}
> stress write n=65000000 -rate threads=300 -col n=FIXED\(50\)
> stress read n=65000000 -rate threads=300
> stress read n=65000000 -rate threads=300
> {code}
> Have a look at the operations per second going from [the first read operation|http://cstar.datastax.com/graph?stats=273e2ea8-0fc8-11e5-816c-42010af0688f&metric=op_rate&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=729.08&ymin=0&ymax=174379.7]
to [the second read operation|http://cstar.datastax.com/graph?stats=273e2ea8-0fc8-11e5-816c-42010af0688f&metric=op_rate&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=729.08&ymin=0&ymax=174379.7].
On trunk they've fallen to ~100K, from ~135K on 2.1 and 2.0. It's slightly worse for
2.2, and 2.2 operations per second fall continuously from the first to the second read operation.
> There's a corresponding increase in read latency -- it's noticeable on trunk and pretty
bad on 2.2. Again, the latency gets higher and higher on 2.2 as the read operations progress
(see the graphs [here|http://cstar.datastax.com/graph?stats=273e2ea8-0fc8-11e5-816c-42010af0688f&metric=95th_latency&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=729.08&ymin=0&ymax=17.27]
and [here|http://cstar.datastax.com/graph?stats=273e2ea8-0fc8-11e5-816c-42010af0688f&metric=95th_latency&operation=3_read&smoothing=1&show_aggregates=true&xmin=0&xmax=928.62&ymin=0&ymax=14.52]).
> I see a similar regression in a [more recent test|http://cstar.datastax.com/graph?stats=40ff2766-1248-11e5-bac8-42010af0688f&metric=op_rate&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=752.62&ymin=0&ymax=171799.1],
though in this one trunk performed worse than 2.2. This run also didn't display the increasing
latency in 2.2.
> This regression may show for smaller numbers of columns, but not as prominently, as shown
[in the results to this test with the stress default of 5 columns|http://cstar.datastax.com/graph?stats=227cb89e-0fc8-11e5-9f14-42010af0688f&metric=99.9th_latency&operation=3_read&smoothing=1&show_aggregates=true&xmin=0&xmax=498.19&ymin=0&ymax=334.29].
There's an increase in latency variability on trunk and 2.2, but I don't see a regression
in summary statistics.
> My measurements aren't confounded by [the recent regression in cassandra-stress|https://issues.apache.org/jira/browse/CASSANDRA-9558];
cstar_perf uses the same stress program (from trunk) on all versions on the cluster.
> I'm currently working to
> - reproduce with a smaller workload so this is easier to bisect and debug.
> - get results with larger numbers of columns, since we've seen the regression on 50 columns
but not the stress default of 5.
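
For the two follow-up items above, the write/read/read sequence from the description could be parameterized by row count and column count. This is only a rough sketch: `build_stress_cmds` is a hypothetical helper, not part of cstar_perf or cassandra-stress, and the 1000000-row workload is an illustrative assumption rather than a value from the jobs. It only prints the commands, so nothing runs against a cluster unless the output is piped to a shell where `stress` is available.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of cstar_perf or cassandra-stress):
# prints the write/read/read sequence from the report for a given row count,
# column count, and thread count. It only echoes the commands.
build_stress_cmds() {
    rows="$1"            # number of operations; the original jobs used 65000000
    cols="$2"            # columns per row; the original jobs used 50
    threads="${3:-300}"  # client threads; defaults to 300 as in the report
    printf 'stress write n=%s -rate threads=%s -col n=FIXED\\(%s\\)\n' "$rows" "$threads" "$cols"
    printf 'stress read n=%s -rate threads=%s\n' "$rows" "$threads"
    printf 'stress read n=%s -rate threads=%s\n' "$rows" "$threads"
}

# Illustrative only: a smaller workload at 50 columns, then the same at 100 columns.
build_stress_cmds 1000000 50
build_stress_cmds 1000000 100
```

Keeping the sequence in one place makes it easier to vary only the column count between runs, so a change in read throughput can be attributed to that one parameter.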



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
