cassandra-commits mailing list archives

From "Branimir Lambov (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-7980) cassandra-stress should support partial clustering column generation
Date Wed, 02 Dec 2015 09:40:11 GMT


Branimir Lambov updated CASSANDRA-7980:
    Component/s: Testing

> cassandra-stress should support partial clustering column generation
> --------------------------------------------------------------------
>                 Key: CASSANDRA-7980
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Testing
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>            Priority: Minor
> cassandra-stress generates its data randomly, in tiers, so that we can scroll through
the partitions it generates without having to generate them in their entirety. The problem is
that to support very large partitions (important for benchmarking certain cases, and for
acceptance testing) we need a large number of clustering columns, generally more than we
would otherwise have, which changes the performance characteristics. We should effectively
split each clustering column into a number of byte-ranges that become tiers for visitation.
The only real complexity here is in obeying the specified size/count distribution range,
which would be difficult for exponential distributions. However, we could require the user to
specify the ranges, and the distributions for each range, upfront. We could even treat them
exactly like other column specifications, but as sub-specs within a given column in the yaml.
Or we could simply accept that we follow the distribution imperfectly in these situations.
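
To make the byte-range idea concrete, here is a minimal sketch in Python. It is not the
cassandra-stress implementation. It assumes a fixed number of values per tier (the "user
specifies the ranges upfront" option) and derives each tier's bytes from a hash of the seed and
the tier path. The names tier_bytes, clustering_value and row_path are hypothetical, and
widths are limited to 32 bytes because a SHA-256 digest is used.

```python
import hashlib
from math import prod
from typing import List, Sequence


def tier_bytes(seed: int, prefix: Sequence[int], index: int, width: int) -> bytes:
    """Derive `width` bytes for one tier from the seed and the indices above it."""
    if not 0 < width <= 32:
        raise ValueError("width must be between 1 and 32 bytes")
    key = ",".join(str(x) for x in (seed, *prefix, index)).encode()
    return hashlib.sha256(key).digest()[:width]


def row_path(row: int, counts: Sequence[int]) -> List[int]:
    """Convert a flat row number into one index per tier (mixed radix)."""
    if not 0 <= row < prod(counts):
        raise IndexError("row is outside the partition")
    path = []
    for count in reversed(counts):
        row, index = divmod(row, count)
        path.append(index)
    return path[::-1]


def clustering_value(seed: int, path: Sequence[int], widths: Sequence[int]) -> bytes:
    """Build a single clustering column value by concatenating its byte-range tiers."""
    if len(path) != len(widths):
        raise ValueError("path and widths must have the same length")
    return b"".join(
        tier_bytes(seed, path[:i], path[i], widths[i]) for i in range(len(path))
    )


# One clustering column split into three byte-range tiers (assumed sizes).
COUNTS = [4, 8, 16]   # values per tier, so 4 * 8 * 16 = 512 rows per partition
WIDTHS = [2, 2, 4]    # bytes contributed by each tier

if __name__ == "__main__":
    for row in (0, 1, 17, 511):
        path = row_path(row, COUNTS)
        print(row, path, clustering_value(42, path, WIDTHS).hex())
```

Because each tier depends only on the seed and the indices above it, any row can be produced
directly from its row number, and rows that share an upper tier share the same leading bytes.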

