cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-6146) CQL-native stress
Date Fri, 04 Oct 2013 15:11:42 GMT


Jonathan Ellis commented on CASSANDRA-6146:

What I'd like to see is a drastic reduction in the number of flags we support, in favor of
allowing the user to pre-create a table for stress-ng (stress-cql?) to take its cues from.

So here's what our new Config might look like:

        availableOptions.addOption("h", "help", false, "Show this help message and exit");
        // NB only SELECT makes sense for compound PK unless we add some kind of scan-for-PK
        availableOptions.addOption("cql", "cql", true, "CQL to execute for each operation.  Use ? for partition key bind placeholder");
        availableOptions.addOption("d", "distribution", true, "Partition key distribution: uniform or gaussian.  Default: uniform");
        availableOptions.addOption("ks", "keyspace", true, "Keyspace. Default: stress");
        availableOptions.addOption("n", "nodes", true, "Nodes to connect to (CDL). Default:");
        availableOptions.addOption("p", "partitions", true, "Number of distinct partitions to use.  Default: 1,000,000");
        availableOptions.addOption("pop", "populate", false, "Populate mode. Enable to generate random inserts for the given table");
        availableOptions.addOption("r", "requests", true, "Number of requests to execute.  Default: 1,000,000");
        availableOptions.addOption("std", "stdev", true, "Standard deviation from mean, for gaussian distribution only. Default: 0.1");
        availableOptions.addOption("t", "table", true, "Table. Default: data");
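
To make the distribution, partitions and stdev options a bit more concrete, here is a rough standalone sketch of how a partition key could be picked. The class name {{KeyPicker}} is hypothetical, and two things are assumptions the option list does not spell out: the gaussian is centered on the middle of the key range, and stdev is read as a fraction of the partition count.

```java
import java.util.Random;

class KeyPicker
{
    private final Random random;
    private final int partitions;

    KeyPicker(int partitions, long seed)
    {
        this.partitions = partitions;
        this.random = new Random(seed);
    }

    /** Uniform: every partition key in [0, partitions) is equally likely. */
    int uniform()
    {
        return random.nextInt(partitions);
    }

    /**
     * Gaussian: keys cluster around the middle of the range.
     * Assumption: stdev is a fraction of the partition count (0.1 means 10%).
     */
    int gaussian(double stdev)
    {
        double mean = partitions / 2.0;
        double sigma = stdev * partitions;
        while (true)
        {
            // Rejection sampling: redraw until the key falls inside the range.
            long k = Math.round(mean + random.nextGaussian() * sigma);
            if (k >= 0 && k < partitions)
                return (int) k;
        }
    }
}
```

With the default stdev of 0.1 almost no draws are rejected, since the edges of the range sit five standard deviations from the mean under these assumptions.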

So, you'd have command lines like this:

# {{stress -cql "SELECT * FROM data WHERE key = ?"}}
# {{stress -cql "SELECT username, password FROM users WHERE user_id = ?"}}
# {{stress -cql "SELECT collected_at, value FROM timeseries WHERE sensor_id = ? LIMIT 100"}}
# {{stress -cql "SELECT * FROM timeseries WHERE sensor_id = ? AND collected_at = ?"}}
# {{stress --populate}}
# {{stress --populate --table timeseries}}
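
As an illustration of how those command lines would map onto the defaults in the option descriptions, here is a small hand-rolled parser. It is a sketch only: it does not use the option-parsing calls shown above, {{StressConfig}} is a hypothetical name, the help and nodes options are left out (no default is given for nodes), and a flag with a missing value is not handled gracefully.

```java
class StressConfig
{
    // Defaults copied from the option descriptions.
    String cql = null;
    String distribution = "uniform";
    String keyspace = "stress";
    String table = "data";
    int partitions = 1_000_000;
    int requests = 1_000_000;
    double stdev = 0.1;
    boolean populate = false;

    /** Accepts both the short (-t) and long (--table) spelling of each option. */
    static StressConfig parse(String... args)
    {
        StressConfig c = new StressConfig();
        for (int i = 0; i < args.length; i++)
        {
            // Strip one or two leading dashes to get the option name.
            String name = args[i].replaceFirst("^--?", "");
            switch (name)
            {
                case "pop": case "populate": c.populate = true; break;
                case "cql": c.cql = args[++i]; break;
                case "d": case "distribution": c.distribution = args[++i]; break;
                case "ks": case "keyspace": c.keyspace = args[++i]; break;
                case "t": case "table": c.table = args[++i]; break;
                case "p": case "partitions": c.partitions = Integer.parseInt(args[++i]); break;
                case "r": case "requests": c.requests = Integer.parseInt(args[++i]); break;
                case "std": case "stdev": c.stdev = Double.parseDouble(args[++i]); break;
                default: throw new IllegalArgumentException("Unknown option: " + args[i]);
            }
        }
        return c;
    }
}
```

For example, {{StressConfig.parse("--populate", "--table", "timeseries")}} turns populate mode on, targets {{timeseries}}, and leaves everything else at its default.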

There's some asymmetry between inserts and reads; I'm not sure it makes sense to customize
INSERT all that much, and I want people to be able to get a quick smoke test up with a minimum
of ceremony, i.e., creating a default {{data}} table for them rather than requiring explicit
{{CREATE TABLE}} first.  But, if you want to create a custom table, we should be able to introspect
it and populate it for you.
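
The introspection step boils down to turning the table's column names into a prepared INSERT with one bind marker per column. A minimal standalone sketch, assuming the names have already been read from the table metadata and need no quoting ({{InsertBuilder}} and {{insertFor}} are hypothetical names):

```java
import java.util.List;

class InsertBuilder
{
    /**
     * Builds "INSERT INTO <table> (c1,c2,...) VALUES (?,?,...)".
     * Assumption: column names are plain identifiers that need no quoting.
     */
    static String insertFor(String table, List<String> columnNames)
    {
        StringBuilder names = new StringBuilder();
        StringBuilder markers = new StringBuilder();
        for (int i = 0; i < columnNames.size(); i++)
        {
            if (i > 0)
            {
                names.append(",");
                markers.append(",");
            }
            names.append(columnNames.get(i));
            markers.append("?");
        }
        return "INSERT INTO " + table + " (" + names + ") VALUES (" + markers + ")";
    }
}
```

Listing the columns explicitly matters because CQL requires the column list in an INSERT; the bind markers are then filled in per request.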

The populate code might look something like this:

    private static void populate(Config config, Session session)
    {
        KeyspaceMetadata ks = session.getCluster().getMetadata().getKeyspace(config.keyspace);
        TableMetadata table = ks.getTable(config.table);
        if (table == null)
        {
            System.out.println("NOTICE: Creating table with 6 int columns.  Create manually if you prefer otherwise.");
            session.execute("CREATE TABLE " + config.table + " (key int PRIMARY KEY, i1 int, i2 int, i3 int, i4 int, i5 int)");
            // re-read the metadata so table is no longer null; assumes the driver has refreshed its schema view
            table = session.getCluster().getMetadata().getKeyspace(config.keyspace).getTable(config.table);
        }
        List<ColumnMetadata> pkColumns = table.getPrimaryKey();
        List<ColumnMetadata> columns = table.getColumns();

        // INSERT INTO <table> (c1,c2,...) VALUES (?,?,...)
        String names = "";
        String markers = "";
        for (int i = 0; i < columns.size(); i++)
        {
            ColumnMetadata c = columns.get(i);
            if (i > 0)
            {
                names += ",";
                markers += ",";
            }
            names += c.getName();
            markers += "?";
        }
        String cql = "INSERT INTO " + config.table + " (" + names + ") VALUES (" + markers + ")";
        PreparedStatement statement = session.prepare(cql);

        for (int n = 0; n < config.requests; n++)
        {
            BoundStatement bs = new BoundStatement(statement);

            // partition key gets treated by distribution
            if (config.distribution == Config.Distribution.UNIFORM)
            {
                if (config.partitions == config.requests)
                    bs.setInt(0, n);
                else
                    bs.setInt(0, random.nextInt(config.partitions));
            }
            else
            {
                int k;
                while (true)
                {
                    // loop until we get a result within the necessary bounds
                    k = (int) (config.mean + (random.nextGaussian() * config.sigma));
                    if (k >= 0 && k < config.partitions)
                        break;
                }
                bs.setInt(0, k);
            }

            // non-partition key columns get random data
            for (int i = 1; i < columns.size(); i++)
            {
                ColumnMetadata c = columns.get(i);
                if (c.getType().equals(DataType.cint()))
                    bs.setInt(i, random.nextInt());
                else
                    throw new UnsupportedOperationException("Flesh this out with support for more types");
            }

            executeLimitedAsync(session, bs);
        }
    }

    private static void executeLimitedAsync(Session session, BoundStatement statement)
    {
        while (executing.size() == MAX_EXECUTING)
        {
            for (Iterator<ResultSetFuture> iter = executing.iterator(); iter.hasNext(); )
            {
                ResultSetFuture future = iter.next();
                if (future.isDone())
                    iter.remove(); // presumably: a finished request frees its slot
            }
            Uninterruptibles.sleepUninterruptibly(1, TimeUnit.MILLISECONDS);
        }
        // presumably: submit the statement once a slot is free
        executing.add(session.executeAsync(statement));
    }
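
For comparison, the same cap on in-flight requests can be had without polling. The sketch below uses a different technique from the loop above, a counting semaphore, and is written against the JDK's {{CompletableFuture}} so it runs standalone; with the driver's own futures one would attach a completion listener instead. {{LimitedAsync}} and its methods are hypothetical names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

class LimitedAsync
{
    private final Semaphore permits;

    LimitedAsync(int maxExecuting)
    {
        this.permits = new Semaphore(maxExecuting);
    }

    /**
     * Blocks until fewer than maxExecuting requests are in flight, then starts
     * the request and frees its slot when it completes, successfully or not.
     */
    <T> CompletableFuture<T> execute(Supplier<CompletableFuture<T>> request)
    {
        permits.acquireUninterruptibly();
        CompletableFuture<T> future;
        try
        {
            future = request.get();
        }
        catch (RuntimeException e)
        {
            // The request never started, so give the slot back.
            permits.release();
            throw e;
        }
        future.whenComplete((result, error) -> permits.release());
        return future;
    }

    /** Number of requests that could be started right now without blocking. */
    int available()
    {
        return permits.availablePermits();
    }
}
```

The trade-off is that the semaphore version never sleeps or rescans a list, at the cost of depending on a completion callback being available on the future type.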


> CQL-native stress
> -----------------
>                 Key: CASSANDRA-6146
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Tools
>            Reporter: Jonathan Ellis
> The existing CQL "support" in stress is not worth discussing.  We need to start over,
> and we might as well kill two birds with one stone and move to the native protocol while we're
> at it.
