cassandra-commits mailing list archives

From "Tyler Hobbs (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-9302) Optimize cqlsh COPY FROM, part 3
Date Tue, 05 May 2015 20:58:00 GMT


Tyler Hobbs commented on CASSANDRA-9302:

We don't necessarily get token-aware routing (TAR) with the python driver, because its
implementation of murmur3 is in a C extension, and we don't compile that with the bundled
driver.  Compiling it is not really a feasible option, so we would need to port the hash to
pure Python.

For prepared statements, the main chunk of work is implementing {{from_string()}} for every
type so that we can properly serialize the data.  Batching by partition key can be done with
or without prepared statements, so we may want to experiment with that first.
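
As a rough illustration of the batching idea, here is a minimal sketch in Python 3 that groups parsed CSV rows by their partition key columns, so that each group could later be sent as one batch. It is not taken from cqlsh. The function name {{group_rows_by_partition_key}}, the {{key_indexes}} and {{max_batch_size}} parameters, and the sample data are all hypothetical, and nothing here talks to the driver.

```python
import csv
import io
from collections import defaultdict


def group_rows_by_partition_key(rows, key_indexes, max_batch_size=20):
    """Yield (partition_key, rows) pairs, one pair per batch.

    Hypothetical helper: `key_indexes` lists the CSV column positions that
    make up the partition key, and `max_batch_size` caps the rows per batch.
    """
    pending = defaultdict(list)
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        pending[key].append(row)
        # Flush a partition as soon as it reaches the size cap.
        if len(pending[key]) >= max_batch_size:
            yield key, pending.pop(key)
    # Flush whatever is left once the input is exhausted.
    for key, batch in pending.items():
        yield key, batch


# Made-up sample input; column 0 is assumed to be the partition key.
sample = io.StringIO("a,1,x\nb,1,y\na,2,z\n")
batches = list(group_rows_by_partition_key(csv.reader(sample), key_indexes=[0]))
```

Each yielded group touches a single partition, so it could be written either as a batch of plain INSERT strings or as a batch of bound prepared statements, which is why this step can be tried before any {{from_string()}} work is done.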

> Optimize cqlsh COPY FROM, part 3
> --------------------------------
>                 Key: CASSANDRA-9302
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Jonathan Ellis
>             Fix For: 2.1.x
> We've had some discussion moving to Spark CSV import for bulk load in 3.x, but people
> need a good bulk load tool now.  One option is to add a separate Java bulk load tool
> (CASSANDRA-9048), but if we can match that performance from cqlsh I would prefer to leave
> COPY FROM as the preferred option to which we point people, rather than adding more tools
> that need to be supported indefinitely.
> Previous work on COPY FROM optimization was done in CASSANDRA-7405 and CASSANDRA-8225.
