cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9304) COPY TO improvements
Date Mon, 26 Oct 2015 07:37:27 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973861#comment-14973861 ]

Stefania commented on CASSANDRA-9304:
-------------------------------------

Here are the results for 2M stress-generated records, with a single node running locally on
my box (i7-4600U CPU @ 2.10GHz quad-core, 7652MB, SSD):

||Implementation||Time for 2M records||
|cassandra-unloader|12 seconds|
|9304-2.1 branch|35 seconds|
|dkua/9304|2 minutes, 53 seconds|
|cassandra-2.2 branch|6 minutes, 28 seconds|
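
To make the timings above easier to compare, here is a small sketch that converts them to seconds and derives throughput and relative speedup. The 2M row count and the four timings come from this comment; the helper names (`rows_per_second`, `speedup`) are only illustrative and not part of cqlsh.

```python
TOTAL_ROWS = 2_000_000  # 2M records, as stated above

# Wall-clock times from the results above, converted to seconds.
timings_s = {
    "cassandra-unloader": 12,
    "9304-2.1 branch": 35,
    "dkua/9304": 2 * 60 + 53,
    "cassandra-2.2 branch": 6 * 60 + 28,
}


def rows_per_second(seconds, total_rows=TOTAL_ROWS):
    """Average export throughput for a run that took `seconds`."""
    return total_rows / seconds


def speedup(baseline, candidate):
    """How many times faster `candidate` is than `baseline`."""
    return timings_s[baseline] / timings_s[candidate]


if __name__ == "__main__":
    for name, seconds in timings_s.items():
        print("%-22s %4d s  %8.0f rows/s" % (name, seconds, rows_per_second(seconds)))
    print("9304-2.1 vs 2.2: %.1fx" % speedup("cassandra-2.2 branch", "9304-2.1 branch"))
```

These are single-run, single-node numbers, so the ratios are only a rough guide.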

Notes: 

* COPY TO is currently broken on the cassandra-2.1 branch, which is why I used the 2.2 branch
* the results for the latest 9304-2.1 vary with the number of threads and the page size selected;
I'm not sure whether we can still do better (without the fix for the byte array formatting bug
discussed above, we were at 3 minutes 45 seconds, worse than the original implementation, which
had a simpler but perhaps more effective job scheduling policy)


> COPY TO improvements
> --------------------
>
>                 Key: CASSANDRA-9304
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9304
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Stefania
>            Priority: Minor
>              Labels: cqlsh
>             Fix For: 3.x, 2.1.x, 2.2.x
>
>
> COPY FROM has gotten a lot of love.  COPY TO not so much.  One obvious improvement could
> be to parallelize reading and writing (write one page of data while fetching the next).
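
The idea in the issue description (write one page of data while fetching the next) can be sketched as a small producer/consumer pipeline. This is only an illustration under stated assumptions, not the code on any of the branches above: `fetch_pages` is a hypothetical stand-in for a paged query, and `export_csv` and `max_pending_pages` are made-up names.

```python
import csv
import io
import queue
import threading

_DONE = object()  # sentinel marking the end of the page stream


def fetch_pages(total_rows, page_size):
    """Hypothetical stand-in for a paged query: yields lists of rows."""
    for start in range(0, total_rows, page_size):
        stop = min(start + page_size, total_rows)
        yield [(i, "value-%d" % i) for i in range(start, stop)]


def export_csv(pages, out, max_pending_pages=2):
    """Write pages to `out` while a reader thread fetches the next ones."""
    pending = queue.Queue(maxsize=max_pending_pages)

    def reader():
        try:
            for page in pages:
                pending.put(page)  # blocks when the writer falls behind
        finally:
            pending.put(_DONE)  # always signal the writer to stop

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    writer = csv.writer(out)
    rows_written = 0
    while True:
        page = pending.get()
        if page is _DONE:
            break
        writer.writerows(page)
        rows_written += len(page)
    thread.join()
    return rows_written


if __name__ == "__main__":
    buf = io.StringIO()
    count = export_csv(fetch_pages(total_rows=10, page_size=4), buf)
    print(count, "rows written")
```

The bounded queue keeps memory use flat: the reader can run at most `max_pending_pages` pages ahead of the writer. A real implementation would also need to surface read errors to the writer instead of ending the export early, as this sketch does.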



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
