cassandra-commits mailing list archives

From "Stefania (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-11521) Implement streaming for bulk read requests
Date Sat, 30 Jul 2016 02:48:20 GMT


Stefania commented on CASSANDRA-11521:

The patch is ready for review:


There are also the [driver patch|] and
the [spark connector patch|].
I plan to open tickets for these in the respective projects once the native protocol changes
have been finalized.

A [design document|]
is also available.

The Spark benchmark results are available in [this comment|]
on the parent ticket. The final patch is slightly better than the proof-of-concept, and the
asynchronous paging mechanism significantly outperforms the existing mechanism for large data

I've also repeated some cstar_perf tests to rule out performance regressions with ordinary
queries, which are not in the optimized path:

* Single-partition queries (default cassandra-stress read command) at CL.LOCAL_ONE (the cassandra-stress
default): [first run|],
[second run with the revision order swapped|],
[an old run|]
done before enabling token-aware routing in cassandra-stress.

* Single-partition queries at CL.ALL: [single run|]

There is a gap of 3.6K ops/second without token-aware routing and of 1K ops/second at CL=ALL.
With token-aware routing the patch is instead 1K ops/second faster. These differences must arise
from the refactoring in the select statement. They are very small, and the test error seems
to be around 0.5K ops/second, but I can look into them further if there are concerns.

> Implement streaming for bulk read requests
> ------------------------------------------
>                 Key: CASSANDRA-11521
>                 URL:
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Local Write-Read Paths
>            Reporter: Stefania
>            Assignee: Stefania
>              Labels: client-impacting, protocolv5
>             Fix For: 3.x
>         Attachments:
> Allow clients to stream data from a C* host, bypassing the coordination layer and eliminating
> the need to query individual pages one by one.

This message was sent by Atlassian JIRA