cassandra-commits mailing list archives

From "Joshua McKenzie (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-3668) Parallel streaming for sstableloader
Date Fri, 04 Apr 2014 21:21:16 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960435#comment-13960435 ]

Joshua McKenzie edited comment on CASSANDRA-3668 at 4/4/14 9:19 PM:
--------------------------------------------------------------------

A quick update on this - going the route of multiple StreamSessions per StreamPlan is going
to require some restructuring.  The current design assumes a single socket for streaming and
changing to multiple StreamSessions means multiple ConnectionHandlers, all of which assume
ownership of polling the readChannel on a socket.

To respect the single-socket-for-streaming paradigm we currently have, I'm working on promoting
IncomingMessageHandler and OutgoingMessageHandler into higher-level abstractions that are
responsible for polling the socket, rather than being owned by a StreamSession as they are today.
On the inbound side they would dispatch to the various StreamSessions based on deserialized
session indices; on the outbound side they would follow the current PriorityQueue polling mechanism.
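
As a rough illustration of that idea (not the actual patch), here is a minimal Java sketch
of a single connection-level dispatcher. Every type in it (Message, Session, Dispatcher) is
a hypothetical stand-in invented for this example, not a Cassandra class, and the message
is assumed to carry a session index and a priority. Inbound messages are routed by session
index; outbound messages from all sessions share one priority queue.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical sketch: none of these types are Cassandra's real streaming classes.
class StreamDispatchSketch {

    /** A message tagged with the index of the session it belongs to (assumed format). */
    static final class Message implements Comparable<Message> {
        final int sessionIndex;
        final int priority;   // lower value is polled first
        final String payload;

        Message(int sessionIndex, int priority, String payload) {
            this.sessionIndex = sessionIndex;
            this.priority = priority;
            this.payload = payload;
        }

        @Override
        public int compareTo(Message other) {
            return Integer.compare(priority, other.priority);
        }
    }

    /** Stand-in for a StreamSession: it only records the payloads it receives. */
    static final class Session {
        final List<String> received = new ArrayList<>();

        void receive(Message m) {
            received.add(m.payload);
        }
    }

    /** Owns the single connection; sessions no longer poll the socket themselves. */
    static final class Dispatcher {
        private final Map<Integer, Session> sessions = new HashMap<>();
        private final PriorityBlockingQueue<Message> outbound = new PriorityBlockingQueue<>();

        void register(int index, Session session) {
            sessions.put(index, session);
        }

        /** Inbound: route one already-deserialized message by its session index. */
        void dispatchInbound(Message m) {
            Session s = sessions.get(m.sessionIndex);
            if (s == null) {
                throw new IllegalStateException("unknown session index " + m.sessionIndex);
            }
            s.receive(m);
        }

        /** Outbound: every session enqueues into the same shared queue. */
        void enqueueOutbound(Message m) {
            outbound.add(m);
        }

        /** Returns the highest-priority pending message, or null if the queue is empty. */
        Message pollOutbound() {
            return outbound.poll();
        }
    }

    public static void main(String[] args) {
        Dispatcher dispatcher = new Dispatcher();
        Session first = new Session();
        Session second = new Session();
        dispatcher.register(0, first);
        dispatcher.register(1, second);

        // An inbound message for session 1 reaches only that session.
        dispatcher.dispatchInbound(new Message(1, 0, "file-chunk"));
        System.out.println(second.received);

        // Outbound messages from different sessions drain in priority order.
        dispatcher.enqueueOutbound(new Message(0, 5, "bulk"));
        dispatcher.enqueueOutbound(new Message(1, 1, "control"));
        System.out.println(dispatcher.pollOutbound().payload);
    }
}
```

A real handler would read from and write to the socket channel in dedicated threads; the
sketch leaves I/O out so that only the routing decision is visible.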

It doesn't look like we're at risk of a bottleneck on network resources even over a single
socket as my prelim parallelized stream testing is peaking at ~ 55MB/s on 5 connections-per-host
vs. 49MB/s on 4 connections - diminishing returns as we get higher.  Compared to the 24MB/s
I'm benchmarking on a single connection it's still a respectable increase.
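
To put those figures side by side: 55 MB/s is roughly 2.3x the 24 MB/s single-connection
baseline and 49 MB/s is roughly 2.0x, while throughput per connection falls from 12.25 MB/s
at 4 connections to 11 MB/s at 5. The fifth connection adds only about 6 MB/s. The short
Java sketch below just reproduces that arithmetic from the three numbers quoted above; the
class and method names are made up for the example.

```java
// Hypothetical helper that reworks the throughput figures quoted in the comment.
class ThroughputSketch {

    /** Ratio of parallel throughput to the single-connection baseline. */
    static double speedup(double parallelMBps, double baselineMBps) {
        return parallelMBps / baselineMBps;
    }

    /** Average throughput contributed by each connection. */
    static double perConnection(double totalMBps, int connections) {
        return totalMBps / connections;
    }

    public static void main(String[] args) {
        double baseline = 24.0;  // MB/s on a single connection
        double four = 49.0;      // MB/s on 4 connections-per-host
        double five = 55.0;      // MB/s on 5 connections-per-host

        System.out.printf("4 connections: %.2fx, %.2f MB/s each%n",
                speedup(four, baseline), perConnection(four, 4));
        System.out.printf("5 connections: %.2fx, %.2f MB/s each%n",
                speedup(five, baseline), perConnection(five, 5));
        System.out.printf("gain from the 5th connection: %.1f MB/s%n", five - four);
    }
}
```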


was (Author: joshuamckenzie):
A quick update on this - going the route of multiple StreamSessions per StreamPlan with the
current architecture is going to require some restructuring.  The current design assumes a
single socket for streaming and multiple StreamSessions means multiple ConnectionHandlers,
all of which assume ownership of polling the readChannel on a socket.

To respect the single-socket-for-streaming paradigm we currently have, I'm working on promoting
IncomingMessageHandler and OutgoingMessageHandler into higher-level abstractions that are
responsible for polling the socket, rather than being owned by a StreamSession as they are today.
On the inbound side they would dispatch to the various StreamSessions based on deserialized
session indices; on the outbound side they would follow the current PriorityQueue polling mechanism.

It doesn't look like we're at risk of a bottleneck on network resources even over a single
socket as my prelim parallelized stream testing is peaking at ~ 55MB/s on 5 connections-per-host
vs. 49MB/s on 4 connections - diminishing returns as we get higher.  Compared to the 24MB/s
I'm benchmarking on a single connection it's still a respectable increase.

> Parallel streaming for sstableloader
> ------------------------------------
>
>                 Key: CASSANDRA-3668
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API
>            Reporter: Manish Zope
>            Assignee: Joshua McKenzie
>            Priority: Minor
>              Labels: streaming
>             Fix For: 2.1 beta2
>
>         Attachments: 3668-1.1-v2.txt, 3668-1.1.txt, 3688-reply_before_closing_writer.txt, sstable-loader performance.txt
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> One of my colleagues reported a bug regarding the degraded performance of the sstable
> generator and sstableloader.
> ISSUE: https://issues.apache.org/jira/browse/CASSANDRA-3589
> As stated in that issue, the generator's performance has been fixed, but the performance
> of sstableloader is still a problem.
> 3589 is marked as a duplicate of 3618. Both issues show a resolved status, but the problem
> with sstableloader still exists.
> I am opening this separate issue so that the sstableloader problem does not go unnoticed.
> FYI: We have tested the generator with the patch given in 3589. It is working fine.
> Please let us know if you require further input from our side.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
