cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Brown (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-12229) Move streaming to non-blocking IO and netty (streaming 2.1)
Date Wed, 31 May 2017 23:26:04 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16032195#comment-16032195
] 

Jason Brown commented on CASSANDRA-12229:
-----------------------------------------

Pushed more changes onto the shame branch (sha {{e3dd956fadb8084babdb2d96e222a3b2949589bd}})

- removed waitUntilWritable() from {{ByteBufDataOutputStreamPlus}} and replaced with a simple
semaphore.
- cleaned up the close/cancel notification in {{StreamingInboundHander}} from the {{thread#interrupt()}}
to using an atomic variable (polling it from the background thread).
- corrected and optimized the way ByteBufs and ByteBuffers are used in the stream classes
- restore CASSANDRA-7585 and bring back StreamingConnectionFactory, with modified versions
of it's implementations and then inject the factory where we used to do  it.
- move stream versioning back into StreamMessage, instead of StreamSession (where I had moved
it)
- minor clean ups and doc additions
- addressed TODO's and error handling

Here are a few responses to some of the concerns raised by [~aweisberg] from the current PR.

bq. This is really a generic inbound message handler right? Are there potentially races here
with multiple threads interacting with StreamResultFuture and StreamManager and the global
state? Can they register different state and then end up with a different StreamResultFuture
or session? 

I don't believe this to be the case as the intiator sends out the {{StreamInitMessage}} and
subsequent control messages on the same (control) channel, so you have TCP ordering guarantees
there. Further, most of the important, state-changing functions related to {{StreamSession}}
are already {{synchronized}}.

bq. Ideally I think each stream session would get a dedicated control channel and a dedicated
actor thread to orchestrate control messages. Completely decouple the whole thing and turn
into a queue of events being processed serially. The only concurrency and parallelism is then
files being transferred in parallel which is pretty clearly shared nothing other than resource
limits which are generally pretty easy to reason about since there is no risk of deadlock.

The trick with that is we would need to send session identifier info in every message so that
the correct {{StreamSession}} can be identified. Otherwise, you have to stick a reference
to the {{StreamSession}} somewhere: in the channel attributes, in the {{StreamInboundHandler}},
or in the deserialization task (like what I currently have). Further, I'm now trying to avoid
altering the stream message format/protocol, so long as correctness is maintained, because
there's a boat load of other changes happening here.

bq. ... checksums ...
I did not change the checksum algorithm for the LZ4 compression on internode messaging as
that would be a change to the (undocumented) internode messaging protocol, and would break
upgrades. Further, Sylvain didn't want to change the internode messaging protocol for this
release, so I think we're stuck with the 32-bit checksumming as that's what we have in the
existing [{{OutboundTcpConnection}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/OutboundTcpConnection.java#L530].


> Move streaming to non-blocking IO and netty (streaming 2.1)
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-12229
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12229
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Streaming and Messaging
>            Reporter: Jason Brown
>            Assignee: Jason Brown
>             Fix For: 4.0
>
>
> As followup work to CASSANDRA-8457, we need to move streaming to use netty.
> Streaming 2.0 (CASSANDRA-5286) brought many good improvements to how files are transferred
between nodes in a cluster. However, the low-level details of the current streaming implementation
does not line up nicely with a non-blocking model, so I think this is a good time to review
some of those details and add in additional goodness. The current implementation assumes a
sequential or "single threaded" approach to the sending of stream messages as well as the
transfer of files. In short, after several iterative prototypes, I propose the following:
> 1) use a single bi-diredtional connection (instead of requiring to two sockets &
two threads)
> 2) send the "non-file" {{StreamMessage}} s (basically anything not {{OutboundFileMessage}})
via the normal internode messaging. This will require a slight bit more management of the
session (the ability to look up a {{StreamSession}} from a static function on {{StreamManager}},
but we have have most of the pieces we need for this already.
> 3) switch to a non-blocking IO model (facilitated via netty)
> 4) Allow files to be streamed in parallel (CASSANDRA-4663) - this should just be a thing
already
> 5) If the entire sstable is to streamed, in addition to the DATA component, transfer
all the components of the sstable (primary index, bloom filter, stats, and so on). This way
we can avoid the CPU and GC pressure from deserializing the stream into objects. File streaming
then amounts to a block-level transfer.
> Note: The progress/results of CASSANDRA-11303 will need to be reflected here, as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org


Mime
View raw message