incubator-s4-dev mailing list archives

From "kishore gopalakrishna (Commented) (JIRA)" <>
Subject [jira] [Commented] (S4-7) Netty to tolerate network glitches and connection loss
Date Wed, 07 Dec 2011 23:06:40 GMT


kishore gopalakrishna commented on S4-7:

Karthik, can you add a test case where you start both the emitter and the listener in the same process?

Start the listener.
Start the emitter.
Let the emitter send the numbers 1-100000.
The listener just receives each number and adds it to a running sum, or something similar.
In between, close the listener and restart it after a second (you can do this more than once
if needed).

In the end, the listener should have received all the numbers.

As Matthieu mentioned, not losing messages is not a requirement, just a nice-to-have.
It's up to you to decide.
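
The test described above could be sketched roughly as follows. This is only an in-process illustration: FakeListener, emit and run are hypothetical stand-ins, not the S4 NettyEmitter or listener classes, and the retry-until-accepted behaviour of the emitter is an assumption made so the sketch can run without a network.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-process stand-ins; none of these names come from S4.
class ListenerRestartSketch {

    // Stand-in listener: accepts numbers only while it is "up" and sums them.
    static class FakeListener {
        private final AtomicBoolean up = new AtomicBoolean(false);
        private final AtomicLong sum = new AtomicLong();

        void start() { up.set(true); }
        void stop() { up.set(false); }

        // Returns false while the listener is down, so the caller knows
        // the number was not delivered.
        boolean receive(int n) {
            if (!up.get()) {
                return false;
            }
            sum.addAndGet(n);
            return true;
        }

        long sum() { return sum.get(); }
    }

    // Stand-in emitter: retries each number until the listener accepts it.
    static void emit(FakeListener listener, int count) throws InterruptedException {
        for (int n = 1; n <= count; n++) {
            while (!listener.receive(n)) {
                Thread.sleep(1); // back off briefly while the listener is down
            }
        }
    }

    // Sends 1..count while the listener is stopped and restarted `restarts`
    // times, then returns the sum the listener accumulated.
    static long run(int count, int restarts, long downtimeMillis) {
        FakeListener listener = new FakeListener();
        listener.start();
        Thread emitter = new Thread(() -> {
            try {
                emit(listener, count);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        emitter.start();
        try {
            for (int i = 0; i < restarts; i++) {
                listener.stop();
                Thread.sleep(downtimeMillis);
                listener.start();
                Thread.sleep(1);
            }
            emitter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("test interrupted", e);
        }
        return listener.sum();
    }

    public static void main(String[] args) {
        int count = 100_000;
        long expected = (long) count * (count + 1) / 2;
        long actual = run(count, 2, 1000);
        System.out.println(actual == expected ? "all numbers received" : "messages lost");
    }
}
```

A real test would replace FakeListener and emit with the actual emitter and listener and keep only the final sum check.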

Regarding code,
Remove commented code if not needed. Rename TestHandler.

if (!channel.isWritable()) {
    synchronized (sendLock) {
        // check again now that we have the lock
        while (!channel.isWritable()) {
            try {
                sendLock.wait();
            } catch (InterruptedException ie) {
                return false;
            }
        }
    }
}

This might result in a deadlock if the receiver node is permanently down. Take the case where
you got the channel but it was not writable, it never becomes writable again, and another node
is now hosting this partition. The loop keeps checking the stale channel and never exits. You
probably need to get the channel from the channelMap inside the while loop after waking up from wait.
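
One way to do that is sketched below. It is an illustration under assumptions, not the patch itself: WritableChannel is a hypothetical one-method stand-in for the Netty channel, updateChannel is an assumed hook for topology changes, and the timeout is an addition of this sketch so that a partition with no live node cannot block the sender forever.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; only channelMap and sendLock are names from the comment above.
class ChannelRefetchSketch {

    // Minimal stand-in for the Netty channel (assumption of this sketch).
    interface WritableChannel {
        boolean isWritable();
    }

    private final Map<Integer, WritableChannel> channelMap = new ConcurrentHashMap<>();
    private final Object sendLock = new Object();

    // Assumed hook: called when the topology changes or a reconnect succeeds.
    void updateChannel(int partition, WritableChannel channel) {
        channelMap.put(partition, channel);
        synchronized (sendLock) {
            sendLock.notifyAll(); // wake senders so they re-read the map
        }
    }

    // Returns a writable channel for the partition, or null if none became
    // available before the timeout or the thread was interrupted.
    WritableChannel awaitWritable(int partition, long timeoutMillis) {
        long deadlineNanos = System.nanoTime() + timeoutMillis * 1_000_000L;
        synchronized (sendLock) {
            while (true) {
                // Re-read the map on every iteration instead of holding on
                // to a channel that may belong to a dead node.
                WritableChannel channel = channelMap.get(partition);
                if (channel != null && channel.isWritable()) {
                    return channel;
                }
                long remainingMillis = (deadlineNanos - System.nanoTime()) / 1_000_000L;
                if (remainingMillis <= 0) {
                    return null; // give up rather than wait forever
                }
                try {
                    sendLock.wait(remainingMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
        }
    }
}
```

With this shape, a sender that was waiting on a dead node's channel picks up the replacement channel as soon as updateChannel is called for that partition.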

> Netty to tolerate network glitches and connection loss
> ------------------------------------------------------
>                 Key: S4-7
>                 URL:
>             Project: Apache S4
>          Issue Type: Bug
>            Reporter: Leo Neumeyer
>            Assignee: Karthik Kambatla
>             Fix For: 0.5
> NettyEmitter connects to different partitions and creates channels over which it communicates to other listeners.
> It suffers from the following issues:
> 1. If the underlying topology changes, the channels and the associated connections are not updated.
> 2. If a connection gets disconnected, it stays disconnected.
> 3. If for any reason, a connection can't be made, send() drops the message to be sent.
> The solution is to:
> 1. Maintain a bounded messageQueue for each destination partition - if a connection does not exist, the message should be queued.
> 2. Maintain a map of the channel used for each destination partition - update this map on changes to topology, or on send() in case of disconnections.
> 3. Every time a (re-)connection is made, send the queued messages first.
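
The three-part solution in the issue description could look roughly like the sketch below. All names here (QueueingEmitterSketch, Connection, onConnected, onDisconnected) are hypothetical and messages are plain strings for brevity; rejecting a message when the bounded queue is full is an assumption, since the issue does not say what should happen in that case.

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the proposed design; not the S4 implementation.
class QueueingEmitterSketch {

    // Minimal stand-in for a connected channel (assumption of this sketch).
    interface Connection {
        void write(String message);
    }

    private final int capacity;
    // One bounded queue per destination partition.
    private final Map<Integer, BlockingQueue<String>> queues = new ConcurrentHashMap<>();
    // The channel currently used for each destination partition.
    private final Map<Integer, Connection> channelMap = new ConcurrentHashMap<>();

    QueueingEmitterSketch(int capacity) {
        this.capacity = capacity;
    }

    private BlockingQueue<String> queueFor(int partition) {
        return queues.computeIfAbsent(partition, p -> new ArrayBlockingQueue<>(capacity));
    }

    // Writes directly when connected; otherwise queues the message.
    // Returns false only when there is no connection and the queue is full.
    synchronized boolean send(int partition, String message) {
        Connection connection = channelMap.get(partition);
        if (connection == null) {
            return queueFor(partition).offer(message);
        }
        connection.write(message);
        return true;
    }

    // On every (re-)connection, flush the queued messages before anything new.
    synchronized void onConnected(int partition, Connection connection) {
        channelMap.put(partition, connection);
        BlockingQueue<String> queue = queueFor(partition);
        String queued;
        while ((queued = queue.poll()) != null) {
            connection.write(queued);
        }
    }

    // On disconnection or topology change, forget the stale channel.
    synchronized void onDisconnected(int partition) {
        channelMap.remove(partition);
    }
}
```

The methods are synchronized so that a flush in onConnected cannot interleave with a concurrent send, which keeps queued messages ahead of newer ones.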
