hadoop-mapreduce-issues mailing list archives

From "Rajesh Balamohan (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (MAPREDUCE-6850) Shuffle Handler keep-alive connections are closed from the server side
Date Mon, 27 Feb 2017 02:16:45 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884985#comment-15884985 ]

Rajesh Balamohan edited comment on MAPREDUCE-6850 at 2/27/17 2:15 AM:
----------------------------------------------------------------------

I tested the patch on a small multi-node cluster and am attaching tcpdump screenshots for
reference. The patch works fine with keep-alive enabled: connections are reused, and
mapOutputs are retrieved over the same connection. The attachment "With_Patch.png" shows the
TCP stream, with multiple mapOutputs being fetched over the same connection.

One very minor comment on the patch: the {{timer}} variable in {{HttpPipelineFactory}} may not
be needed.

In MAPREDUCE-5787, the keep-alive parameter checks were present up to https://issues.apache.org/jira/secure/attachment/12634984/MAPREDUCE-5787-2.4.0-v3.patch
as follows:
{noformat}
if (!keepAlive && !keepAliveParam) {
  lastMap.addListener(ChannelFutureListener.CLOSE);
}
{noformat}
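
As an illustration only (this is not code from either patch), the decision that check encodes can be modelled in plain Java without Netty. The class and method names are hypothetical, and the reading of the two flags is an assumption: {{keepAlive}} is taken to be the server-side keep-alive setting and {{keepAliveParam}} the keep-alive flag sent by the client with the fetch request.

```java
// Hypothetical names; this models only the boolean decision, not the Netty channel handling.
final class KeepAliveDecision {
    private KeepAliveDecision() {}

    /**
     * @param keepAlive      assumed: keep-alive is enabled in the server configuration
     * @param keepAliveParam assumed: the client asked for keep-alive in its fetch request
     * @return true if the server should close the channel after the last map output is written
     */
    static boolean shouldCloseAfterLastMapOutput(boolean keepAlive, boolean keepAliveParam) {
        // Mirrors the quoted condition: close only when neither side wants keep-alive.
        return !keepAlive && !keepAliveParam;
    }

    public static void main(String[] args) {
        boolean[] values = {false, true};
        for (boolean keepAlive : values) {
            for (boolean keepAliveParam : values) {
                System.out.printf("keepAlive=%b keepAliveParam=%b -> close=%b%n",
                        keepAlive, keepAliveParam,
                        shouldCloseAfterLastMapOutput(keepAlive, keepAliveParam));
            }
        }
    }
}
```

Under this reading, the server closes the connection itself in only one of the four combinations; in the other three the connection stays open after the last map output.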

However, this check was dropped during refactoring in subsequent patches on the same JIRA,
which caused this problem. Even with the check in place, the server would have relied on the
client to close the connection, i.e. it was the responsibility of the client (the JDK's
internal HTTP client) to terminate the connection after the keep-alive timeout. The patch
proposed in this JIRA addresses that scenario as well: the server automatically closes the
connection once it has been idle for longer than the threshold configured on the server side.
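
The server-side timeout behaviour described above can be sketched roughly as follows. This is a minimal, hypothetical model in plain Java and not the patch's implementation; the patch works inside the Netty pipeline, whereas this sketch only shows the bookkeeping: remember when the connection was last used and report it as closable once the idle time exceeds a configured threshold. All names here are assumptions made for illustration.

```java
// Hypothetical sketch: tracks one connection's last activity and decides when it has idled too long.
final class IdleConnectionTracker {
    private final long timeoutMillis;   // assumed: server-side keep-alive timeout
    private long lastActivityMillis;    // time of the last request or response

    IdleConnectionTracker(long timeoutMillis, long nowMillis) {
        if (timeoutMillis <= 0) {
            throw new IllegalArgumentException("timeoutMillis must be positive");
        }
        this.timeoutMillis = timeoutMillis;
        this.lastActivityMillis = nowMillis;
    }

    /** Record activity on the connection, e.g. another mapOutput fetch. */
    void touch(long nowMillis) {
        lastActivityMillis = nowMillis;
    }

    /** True once the connection has been idle for longer than the threshold. */
    boolean shouldClose(long nowMillis) {
        return nowMillis - lastActivityMillis > timeoutMillis;
    }

    public static void main(String[] args) {
        IdleConnectionTracker tracker = new IdleConnectionTracker(5_000, 0);
        System.out.println(tracker.shouldClose(4_000)); // false: idle 4 s, threshold 5 s
        tracker.touch(4_000);                           // connection reused for another fetch
        System.out.println(tracker.shouldClose(8_000)); // false: idle 4 s since reuse
        System.out.println(tracker.shouldClose(9_001)); // true: idle just over 5 s
    }
}
```

Passing the clock value in explicitly keeps the sketch deterministic; a real handler would read the time from the event loop or a timer instead.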









> Shuffle Handler keep-alive connections are closed from the server side
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6850
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6850
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: MAPREDUCE-6850.1.patch, MAPREDUCE-6850.2.patch, MAPREDUCE-6850.3.patch,
With_Issue.png, With_Patch.png, With_Patch_withData.png
>
>
> When performance testing the Tez shuffle handler (TEZ-3334), it was noticed that keep-alive
connections are closed from the server side. The client silently recovers and logs the connection
as keep-alive, despite having re-established a connection. This JIRA aims to remove the close from
the server side, fixing the bug that prevents keep-alive connections.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-help@hadoop.apache.org

