Date: Mon, 27 Feb 2017 02:16:45 +0000 (UTC)
From: "Rajesh Balamohan (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] [Comment Edited] (MAPREDUCE-6850) Shuffle Handler keep-alive connections are closed from the server side

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884985#comment-15884985 ]

Rajesh Balamohan edited comment on MAPREDUCE-6850 at 2/27/17 2:15 AM:
----------------------------------------------------------------------

I tested the patch on a small multi-node cluster; tcpdump screenshots are attached for reference. The patch works fine with keep-alive enabled: connections are reused, and map outputs are retrieved over the same connection. The attachment "With_Patch.png" shows the TCP stream, with multiple map outputs fetched over the same connection.

One very minor comment on the patch: the {{timer}} variable in {{HttpPipelineFactory}} may not be needed.

In MAPREDUCE-5787, the keep-alive parameter checks were present up to https://issues.apache.org/jira/secure/attachment/12634984/MAPREDUCE-5787-2.4.0-v3.patch, as follows.
{noformat}
if (!keepAlive && !keepAliveParam) {
  lastMap.addListener(ChannelFutureListener.CLOSE);
}
{noformat}

However, this check was dropped during refactoring in subsequent patches in the same JIRA, and that caused this problem. Even with the check in place, the server would have relied on the client to close the connection; i.e., it was the responsibility of the client (the JDK's internal HTTP client) to terminate the connection after the keep-alive timeout. The patch proposed in this JIRA addresses that scenario as well: it automatically closes the connection once the keep-alive timeout configured on the server side is exceeded.
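
To make the two close decisions discussed above concrete, here is a minimal, dependency-free sketch. It is not the code from the patch: {{KeepAlivePolicy}} and its method names are hypothetical, and it assumes that {{keepAlive}} is the server-side setting, that {{keepAliveParam}} is the per-request flag, and that the timeout is tracked in milliseconds.

```java
/**
 * Illustrative only: KeepAlivePolicy and its members are hypothetical names,
 * not classes or methods from Hadoop's ShuffleHandler or from Netty.
 */
final class KeepAlivePolicy {
    // Assumption: server-side keep-alive setting (the `keepAlive` flag in the quoted check).
    private final boolean keepAlive;
    // Assumption: server-side idle threshold, expressed in milliseconds.
    private final long keepAliveTimeoutMillis;

    KeepAlivePolicy(boolean keepAlive, long keepAliveTimeoutMillis) {
        if (keepAliveTimeoutMillis < 0) {
            throw new IllegalArgumentException("timeout must be non-negative");
        }
        this.keepAlive = keepAlive;
        this.keepAliveTimeoutMillis = keepAliveTimeoutMillis;
    }

    /**
     * Mirrors the quoted condition: the server closes the connection after the
     * last map output only when neither the server setting nor the request
     * parameter asks for keep-alive.
     */
    boolean closeAfterLastMapOutput(boolean keepAliveParam) {
        return !keepAlive && !keepAliveParam;
    }

    /**
     * Server-side timeout: close a connection whose idle time has exceeded
     * the configured threshold, instead of waiting for the client to do so.
     */
    boolean closeIdleConnection(long lastActivityMillis, long nowMillis) {
        return nowMillis - lastActivityMillis > keepAliveTimeoutMillis;
    }
}
```

In a Netty-based handler, a {{true}} result from {{closeAfterLastMapOutput}} would correspond to attaching {{ChannelFutureListener.CLOSE}} to the final write, as in the quoted snippet. How the idle check is wired into the pipeline is left open here.
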
> Shuffle Handler keep-alive connections are closed from the server side
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6850
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6850
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: MAPREDUCE-6850.1.patch, MAPREDUCE-6850.2.patch, MAPREDUCE-6850.3.patch, With_Issue.png, With_Patch.png, With_Patch_withData.png
>
>
> When performance testing the Tez shuffle handler (TEZ-3334), it was noticed that keep-alive connections are closed from the server side. The client silently recovers and logs the connection as keep-alive, despite re-establishing a connection. This JIRA aims to remove the close from the server side, fixing the bug that prevents keep-alive connections.