Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 7C822200BFF for ; Tue, 3 Jan 2017 05:37:00 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 7B089160B30; Tue, 3 Jan 2017 04:37:00 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id C53E5160B22 for ; Tue, 3 Jan 2017 05:36:59 +0100 (CET) Received: (qmail 37508 invoked by uid 500); 3 Jan 2017 04:36:58 -0000 Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list hdfs-issues@hadoop.apache.org Received: (qmail 37482 invoked by uid 99); 3 Jan 2017 04:36:58 -0000 Received: from arcas.apache.org (HELO arcas) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 03 Jan 2017 04:36:58 +0000 Received: from arcas.apache.org (localhost [127.0.0.1]) by arcas (Postfix) with ESMTP id 9048A2C2A6B for ; Tue, 3 Jan 2017 04:36:58 +0000 (UTC) Date: Tue, 3 Jan 2017 04:36:58 +0000 (UTC) From: "Weiwei Yang (JIRA)" To: hdfs-issues@hadoop.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Tue, 03 Jan 2017 04:37:00 -0000 [ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15794115#comment-15794115 ] Weiwei Yang commented on HDFS-11280: ------------------------------------ Hi [~jzhuge] I did not try all failed tests, but at least this patch is breaking {{TestWebHDFS#testCreateWithNoDN}}, I could get this test pass if I revert the changes this patch has made. Please check. > Allow WebHDFS to reuse HTTP connections to NN > --------------------------------------------- > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs > Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 > Reporter: Zheng Shao > Assignee: Zheng Shao > Fix For: 2.8.0, 2.9.0, 2.7.4, 3.0.0-alpha2 > > Attachments: HDFS-11280.for.2.7.and.below.patch, HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. When we use webhdfs as the source in distcp, this used up all ephemeral ports on the client side since all closed connections continue to occupy the port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call conn.getInputStream().close() instead to make sure the connection is kept alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org