Date: Sat, 13 Feb 2016 01:03:18 +0000 (UTC)
From: "Ashish Singhi (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-9393) Hbase does not closing a closed socket resulting in many CLOSE_WAIT

     [ https://issues.apache.org/jira/browse/HBASE-9393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Singhi updated HBASE-9393:
---------------------------------
    Release Note:
To handle this issue, users need the Hadoop 2.6.4 or 2.7.0+ client, because the CanUnbuffer interface, which was added as part of HDFS-7694, is available only in those versions. On the HBase side, the issue is therefore fixed only in 2.0.0, which uses Hadoop 2.7.1 by default.
Users running Hadoop 2.6.4 or Hadoop 2.7.0+ can backport this fix to their own HBase version.

> Hbase does not closing a closed socket resulting in many CLOSE_WAIT
> --------------------------------------------------------------------
>
>                 Key: HBASE-9393
>                 URL: https://issues.apache.org/jira/browse/HBASE-9393
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.2, 0.98.0
>         Environment: Centos 6.4 - 7 regionservers/datanodes, 8 TB per node, 7279 regions
>            Reporter: Avi Zrachya
>            Assignee: Ashish Singhi
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: HBASE-9393.patch, HBASE-9393.v1.patch, HBASE-9393.v2.patch, HBASE-9393.v3.patch, HBASE-9393.v4.patch, HBASE-9393.v5.patch, HBASE-9393.v5.patch, HBASE-9393.v5.patch, HBASE-9393.v6.patch, HBASE-9393.v6.patch, HBASE-9393.v6.patch
>
>
> HBase does not close a dead connection with the datanode.
> This results in over 60K sockets in CLOSE_WAIT, and at some point HBase cannot connect to the datanode because too many sockets are mapped from one host to another on the same port.
> The example below shows a low CLOSE_WAIT count because we had to restart HBase to solve the problem; over time it increases to 60-100K sockets in CLOSE_WAIT.
> [root@hd2-region3 ~]# netstat -nap |grep CLOSE_WAIT |grep 21592 |wc -l
> 13156
> [root@hd2-region3 ~]# ps -ef |grep 21592
> root     17255 17219  0 12:26 pts/0    00:00:00 grep 21592
> hbase    21592     1 17 Aug29 ?        03:29:06 /usr/java/jdk1.6.0_26/bin/java -XX:OnOutOfMemoryError=kill -9 %p -Xmx8000m -ea -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -Dhbase.log.dir=/var/log/hbase -Dhbase.log.file=hbase-hbase-regionserver-hd2-region3.swnet.corp.log ...
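
For anyone who wants to track the CLOSE_WAIT count per process rather than grepping for a single PID, the netstat check in the report could be scripted roughly as below. This is only a sketch: the function name `count_close_wait` and the sample lines in `SAMPLE` are made up for illustration (the addresses are not from the reporter's cluster), and it assumes the usual Linux `netstat -nap` column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State, PID/Program name).

```python
from collections import Counter


def count_close_wait(netstat_output: str) -> Counter:
    """Count TCP sockets in CLOSE_WAIT per owning PID in `netstat -nap` text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: Proto Recv-Q Send-Q Local Foreign State PID/Program
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "CLOSE_WAIT":
            continue
        # "21592/java" -> "21592"; netstat prints "-" when the PID is unknown
        pid = fields[6].split("/", 1)[0]
        counts[pid] += 1
    return counts


# Illustrative input only; these lines are hypothetical, not real cluster output.
SAMPLE = """\
tcp        1      0 10.0.0.3:39522          10.0.0.3:50010          CLOSE_WAIT  21592/java
tcp        1      0 10.0.0.3:39523          10.0.0.3:50010          CLOSE_WAIT  21592/java
tcp        0      0 10.0.0.3:60020          10.0.0.9:41000          ESTABLISHED 21592/java
tcp        1      0 10.0.0.3:40000          10.0.0.4:50010          CLOSE_WAIT  1234/python
"""

if __name__ == "__main__":
    for pid, n in count_close_wait(SAMPLE).most_common():
        print(pid, n)
```

Feeding it real `netstat -nap` output captured at intervals would show whether the regionserver's count keeps growing, which is the symptom described above.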