Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Mon, 5 Jun 2017 15:36:05 +0000 (UTC)
From: "Sean Busbey (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12666348.1377880286000.365839.1496676965225@Atlassian.JIRA>
In-Reply-To: <JIRA.12666348.1377880286000@Atlassian.JIRA>
References: <JIRA.12666348.1377880286000@Atlassian.JIRA> <JIRA.12666348.1377880286333@jira-lw-us.apache.org>
Subject: [jira] [Commented] (HBASE-9393) Hbase does not closing a closed
 socket resulting in many CLOSE_WAIT
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Mon, 05 Jun 2017 15:36:13 -0000


    [ https://issues.apache.org/jira/browse/HBASE-9393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037096#comment-16037096 ] 

Sean Busbey commented on HBASE-9393:
------------------------------------

I think I still disagree here:

{quote}
bq. The addition of the unbuffer call here means that we need to update the javadocs for HFile.createReader(FileSystem, Path, FSDataInputStreamWrapper, long, CacheConfig, Configuration) and HFile.createReaderFromStream(Path, FSDataInputStream, long, CacheConfig, Configuration) to note that callers need to ensure no other threads have access to the passed FSDISW instance.
bq. We should also ensure that existing calls to those methods are safely passing the FSDISW instance.

No need, the new reference of FSDISW is just created and passed from these methods.
{quote}

This response sounds like the second half of my concern is addressed; we currently safely pass FSDISW instances. But without the change to the javadoc there's effectively no warning for those who might reuse those methods in the future.
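
As a rough illustration of the kind of javadoc note meant here, the sketch below documents the exclusive-access requirement on a method that unbuffers its argument. Everything in it is hypothetical: UnbufferJavadocSketch, StreamWrapperSketch and createReaderSketch are in-memory stand-ins invented for this example, not the HBase HFile or FSDISW classes, and the wording of the note is only a suggestion.

```java
/** Hypothetical sketch; none of these types are HBase classes. */
final class UnbufferJavadocSketch {

  /** In-memory stand-in for a stream wrapper (hypothetical, not FSDISW itself). */
  static final class StreamWrapperSketch {
    private final byte[] data;
    private int pos = 0;
    private boolean buffersHeld = false;

    StreamWrapperSketch(byte[] data) {
      this.data = data;
    }

    /** Returns the next byte as an unsigned value, or -1 at end of data. */
    int read() {
      buffersHeld = true; // reading (re)acquires the simulated buffers
      return pos < data.length ? (data[pos++] & 0xff) : -1;
    }

    /** Releases the simulated buffers; not synchronized, so concurrent use is unsafe. */
    void unbuffer() {
      buffersHeld = false;
    }

    boolean holdsBuffers() {
      return buffersHeld;
    }
  }

  /**
   * Reads the first byte through the wrapper and then calls unbuffer() on it.
   *
   * <p><b>Thread safety:</b> because this method unbuffers the passed wrapper,
   * callers must ensure that no other thread has access to that wrapper
   * instance while this method runs.
   *
   * @param wrapper stream wrapper owned exclusively by the calling thread
   * @return the first byte read, or -1 if the wrapper has no data
   */
  static int createReaderSketch(StreamWrapperSketch wrapper) {
    int first = wrapper.read();
    wrapper.unbuffer();
    return first;
  }

  public static void main(String[] args) {
    StreamWrapperSketch wrapper = new StreamWrapperSketch(new byte[] {42});
    System.out.println(createReaderSketch(wrapper));
    System.out.println(wrapper.holdsBuffers());
  }
}
```

The point of the sketch is only that the requirement lives in the method's javadoc, where a future caller would see it, rather than relying on every existing call site happening to pass a freshly created wrapper.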

I think the need to fix this issue outweighs the risk of future incorrect use due to missing the javadoc, so I'm -0.

> Hbase does not closing a closed socket resulting in many CLOSE_WAIT 
> --------------------------------------------------------------------
>
>                 Key: HBASE-9393
>                 URL: https://issues.apache.org/jira/browse/HBASE-9393
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.2, 0.98.0, 1.0.1.1, 1.1.2
>         Environment: Centos 6.4 - 7 regionservers/datanodes, 8 TB per node, 7279 regions
>            Reporter: Avi Zrachya
>            Assignee: Ashish Singhi
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: HBASE-9393.patch, HBASE-9393.v10.patch, HBASE-9393.v11.patch, HBASE-9393.v12.patch, HBASE-9393.v13.patch, HBASE-9393.v14.patch, HBASE-9393.v15.patch, HBASE-9393.v15.patch, HBASE-9393.v16.patch, HBASE-9393.v16.patch, HBASE-9393.v1.patch, HBASE-9393.v2.patch, HBASE-9393.v3.patch, HBASE-9393.v4.patch, HBASE-9393.v5.patch, HBASE-9393.v5.patch, HBASE-9393.v5.patch, HBASE-9393.v6.patch, HBASE-9393.v6.patch, HBASE-9393.v6.patch, HBASE-9393.v7.patch, HBASE-9393.v8.patch, HBASE-9393.v9.patch
>
>
> HBase does not close a dead connection with the datanode.
> This results in over 60K sockets in CLOSE_WAIT, and at some point HBase cannot connect to the datanode because too many sockets are mapped from one host to another on the same port.
> The example below shows a low CLOSE_WAIT count because we had to restart HBase to solve the problem; over time it will increase to 60-100K sockets in CLOSE_WAIT.
> [root@hd2-region3 ~]# netstat -nap |grep CLOSE_WAIT |grep 21592 |wc -l
> 13156
> [root@hd2-region3 ~]# ps -ef |grep 21592
> root     17255 17219  0 12:26 pts/0    00:00:00 grep 21592
> hbase    21592     1 17 Aug29 ?        03:29:06 /usr/java/jdk1.6.0_26/bin/java -XX:OnOutOfMemoryError=kill -9 %p -Xmx8000m -ea -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -Dhbase.log.dir=/var/log/hbase -Dhbase.log.file=hbase-hbase-regionserver-hd2-region3.swnet.corp.log ...


--
This message was sent by Atlassian JIRA
(v6.3.15#6346)