hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9393) Hbase does not closing a closed socket resulting in many CLOSE_WAIT
Date Wed, 15 Jun 2016 06:03:09 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331202#comment-15331202
] 

Chris Nauroth commented on HBASE-9393:
--------------------------------------

With short-circuit read, there is still a brief interaction between the HDFS client and the
DataNode's data transfer port to request sharing file descriptors to perform the direct read.
 It's much less data compared to an actual block transfer, but the TCP socket is still there.

> Hbase does not closing a closed socket resulting in many CLOSE_WAIT 
> --------------------------------------------------------------------
>
>                 Key: HBASE-9393
>                 URL: https://issues.apache.org/jira/browse/HBASE-9393
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.2, 0.98.0, 1.0.1.1, 1.1.2
>         Environment: Centos 6.4 - 7 regionservers/datanodes, 8 TB per node, 7279 regions
>            Reporter: Avi Zrachya
>            Assignee: Ashish Singhi
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: HBASE-9393.patch, HBASE-9393.v1.patch, HBASE-9393.v10.patch, HBASE-9393.v11.patch,
HBASE-9393.v12.patch, HBASE-9393.v13.patch, HBASE-9393.v14.patch, HBASE-9393.v15.patch, HBASE-9393.v15.patch,
HBASE-9393.v2.patch, HBASE-9393.v3.patch, HBASE-9393.v4.patch, HBASE-9393.v5.patch, HBASE-9393.v5.patch,
HBASE-9393.v5.patch, HBASE-9393.v6.patch, HBASE-9393.v6.patch, HBASE-9393.v6.patch, HBASE-9393.v7.patch,
HBASE-9393.v8.patch, HBASE-9393.v9.patch
>
>
> HBase dose not close a dead connection with the datanode.
> This resulting in over 60K CLOSE_WAIT and at some point HBase can not connect to the
datanode because too many mapped sockets from one host to another on the same port.
> The example below is with low CLOSE_WAIT count because we had to restart hbase to solve
the porblem, later in time it will incease to 60-100K sockets on CLOSE_WAIT
> [root@hd2-region3 ~]# netstat -nap |grep CLOSE_WAIT |grep 21592 |wc -l
> 13156
> [root@hd2-region3 ~]# ps -ef |grep 21592
> root     17255 17219  0 12:26 pts/0    00:00:00 grep 21592
> hbase    21592     1 17 Aug29 ?        03:29:06 /usr/java/jdk1.6.0_26/bin/java -XX:OnOutOfMemoryError=kill
-9 %p -Xmx8000m -ea -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -Dhbase.log.dir=/var/log/hbase
-Dhbase.log.file=hbase-hbase-regionserver-hd2-region3.swnet.corp.log ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message