hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ashish Singhi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9393) Hbase does not closing a closed socket resulting in many CLOSE_WAIT
Date Thu, 14 Jan 2016 18:08:40 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098533#comment-15098533
] 

Ashish Singhi commented on HBASE-9393:
--------------------------------------

Analysis so far,

If the socket is idle for a configured amount of time then Datanode will close that socket
but the client still has it open, so it goes into a CLOSE_WAIT state and that socket is of
no use now.
Since the stream to the HFile is still open and when HBase do any operation on that HFile
then the HDFS client closes that socket and will open a new socket.
So the only advantage HBase gets from keeping this stream open is, it saves a Namenode open
call for that HFile but keeping so many CLOSE_WAIT connections also does not seems good.

> Hbase does not closing a closed socket resulting in many CLOSE_WAIT 
> --------------------------------------------------------------------
>
>                 Key: HBASE-9393
>                 URL: https://issues.apache.org/jira/browse/HBASE-9393
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.2, 0.98.0
>         Environment: Centos 6.4 - 7 regionservers/datanodes, 8 TB per node, 7279 regions
>            Reporter: Avi Zrachya
>
> HBase dose not close a dead connection with the datanode.
> This resulting in over 60K CLOSE_WAIT and at some point HBase can not connect to the
datanode because too many mapped sockets from one host to another on the same port.
> The example below is with low CLOSE_WAIT count because we had to restart hbase to solve
the porblem, later in time it will incease to 60-100K sockets on CLOSE_WAIT
> [root@hd2-region3 ~]# netstat -nap |grep CLOSE_WAIT |grep 21592 |wc -l
> 13156
> [root@hd2-region3 ~]# ps -ef |grep 21592
> root     17255 17219  0 12:26 pts/0    00:00:00 grep 21592
> hbase    21592     1 17 Aug29 ?        03:29:06 /usr/java/jdk1.6.0_26/bin/java -XX:OnOutOfMemoryError=kill
-9 %p -Xmx8000m -ea -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -Dhbase.log.dir=/var/log/hbase
-Dhbase.log.file=hbase-hbase-regionserver-hd2-region3.swnet.corp.log ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message