hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9393) Hbase does not closing a closed socket resulting in many CLOSE_WAIT
Date Mon, 29 Feb 2016 10:11:18 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171696#comment-15171696
] 

Anoop Sam John commented on HBASE-9393:
---------------------------------------

bq.can useHBaseChecksum vary amongst instances in the same JVM? if it can, then we shouldn't
be sharing cached information about the unbuffer call at all (or we need to have one of them
per streamClass). That would solve the "assign to static" business by moving to per-instance
caches of the reflection information.
The static boolean records whether the underlying FS supports the unbuffer call. That does
not change with useHBaseChecksum or with the stream, so there is no confusion about this.
Regarding the init of the static boolean, IMHO there is no need to worry about multithreading.
We don't want every op to list the interfaces and do the check (a String op). Even if two
threads do it in parallel at the beginning, that is OK; both will get the same result. Please
add proper comments here. There is also no need to add extra things just to get rid of the
findbugs warning. Explain clearly in a comment why the findbugs warning is not a concern here,
and we can add an ignore annotation so that findbugs won't report it.
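
To illustrate the pattern being discussed (this is not the code from the attached patches), here is a
minimal, self-contained sketch. The names UnbufferSupport, CanUnbuffer, UnbufferableStream,
implementsUnbuffer and supportsUnbuffer are all hypothetical stand-ins; the nested CanUnbuffer
interface only imitates a filesystem capability interface so the example runs without Hadoop on
the classpath. A nullable Boolean is used instead of a plain boolean so that "not yet checked" can
be told apart from "checked, unsupported"; that is a choice made for the sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

final class UnbufferSupport {
  /** Hypothetical stand-in for a capability interface exposed by the FS stream. */
  interface CanUnbuffer {
    void unbuffer();
  }

  /** Hypothetical stream type that advertises the capability. */
  static class UnbufferableStream extends ByteArrayInputStream implements CanUnbuffer {
    UnbufferableStream() {
      super(new byte[0]);
    }

    @Override
    public void unbuffer() {
      // Nothing to release in this sketch.
    }
  }

  // Cached answer for the whole JVM; null means "not checked yet".
  // Benign race: two threads may both see null and both run the check,
  // but they compute the same value, so no lock is needed. A static
  // analysis warning about the unsynchronized write can be suppressed
  // with a comment explaining this.
  private static volatile Boolean unbufferSupported;

  /** Uncached check: compares the names of interfaces declared on the class hierarchy. */
  static boolean implementsUnbuffer(Class<?> streamClass) {
    for (Class<?> c = streamClass; c != null; c = c.getSuperclass()) {
      for (Class<?> iface : c.getInterfaces()) {
        if (iface.getSimpleName().equals("CanUnbuffer")) {
          return true;
        }
      }
    }
    return false;
  }

  /** Cached check: the interface listing runs only until the first result is stored. */
  static boolean supportsUnbuffer(InputStream stream) {
    Boolean cached = unbufferSupported;
    if (cached == null) {
      cached = implementsUnbuffer(stream.getClass());
      unbufferSupported = cached;
    }
    return cached;
  }
}
```

Note that the cached answer is shared by every caller in the JVM, so the sketch is only valid under
the assumption stated in the comment above: the result does not vary with the stream or with
useHBaseChecksum. If it could vary, a per-stream-class cache would be needed instead.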

> Hbase does not closing a closed socket resulting in many CLOSE_WAIT 
> --------------------------------------------------------------------
>
>                 Key: HBASE-9393
>                 URL: https://issues.apache.org/jira/browse/HBASE-9393
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.2, 0.98.0
>         Environment: Centos 6.4 - 7 regionservers/datanodes, 8 TB per node, 7279 regions
>            Reporter: Avi Zrachya
>            Assignee: Ashish Singhi
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: HBASE-9393.patch, HBASE-9393.v1.patch, HBASE-9393.v10.patch, HBASE-9393.v11.patch,
HBASE-9393.v12.patch, HBASE-9393.v13.patch, HBASE-9393.v2.patch, HBASE-9393.v3.patch, HBASE-9393.v4.patch,
HBASE-9393.v5.patch, HBASE-9393.v5.patch, HBASE-9393.v5.patch, HBASE-9393.v6.patch, HBASE-9393.v6.patch,
HBASE-9393.v6.patch, HBASE-9393.v7.patch, HBASE-9393.v8.patch, HBASE-9393.v9.patch
>
>
> HBase does not close a dead connection with the datanode.
> This results in over 60K CLOSE_WAIT sockets, and at some point HBase cannot connect to the
datanode because too many sockets are mapped from one host to another on the same port.
> The example below shows a low CLOSE_WAIT count because we had to restart HBase to solve
the problem; over time it will increase to 60-100K sockets in CLOSE_WAIT.
> [root@hd2-region3 ~]# netstat -nap |grep CLOSE_WAIT |grep 21592 |wc -l
> 13156
> [root@hd2-region3 ~]# ps -ef |grep 21592
> root     17255 17219  0 12:26 pts/0    00:00:00 grep 21592
> hbase    21592     1 17 Aug29 ?        03:29:06 /usr/java/jdk1.6.0_26/bin/java -XX:OnOutOfMemoryError=kill
-9 %p -Xmx8000m -ea -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -Dhbase.log.dir=/var/log/hbase
-Dhbase.log.file=hbase-hbase-regionserver-hd2-region3.swnet.corp.log ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
