hbase-issues mailing list archives

From "Ashish Singhi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9393) Hbase does not closing a closed socket resulting in many CLOSE_WAIT
Date Tue, 19 Jan 2016 16:43:39 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106977#comment-15106977 ]

Ashish Singhi commented on HBASE-9393:

Thanks [~cmccabe] for the detailed comment. It was helpful.

I have modified the code according to approach #2, and DFSInputStream#unbuffer is closing the
socket. This is what we were initially looking for when deciding how to solve this issue:
we wanted control over the socket so that we could close it, rather than closing the whole
stream. Because we did not know DFSInputStream well, we missed this API. I am testing this
with the PE tool for random reads to see the impact.
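
For readers following the thread, the difference between releasing the socket and closing the whole stream can be sketched with a toy model. This is not Hadoop's DFSInputStream: {{FakeConnection}}, {{LazyStream}} and all method names other than {{unbuffer}} are hypothetical and exist only for illustration, and the "connection" is a plain object rather than a real socket.

```java
/** Toy model only; this is NOT Hadoop's DFSInputStream. All names are hypothetical. */
public class UnbufferSketch {

    /** Stands in for the TCP connection to a datanode. */
    static final class FakeConnection {
        private boolean open = true;
        boolean isOpen() { return open; }
        void close() { open = false; }
    }

    /** A stream that connects lazily and can release its connection without being closed. */
    static final class LazyStream {
        private FakeConnection conn;   // null while no connection is held
        private boolean closed;
        private int connectionsOpened;

        /** Reads one dummy byte, reconnecting first if the connection was released. */
        int read() {
            if (closed) {
                throw new IllegalStateException("stream is closed");
            }
            if (conn == null) {
                conn = new FakeConnection();
                connectionsOpened++;
            }
            return 0;
        }

        /** Releases the connection but leaves the stream usable. */
        void unbuffer() {
            if (conn != null) {
                conn.close();
                conn = null;
            }
        }

        /** Releases the connection and makes the stream permanently unusable. */
        void close() {
            unbuffer();
            closed = true;
        }

        boolean hasConnection() { return conn != null && conn.isOpen(); }
        int connectionsOpened() { return connectionsOpened; }
    }

    public static void main(String[] args) {
        LazyStream in = new LazyStream();
        in.read();
        in.unbuffer();                              // connection released, stream still open
        System.out.println(in.hasConnection());     // false
        in.read();                                  // reconnects on demand
        System.out.println(in.connectionsOpened()); // 2
    }
}
```

The point of the sketch is only that an idle reader holds no connection after {{unbuffer}}, so nothing is left to linger in CLOSE_WAIT, while the next read still works. How the real client reconnects is not shown here.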

bq. Configuring HBase to periodically close open streams is not necessary; it's strictly worse
than option #2.
Agreed. As mentioned above, we only thought of that because we did not know DFSInputStream well.

bq. I believe there is an option to do #1 even right now. Can't HBase be configured just to
use pread and never read?
Looking at the code, I find that we are deliberately not using pread. There are comments like
{{// Seek + read. Better for scanning.}}, and pread is mainly used for small scans (HBASE-9488).
So there may be strong reasons behind not using pread.
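
To make the pread versus seek + read distinction concrete, here is a minimal sketch using {{java.nio.channels.FileChannel}} on a local temporary file. It is an analogy only, not HBase or HDFS code; the class and method names are made up for illustration. A positional read leaves the channel's own position untouched, while seek + read moves it, which is why the latter suits sequential scanning.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Local-file analogy only; names are hypothetical and this is not HDFS client code. */
public class PreadVsSeekRead {

    /** Positional read: does not change the channel's current position. */
    static String pread(FileChannel ch, long offset, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        while (buf.hasRemaining()) {
            if (ch.read(buf, offset + buf.position()) < 0) {
                break; // end of file
            }
        }
        buf.flip();
        return StandardCharsets.US_ASCII.decode(buf).toString();
    }

    /** Seek + read: moves the channel's position, so the next read continues from there. */
    static String seekRead(FileChannel ch, long offset, int len) throws IOException {
        ch.position(offset);
        ByteBuffer buf = ByteBuffer.allocate(len);
        while (buf.hasRemaining()) {
            if (ch.read(buf) < 0) {
                break; // end of file
            }
        }
        buf.flip();
        return StandardCharsets.US_ASCII.decode(buf).toString();
    }

    /** Runs both styles on a ten-byte temp file and reports data and channel position. */
    static String demo() {
        try {
            Path p = Files.createTempFile("pread-demo", ".txt");
            try {
                Files.write(p, "0123456789".getBytes(StandardCharsets.US_ASCII));
                try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
                    String a = pread(ch, 4, 3);
                    long posAfterPread = ch.position();
                    String b = seekRead(ch, 4, 3);
                    long posAfterSeekRead = ch.position();
                    return a + "/" + posAfterPread + "/" + b + "/" + posAfterSeekRead;
                }
            } finally {
                Files.deleteIfExists(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 456/0/456/7
    }
}
```

Both calls return the same bytes; only the stream state afterwards differs.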

bq. Are you running out of file descriptors?

bq. What's the user-visible problem here?
We are not able to perform any FS operation.

bq. Closing the HFile's inputStreams is not a good option because of its impact.
Agreed, but keeping too many CLOSE_WAIT connections is also not good, right? It was because
of that impact that we thought of making it configurable. Anyway, we are now following
approach #2 as the solution.

> Hbase does not closing a closed socket resulting in many CLOSE_WAIT 
> --------------------------------------------------------------------
>                 Key: HBASE-9393
>                 URL: https://issues.apache.org/jira/browse/HBASE-9393
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.2, 0.98.0
>         Environment: Centos 6.4 - 7 regionservers/datanodes, 8 TB per node, 7279 regions
>            Reporter: Avi Zrachya
> HBase does not close a dead connection with the datanode.
> This results in over 60K CLOSE_WAIT sockets, and at some point HBase cannot connect to the
> datanode because too many sockets are mapped from one host to another on the same port.
> The example below shows a low CLOSE_WAIT count because we had to restart HBase to solve
> the problem; over time it will increase to 60-100K sockets in CLOSE_WAIT.
> [root@hd2-region3 ~]# netstat -nap |grep CLOSE_WAIT |grep 21592 |wc -l
> 13156
> [root@hd2-region3 ~]# ps -ef |grep 21592
> root     17255 17219  0 12:26 pts/0    00:00:00 grep 21592
> hbase    21592     1 17 Aug29 ?        03:29:06 /usr/java/jdk1.6.0_26/bin/java -XX:OnOutOfMemoryError=kill
> -9 %p -Xmx8000m -ea -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -Dhbase.log.dir=/var/log/hbase
> -Dhbase.log.file=hbase-hbase-regionserver-hd2-region3.swnet.corp.log ...

This message was sent by Atlassian JIRA