hbase-issues mailing list archives

From "steven xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11833) Hbase does not closing a closed socket resulting in thousand of CLOSE_WAIT sockets
Date Thu, 28 Aug 2014 09:03:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113563#comment-14113563 ]

steven xu commented on HBASE-11833:
-----------------------------------

Guys, before creating this issue, I read [HBASE-9393] and [HDFS-5671]. I found that the
patch code from these two issues is already included in the Hadoop 2.4.0 tag, in
BlockReaderFactory.getRemoteBlockReaderFromTcp(). So the [HBASE-9393] patch does not solve
my problem. Another bug may be causing it, so I created a new issue. Please take a look.
{code:title=BlockReaderFactory.java|borderStyle=solid}
// Hadoop 2.4.0: BlockReaderFactory.getRemoteBlockReaderFromTcp()
  private BlockReader getRemoteBlockReaderFromTcp() throws IOException {
    if (LOG.isTraceEnabled()) {
      LOG.trace(this + ": trying to create a remote block reader from a " +
          "TCP socket");
    }
    BlockReader blockReader = null;
    while (true) {
      BlockReaderPeer curPeer = null;
      Peer peer = null;
      try {
        curPeer = nextTcpPeer();
        if (curPeer == null) break;
        if (curPeer.fromCache) remainingCacheTries--;
        peer = curPeer.peer;
        blockReader = getRemoteBlockReader(peer);
        return blockReader;
      } catch (IOException ioe) {
        if (isSecurityException(ioe)) {
          if (LOG.isTraceEnabled()) {
            LOG.trace(this + ": got security exception while constructing " +
                "a remote block reader from " + peer, ioe);
          }
          throw ioe;
        }
        if ((curPeer != null) && curPeer.fromCache) {
          // Handle an I/O error we got when using a cached peer.  These are
          // considered less serious, because the underlying socket may be
          // stale.
          if (LOG.isDebugEnabled()) {
            LOG.debug("Closed potentially stale remote peer " + peer, ioe);
          }
        } else {
          // Handle an I/O error we got when using a newly created peer.
          LOG.warn("I/O error constructing remote block reader.", ioe);
          throw ioe;
        }
      } finally {
        if (blockReader == null) {
          IOUtils.cleanup(LOG, peer);
        }
      }
    }
    return null;
  }
{code}
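
For readers less familiar with the socket state in the issue title, here is a minimal, standalone sketch of how CLOSE_WAIT arises. It is not HBase or HDFS code, and the class name CloseWaitSketch is hypothetical. It assumes only java.net on the loopback interface: the remote end closes first, the local side sees end-of-stream, and the local socket stays open (CLOSE_WAIT at the OS level) until close() is called on it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class, not part of HBase or Hadoop.
class CloseWaitSketch {

    /** What the client side observed after the remote end hung up. */
    static final class Probe {
        final int readResult;           // -1 means the peer's FIN was seen as EOF
        final boolean openAfterEof;     // true: EOF alone did not close our socket
        final boolean closedAfterClose; // true: our explicit close() took effect

        Probe(int readResult, boolean openAfterEof, boolean closedAfterClose) {
            this.readResult = readResult;
            this.openAfterEof = openAfterEof;
            this.closedAfterClose = closedAfterClose;
        }
    }

    static Probe probe() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
            int readResult;
            boolean openAfterEof;
            try {
                client.setSoTimeout(5000);   // fail with a timeout instead of hanging
                server.accept().close();     // remote end closes first (sends FIN)
                readResult = client.getInputStream().read();
                // From here until close() below, the OS keeps this socket in CLOSE_WAIT.
                openAfterEof = !client.isClosed();
            } finally {
                client.close();              // the step a leaking code path never reaches
            }
            return new Probe(readResult, openAfterEof, client.isClosed());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Probe p = probe();
        System.out.println("read=" + p.readResult
            + " openAfterEof=" + p.openAfterEof
            + " closedAfterClose=" + p.closedAfterClose);
    }
}
```

The finally block in getRemoteBlockReaderFromTcp() above plays the same role as the client.close() here for the error path. The sketch does not show where the leak in this issue actually comes from.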

> Hbase does not closing a closed socket resulting in thousand of CLOSE_WAIT sockets
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-11833
>                 URL: https://issues.apache.org/jira/browse/HBASE-11833
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.98.0
>         Environment: RHEL 6.3 -HDP 2.1 -6 RegionServers/Datanode -18T per node -3108Regions
>            Reporter: steven xu
>
> HBase does not close dead connections to the datanode.
> This results in more than 30K CLOSE_WAIT sockets, and at some point HBase cannot connect to the
> datanode because too many sockets are mapped from one host to another on the same port (50010).
> After I restart all RSs, the CLOSE_WAIT count still keeps increasing.
> $ netstat -an|grep CLOSE_WAIT|wc -l
> 2545
> # netstat -nap|grep CLOSE_WAIT|grep 6569|wc -l
> 2545
> # ps -ef|grep 6569
> hbase     6569  6556 21 Aug25 ?        09:52:33 /opt/jdk1.6.0_25/bin/java -Dproc_regionserver
-XX:OnOutOfMemoryError=kill -9 %p -Xmx1000m -XX:+UseConcMarkSweepGC 
> I have also reviewed these issues:
> [HBASE-9393]
> [HDFS-5671|https://issues.apache.org/jira/browse/HDFS-5671]
> [HDFS-1836|https://issues.apache.org/jira/browse/HDFS-1836]
> I found that the source code of the HBase 0.98/Hadoop 2.4.0 I use does not differ from these
> patches.
> But I do not understand why HBase 0.98/Hadoop 2.4.0 still has this issue. Please check.
> Thanks a lot.
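
The netstat pipelines quoted above count CLOSE_WAIT sockets for a single PID. As a rough sketch, the same count can be taken per process in one pass over saved `netstat -nap` output. The class name CloseWaitCounter and the sample addresses below are made up for illustration, and the parser assumes the usual Linux layout in which the state column is followed by a `PID/Program` column.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of HBase or Hadoop.
class CloseWaitCounter {

    /** Counts CLOSE_WAIT lines per owning PID ("-" when netstat shows no owner). */
    static Map<String, Integer> countByPid(List<String> netstatLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : netstatLines) {
            String[] fields = line.trim().split("\\s+");
            for (int i = 0; i < fields.length; i++) {
                if (!fields[i].equals("CLOSE_WAIT")) {
                    continue;
                }
                // Assumed layout: the next column looks like "6569/java".
                String owner = (i + 1 < fields.length) ? fields[i + 1] : "-";
                int slash = owner.indexOf('/');
                String pid = (slash > 0) ? owner.substring(0, slash) : owner;
                counts.merge(pid, 1, Integer::sum);
                break;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample lines; only port 50010 and PID 6569 come from the report.
        List<String> sample = Arrays.asList(
            "tcp        1      0 10.0.0.1:41234    10.0.0.2:50010    CLOSE_WAIT  6569/java",
            "tcp        1      0 10.0.0.1:41235    10.0.0.2:50010    CLOSE_WAIT  6569/java",
            "tcp        0      0 10.0.0.1:41236    10.0.0.3:50010    ESTABLISHED 6569/java",
            "tcp        1      0 10.0.0.1:41237    10.0.0.2:50010    CLOSE_WAIT  7001/java");
        System.out.println(countByPid(sample));
    }
}
```

Grouping by PID makes it easier to see whether the region server process alone accounts for the growth or whether other JVMs on the host contribute as well.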



--
This message was sent by Atlassian JIRA
(v6.2#6252)
