hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1325) DFSClient(DFSInputStream) release the persistent connection with datanode when no data have been read for a long time
Date Mon, 02 Aug 2010 03:37:16 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894456#action_12894456 ]

Todd Lipcon commented on HDFS-1325:

I don't quite follow. For one, the patch doesn't seem to make much sense - it sends an OP_STATUS_TIMEOUT
to the server even when the server may not have completed sending the data, but doesn't actually
close the socket on the client side. So if the server is still trying to send data, the timeout
doesn't achieve anything, right?

I also don't quite see what you mean about causing system CPU on region server to go up. The
resource consumption is only of extra open file handles - it shouldn't affect CPU usage at
all to have idle sockets open. It's true that you need to bump up the xceiver count and ulimit
on DNs for HBase, but once you've done that it doesn't cause big issues in practice.

There are a number of other JIRAs already open to work on the general issue of socket efficiency
- e.g. HDFS-918, HDFS-941, etc.

> DFSClient(DFSInputStream) release the persistent connection with datanode when no data have been read for a long time
> ---------------------------------------------------------------------------------------------------------------------
>                 Key: HDFS-1325
>                 URL: https://issues.apache.org/jira/browse/HDFS-1325
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs client
>            Reporter: jinglong.liujl
>             Fix For: 0.20.3
>         Attachments: dfsclient.patch
> When running HBase over Hadoop, we found that scanning a large table (one with many regions,
> each holding many store files) leaves too many connections open between the regionserver
> (acting as DFSClient) and the datanode. Even after a store file has been completely scanned,
> its connection is not closed.
> In our cluster, these extra connections waste system resources, which causes system CPU on
> the region server to reach a high level, then bring this region server
> After investigating, we found that the number of active connections is very small and most
> connections are idle. We added a timeout checker thread to DFSClient to close these idle connections.
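
For illustration only, the timeout checker idea in the description can be sketched roughly as below.
This is not the attached dfsclient.patch and not DFSClient code: the class IdleConnectionReaper and
its methods touch, forget, closeIdle and start are hypothetical names, and the injectable clock is an
assumption made so the sweep can be exercised without sleeping. The sketch actually closes the
client-side resource once it has been idle too long, which is the step the comment above says the
patch leaves out.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical helper (not part of HDFS): closes connections idle longer than a limit. */
final class IdleConnectionReaper {
    // Connection -> time of last activity, in milliseconds from the supplied clock.
    private final Map<Closeable, Long> lastUsedMillis = new ConcurrentHashMap<>();
    private final long idleLimitMillis;
    private final LongSupplier clock;

    IdleConnectionReaper(long idleLimitMillis, LongSupplier clock) {
        this.idleLimitMillis = idleLimitMillis;
        this.clock = clock;
    }

    /** Record activity; call when a connection is opened and after each read. */
    void touch(Closeable conn) {
        lastUsedMillis.put(conn, clock.getAsLong());
    }

    /** Stop tracking a connection that the caller has closed itself. */
    void forget(Closeable conn) {
        lastUsedMillis.remove(conn);
    }

    /**
     * One sweep: close and drop every connection idle for longer than the limit.
     * Returns the number of connections closed. A real client would also need to
     * coordinate with readers so that a connection is not closed mid-read.
     */
    int closeIdle() {
        long now = clock.getAsLong();
        int closed = 0;
        Iterator<Map.Entry<Closeable, Long>> it = lastUsedMillis.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Closeable, Long> entry = it.next();
            if (now - entry.getValue() > idleLimitMillis) {
                it.remove();
                try {
                    entry.getKey().close(); // frees the client-side socket or stream
                } catch (IOException ignored) {
                    // Best effort: the connection is being discarded anyway.
                }
                closed++;
            }
        }
        return closed;
    }

    /** Run closeIdle() periodically on a daemon thread; the caller shuts the executor down. */
    ScheduledExecutorService start(long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "idle-connection-reaper");
            t.setDaemon(true);
            return t;
        });
        scheduler.scheduleAtFixedRate(this::closeIdle, periodMillis, periodMillis,
                TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

A caller would construct it with something like new IdleConnectionReaper(60_000L, System::currentTimeMillis),
call touch(socket) after each read, and call start(10_000L) once. Whether closing idle sockets this way
is preferable to the connection reuse work tracked in the other JIRAs is a separate question.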

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
