hadoop-common-dev mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3779) limit concurrent connections(data serving thread) in one datanode
Date Fri, 18 Jul 2008 21:47:33 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614888#action_12614888
] 

Raghu Angadi commented on HADOOP-3779:
--------------------------------------

Yes, in fact you may not like the 256 limitation at all.

In any case, if you just want to close any client connection that has been idle (for, say, 1 sec),
that needs to be handled at the DataNode level and not in SelectorPool. SelectorPool is an
implementation detail of a utility for doing blocking IO with NIO sockets. From your brief description,
your suggested fix does not seem very useful and is at the wrong level (kind of
like writing a kernel module to close an idle socket :) ). Maybe a detailed description,
or better, a simple prototype implementation, will make it clearer.
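
To make "handled at the DataNode level" concrete, here is a minimal sketch using plain blocking sockets. It is only an illustration: the class IdleConnectionCloser and the method awaitRequestOrClose are hypothetical names, not DataNode code, and the real DataNode uses NIO channels where Socket#setSoTimeout may not behave the same way.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Hypothetical helper: the connection handler itself drops clients that stay idle. */
public class IdleConnectionCloser {

    /**
     * Waits for the first request byte from the client.
     * Returns true if a byte arrived in time. Returns false after closing the
     * socket because the client was idle past the timeout or had already hung up.
     * Note: the byte is consumed here; a real handler would keep it.
     */
    public static boolean awaitRequestOrClose(Socket client, int idleTimeoutMillis)
            throws IOException {
        // Make blocking reads on this socket fail after the idle timeout.
        client.setSoTimeout(idleTimeoutMillis);
        try {
            InputStream in = client.getInputStream();
            if (in.read() >= 0) {
                return true; // the client is active
            }
        } catch (SocketTimeoutException e) {
            // No data within the timeout: treat the connection as idle.
        }
        client.close();
        return false;
    }
}
```

The point of the sketch is only that the decision to close lives in the code that owns the connection, not in the selector utility underneath it.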

Note that we need to rewrite the data transfer code paths in DataNode to do real async transfers
(network transfers are easy, but the DataNode also needs to do disk I/O). I would say that sooner or later DataNode
needs to do that; it cannot continue to live with one thread per connection.

I am thinking of proposing a design for "async data transfers" if there is enough interest.
The basic idea is to share a pool of threads (we need a pool to do disk I/O) to handle all the
client transfers, something like 5 or so threads per disk. This requires a substantial rewrite of the
readBlock() and writeBlock() code paths in DataNode.
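
A rough sketch of the shared-pool idea, not the proposed design itself: each disk gets a small fixed pool of I/O threads, and all client transfers queue their disk work on it. DiskIoPools and its methods are hypothetical names used only for illustration, and the async network side is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: a small fixed pool of disk I/O threads per disk. */
public class DiskIoPools {
    private final List<ExecutorService> pools = new ArrayList<>();
    private final int threadsPerDisk;

    public DiskIoPools(int diskCount, int threadsPerDisk) {
        this.threadsPerDisk = threadsPerDisk;
        for (int i = 0; i < diskCount; i++) {
            pools.add(Executors.newFixedThreadPool(threadsPerDisk));
        }
    }

    /** Queues a disk task on the pool of the given disk; no thread is dedicated to a client. */
    public <T> Future<T> submit(int diskIndex, Callable<T> diskTask) {
        return pools.get(diskIndex).submit(diskTask);
    }

    /** Upper bound on I/O threads, independent of the number of client connections. */
    public int maxThreads() {
        return pools.size() * threadsPerDisk;
    }

    /** Stops accepting new tasks; already queued tasks still run. */
    public void shutdown() {
        for (ExecutorService pool : pools) {
            pool.shutdown();
        }
    }
}
```

With 2 disks and 5 threads per disk, the thread count is capped at 10 no matter how many clients connect, compared with 2000+ threads under one thread per connection.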

> limit concurrent connections(data serving thread) in one datanode
> -----------------------------------------------------------------
>
>                 Key: HADOOP-3779
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3779
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.17.1
>            Reporter: LN
>            Priority: Minor
>
> I'm here after HADOOP-2341 and HADOOP-2346. In my HBase environment, many open MapFiles cause
> a DataNode OOME (stack memory), because there are 2000+ data serving threads in the DataNode process.
> Although HADOOP-2346 implements timeouts, there are situations where many connections are
> created before the read timeout (default 6 min) is reached. HBase, for example, opens all files on
> regionserver startup.
> Limiting concurrent connections (data serving threads) will make the DataNode more stable, and
> I think it could be done in SocketIOWithTimeout$SelectorPool#select:
> 1. In SelectorPool#select, record all waiting SelectorInfo instances in a List at the
> beginning, and remove each one after 'Selector#select' is done.
> 2. Before the real 'select', do a limit check; if the limit is reached, close the first SelectorInfo.
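
For reference, the two steps in the issue description could be sketched roughly as below. This is only an illustration of the bookkeeping, not code from SocketIOWithTimeout: WaiterLimiter is a hypothetical class, and a generic Closeable stands in for SelectorInfo.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: track waiters and close the oldest one once a limit is reached. */
public class WaiterLimiter {
    private final int limit;
    private final Deque<Closeable> waiting = new ArrayDeque<>();

    public WaiterLimiter(int limit) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be at least 1");
        }
        this.limit = limit;
    }

    /** Called before the real select: enforce the limit, then record the new waiter. */
    public synchronized void beforeSelect(Closeable waiter) throws IOException {
        if (waiting.size() >= limit) {
            // Limit reached: close the waiter that has been waiting longest.
            Closeable oldest = waiting.pollFirst();
            if (oldest != null) {
                oldest.close();
            }
        }
        waiting.addLast(waiter);
    }

    /** Called after the select returns: the waiter is no longer blocked. */
    public synchronized void afterSelect(Closeable waiter) {
        waiting.remove(waiter);
    }

    public synchronized int waitingCount() {
        return waiting.size();
    }
}
```

Note that this closes the oldest waiter regardless of which client it belongs to, which is part of why the comment above argues the policy belongs at the DataNode level.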


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

