hbase-dev mailing list archives

From "Luo Ning (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-24) Scaling: Too many open file handles to datanodes
Date Sun, 11 Jan 2009 06:10:01 GMT

    [ https://issues.apache.org/jira/browse/HBASE-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662728#action_12662728 ]

Luo Ning commented on HBASE-24:
-------------------------------

Sharing my experience here; I hope it is helpful:
1. Since HBase never closes MapFiles in normal usage, dfs.datanode.socket.write.timeout should always be set.
2. The xceiver count limit should be set too, or the datanode will hit a stack memory overflow or some other resource limit (see the configuration sketch after this list).
3. As in my comments above, the only real solution is limiting how many MapFiles HBase keeps open concurrently; otherwise it will cause an OOME as the data size increases.
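
Here is a minimal sketch of the two settings above, expressed through Hadoop's Configuration API (in practice they belong in hadoop-site.xml); the values shown are illustrative assumptions, not tuned recommendations:

    // Illustrative sketch only: the two datanode settings discussed above,
    // set programmatically. Values are assumptions, not recommendations.
    import org.apache.hadoop.conf.Configuration;

    public class DatanodeTuning {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // 0 disables the datanode socket write timeout; a positive
            // value is a timeout in milliseconds.
            conf.setInt("dfs.datanode.socket.write.timeout", 0);
            // Raise the cap on concurrent xceiver threads per datanode
            // (the misspelled key name is historical).
            conf.setInt("dfs.datanode.max.xcievers", 2047);
            System.out.println("xcievers = " + conf.get("dfs.datanode.max.xcievers"));
        }
    }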

Posting my patch here:
1. This patch was made for HBase 0.18.0 and has worked well for the last 3 months, on about 500G of data across 4 machines now;
2. The patch makes HStoreFile..HbaseReader extend MonitoredReader instead of the original MapFile.Reader, so we can control things inside it;
3. See the javadoc of MonitoredReader for more detail on how concurrent opens are controlled; a rough sketch of the idea follows.
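
Since the patch itself is attached to the issue rather than inlined here, the following is only a rough sketch of the concurrency-limiting idea behind MonitoredReader, assuming a semaphore-based gate; the class name comes from the patch, everything else is an assumption:

    // Rough sketch of the MonitoredReader idea: gate how many readers may
    // be open at once behind a semaphore. Not the actual patch; the cap
    // and structure are assumptions for illustration.
    import java.util.concurrent.Semaphore;

    public class MonitoredReader {
        // Illustrative cap on concurrently open MapFile readers.
        private static final Semaphore OPEN_SLOTS = new Semaphore(256);

        private boolean open;

        public void open() throws InterruptedException {
            OPEN_SLOTS.acquire(); // block until another reader closes
            open = true;
            // ... open the underlying MapFile.Reader here ...
        }

        public void close() {
            if (open) {
                open = false;
                // ... close the underlying MapFile.Reader here ...
                OPEN_SLOTS.release(); // hand the slot to a waiting opener
            }
        }
    }

Blocking in open() trades a little latency for a hard bound on open file handles, which matches the failure mode described in the issue below.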

> Scaling: Too many open file handles to datanodes
> ------------------------------------------------
>
>                 Key: HBASE-24
>                 URL: https://issues.apache.org/jira/browse/HBASE-24
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: stack
>            Priority: Blocker
>             Fix For: 0.20.0
>
>
> We've been here before (HADOOP-2341).
> Today the Rapleaf folks gave me an lsof listing from a regionserver. It had thousands
> of open sockets to datanodes, all in ESTABLISHED and CLOSE_WAIT state. On average they
> seem to have about ten file descriptors/sockets open per region (they have 3 column
> families, IIRC; per family there can be between 1 and 5 or so mapfiles open -- 3 is the
> max, but while compacting we open a new one, etc.).
> They have thousands of regions. 400 regions -- ~100G, which is not that much -- takes
> about 4k open file handles.
> If they want a regionserver to serve a decent disk's worth -- 300-400G -- then that's
> maybe 1600 regions... 16k file handles. If there are more than just 3 column families...
> then we are in danger of blowing out limits if they are 32k.
> We've been here before with HADOOP-2341.
> A dfsclient that used non-blocking i/o would help applications like hbase (the datanode
> doesn't have this problem as badly -- the CLOSE_WAIT sockets on the regionserver side,
> the bulk of the open fds in the Rapleaf log, don't have a corresponding open resource
> on the datanode end).
> Could also just open mapfiles as needed, but that'd kill our random read performance,
> and it's bad enough already.
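
To make the numbers in the quoted description concrete, here is a quick back-of-the-envelope calculation; the ~10 fds/region figure is taken from the issue text, while the 32k ceiling is the per-process ulimit being assumed:

    // Back-of-the-envelope file-handle math from the issue description:
    // ~10 fds per region, so handles scale linearly with region count.
    public class FdEstimate {
        public static void main(String[] args) {
            int fdsPerRegion = 10; // ~3 families x 1-5 mapfiles each, plus sockets
            int limit = 32 * 1024; // assumed per-process fd ulimit
            for (int regions : new int[] {400, 1600}) {
                int handles = regions * fdsPerRegion;
                System.out.printf("%d regions -> ~%dk handles (%.0f%% of a 32k limit)%n",
                        regions, handles / 1000, 100.0 * handles / limit);
            }
        }
    }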

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

