hbase-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1062) Compactions at (re)start on a large table can overwhelm DFS
Date Mon, 15 Dec 2008 03:57:44 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656540#action_12656540 ]

Andrew Purtell commented on HBASE-1062:

Our 25-node cluster layout is:

1: namenode, datanode
2: datanode, hmaster, jobtracker
3-25: datanode, regionserver, tasktracker

We run datanodes everywhere because each node has 2.5TB of storage that we'd clearly like
to include in the DFS volume. 

Tasktrackers do not run on the semi-dedicated namenode node or the semi-dedicated hmaster
node. An HRS (regionserver) runs alongside every TT (tasktracker). Each TT is configured to
allow only four concurrent tasks -- 2 mappers and/or 2 reducers. Some of our tasks can be
heavy, running with 1G heap, etc. The document parser in particular heavily loads CPU, RAM,
and DFS while the mappers crunch away.
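
For a rough sense of scale, the layout above can be tallied in a few lines. This is only a sketch built from the figures in this comment; the function and variable names are hypothetical, the storage total is raw capacity before any DFS replication, and the slot split assumes the "four concurrent tasks" means 2 map slots plus 2 reduce slots per TT.

```python
# Sketch of the cluster layout described above (names are illustrative only).

def roles(node: int) -> set:
    """Return the daemons running on a given node (nodes numbered 1..25)."""
    if node == 1:
        return {"namenode", "datanode"}
    if node == 2:
        return {"datanode", "hmaster", "jobtracker"}
    return {"datanode", "regionserver", "tasktracker"}

NODES = range(1, 26)           # 25 nodes
STORAGE_TB_PER_NODE = 2.5      # from the comment
MAP_SLOTS_PER_TT = 2           # assumed split of the four concurrent tasks
REDUCE_SLOTS_PER_TT = 2

def count(role: str) -> int:
    """Count how many nodes run the given daemon."""
    return sum(1 for n in NODES if role in roles(n))

datanodes = count("datanode")
regionservers = count("regionserver")
tasktrackers = count("tasktracker")

raw_storage_tb = datanodes * STORAGE_TB_PER_NODE   # raw, before replication
map_slots = tasktrackers * MAP_SLOTS_PER_TT
reduce_slots = tasktrackers * REDUCE_SLOTS_PER_TT

print(f"datanodes={datanodes} regionservers={regionservers} "
      f"tasktrackers={tasktrackers}")
print(f"raw storage: {raw_storage_tb} TB")
print(f"task slots: {map_slots} map + {reduce_slots} reduce")
```

Every regionserver shares its node with a datanode and up to four running tasks, so compaction I/O competes with MapReduce I/O on the same disks.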

Our average load right now is also around 50.

> Compactions at (re)start on a large table can overwhelm DFS
> -----------------------------------------------------------
>                 Key: HBASE-1062
>                 URL: https://issues.apache.org/jira/browse/HBASE-1062
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Critical
>             Fix For: 0.20.0
> Given a large table, > 1000 regions for example, if a cluster restart is necessary,
the compactions undertaken by the regionservers when the master makes initial region assignments
can overwhelm DFS, leading to file errors and data loss. This condition is exacerbated if
write load was heavy before restart and so many regions want to split as soon as they are

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
