hadoop-common-dev mailing list archives

From "Hairong Kuang (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-3810) NameNode seems unstable on a cluster with little space left
Date Mon, 30 Mar 2009 21:39:50 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Hairong Kuang updated HADOOP-3810:

    Attachment: globalLock.patch

This patch aims to improve NameNode (NN) responsiveness when a cluster is nearly full. One
identified cause of the problem is that ReplicationMonitor holds the global lock for the whole
time it replicates blocks in one iteration. The patch therefore makes the following changes:
1. ReplicationMonitor no longer holds the FSNamesystem global lock for an entire iteration of
block replication. Previously the logic was:
synchronized computeReplicationWork(int blocksToProcess) {
  for (int i = 0; i < blocksToProcess; i++) {
    select one block from the under-replicated queue;
    compute replication work for the block;
  }
}
Now it is:
computeReplicationWork(int blocksToProcess) {
  (synchronized) select blocksToProcess under-replicated blocks;
  for each selected block {
    (synchronized) compute replication work for the block;
  }
}
2. While computing replication work for a block, the global lock is released while choosing
targets, because this is the most computation-intensive part.
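
The locking change described above can be sketched in Java as follows. This is only an illustration under stated assumptions, not the code in globalLock.patch: the class ReplicationSketch, the globalLock object, the string block IDs, and the placeholder chooseTarget() are all hypothetical stand-ins for the real FSNamesystem structures. It shows selection under one short critical section, then per-block work that drops the lock around target selection.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for the NameNode state; not the real FSNamesystem.
final class ReplicationSketch {
    private final Object globalLock = new Object();            // plays the role of the global lock
    private final Queue<String> underReplicated = new ArrayDeque<>();
    private final List<String> scheduled = new ArrayList<>();

    void addUnderReplicated(String blockId) {
        synchronized (globalLock) {
            underReplicated.add(blockId);
        }
    }

    // Returns the number of blocks for which replication work was scheduled.
    int computeReplicationWork(int blocksToProcess) {
        // Step 1: pick the blocks in one short critical section.
        List<String> selected = new ArrayList<>();
        synchronized (globalLock) {
            for (int i = 0; i < blocksToProcess && !underReplicated.isEmpty(); i++) {
                selected.add(underReplicated.poll());
            }
        }
        // Step 2: process each block; the lock is taken per block, not per iteration.
        int scheduledCount = 0;
        for (String block : selected) {
            if (computeReplicationWorkForBlock(block)) {
                scheduledCount++;
            }
        }
        return scheduledCount;
    }

    private boolean computeReplicationWorkForBlock(String block) {
        // The expensive part runs without the lock, so other callers are not blocked.
        String target = chooseTarget(block);
        synchronized (globalLock) {
            // Assumption: a real implementation would re-check the block's state here,
            // since it may have changed while the lock was released.
            scheduled.add(block + "->" + target);
            return true;
        }
    }

    // Placeholder for the CPU-heavy target search; the choice here is arbitrary.
    private String chooseTarget(String block) {
        return "datanode-" + Math.floorMod(block.hashCode(), 3);
    }

    int pendingCount() {
        synchronized (globalLock) {
            return underReplicated.size();
        }
    }

    List<String> scheduledWork() {
        synchronized (globalLock) {
            return new ArrayList<>(scheduled);
        }
    }
}
```

The trade-off in this sketch is that state read before the lock is released may be stale by the time it is reacquired, which is why the second critical section is the natural place for a re-check.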

> NameNode seems unstable on a cluster with little space left
> -----------------------------------------------------------
>                 Key: HADOOP-3810
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3810
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.17.1
>            Reporter: Raghu Angadi
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>         Attachments: globalLock.patch, simon-namenode.PNG
> NameNode does not seem very responsive and is unstable when the cluster has very little space
left. The clients time out. The main problem is that it is not clear to the user what is going
on. Once I have more details about a NameNode that was in this state, I will fill them in here.
> If there is not enough space left on a cluster, it is OK for clients to receive something
like a "DiskOutOfSpace" exception.
> Right now it looks like NameNode tries too hard to find a node with any space left and ends
up being slow to respond to clients. If the CPU taken by chooseTarget() is the main cause,
there are two possible fixes:
> # chooseTarget() iterates and takes quite a bit of CPU for allocating datanodes. Usually
this is not much of a problem. It takes even more CPU when it needs to search multiple racks
for a datanode. We could probably reduce some CPU for these searches. The benefit should be
> # Once NameNode cannot find any datanode that has space on a rack, it could mark the
rack as "full" and skip searching the rack for the next minute or so. This flag gets cleared
after a minute or if any new node is added to the rack.
> #* Of course, this might not be optimal w.r.t. disk space usage, but only for a short
duration. Once a cluster is mostly full, the user does expect errors.
> #* On the flip side, this fix does not require an extremely CPU-optimized version of chooseTarget().

> #* I think it is reasonable for NameNode to throw a DiskOutOfSpace exception, even though
it could have found space if it had searched much more extensively.
> ---
> edit : minor changes
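
The second proposed fix in the issue description (marking a rack as "full" for about a minute) could look roughly like the Java sketch below. It is a minimal illustration only: FullRackCache, its method names, and the rack path strings are hypothetical and do not come from Hadoop. The caller passes in the current time so that the expiry logic is easy to follow and to exercise.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of Hadoop. Remembers racks recently found to be full.
final class FullRackCache {
    private final long ttlMillis;                               // how long a "full" mark stays valid
    private final Map<String, Long> fullUntil = new HashMap<>(); // rack -> expiry time in millis

    FullRackCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Called when a search found no datanode with space on the rack.
    synchronized void markFull(String rack, long nowMillis) {
        fullUntil.put(rack, nowMillis + ttlMillis);
    }

    // Called when a new node joins the rack: the mark is cleared immediately.
    synchronized void nodeAdded(String rack) {
        fullUntil.remove(rack);
    }

    // True if the rack was marked full and the mark has not yet expired.
    synchronized boolean shouldSkip(String rack, long nowMillis) {
        Long expiry = fullUntil.get(rack);
        if (expiry == null) {
            return false;
        }
        if (nowMillis >= expiry) {
            fullUntil.remove(rack);  // expired: search the rack again
            return false;
        }
        return true;
    }
}
```

A target search would call shouldSkip(rack, System.currentTimeMillis()) before scanning a rack and markFull(...) after a scan that finds no space. As the issue notes, this can leave some free space unused for a short period in exchange for fewer expensive searches.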

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
