hadoop-hdfs-issues mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1104) Fsck triggers full GC on NameNode
Date Tue, 27 Apr 2010 20:42:36 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861530#action_12861530 ]

dhruba borthakur commented on HDFS-1104:
----------------------------------------

> Turning this off by default will benefit most of our users.

I do not support this idea. It will break quite a few existing pipelines for us.


> For fsck, I'd like to propose not to update access time and provide no configuration
> to turn it off

+1. fsck should not update the access time of files.
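
As an illustration of the proposal above, here is a minimal sketch in which the caller decides whether a lookup updates the access time. It is not taken from the HDFS source: the class AccessTimeSketch, the method getBlockLocations as written here, and the updateAccessTime flag are all hypothetical names chosen for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: these names do not correspond to actual HDFS classes.
class AccessTimeSketch {
    private final Map<String, Long> accessTimes = new HashMap<>();
    private int loggedTransactions = 0;

    // Look up a file's block locations. The access time is recorded, and an
    // edit log transaction counted, only when the caller asks for it.
    String[] getBlockLocations(String path, boolean updateAccessTime, long now) {
        if (updateAccessTime) {
            accessTimes.put(path, now);
            loggedTransactions++; // stands in for writing an edit log entry
        }
        return new String[0]; // block lookup itself is omitted in this sketch
    }

    int loggedTransactions() {
        return loggedTransactions;
    }
}
```

In this sketch a regular client open would pass true and fsck would pass false, so walking a large directory with fsck would add no access-time transactions.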

> Fsck triggers full GC on NameNode
> ---------------------------------
>
>                 Key: HDFS-1104
>                 URL: https://issues.apache.org/jira/browse/HDFS-1104
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.20.3, 0.21.0, 0.22.0
>
>
> A NameNode on one of our clusters fell into a full GC while fsck was running. Digging
> into the problem showed that it is caused by how the NameNode handles a file's access time.
> Fsck calls open on every file in the checked directory to get the file's block locations.
> Each open changes the file's access time, which in turn writes a transaction entry to
> the edit log. The current code optimizes open so that it returns without syncing
> the edit log to disk. It happened that no other jobs were running on our cluster while
> fsck was performed, so no edit log sync was ever called and all open transactions were kept
> in memory. When the edit log buffer filled up, it automatically doubled its space by
> allocating a new buffer. The full GC happened when no contiguous space could be found
> for the new, bigger buffer.
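
The buffer growth described above can be modeled with a small sketch. This is a simplified illustration and not the NameNode's edit log code: the class EditLogBufferSketch, the 1024-byte initial capacity, and the 64-byte record size are assumptions made for the example. It only shows how a buffer that doubles when full keeps requesting larger contiguous arrays if nothing ever drains it.

```java
import java.util.Arrays;

// Simplified model only: not the NameNode's actual edit log classes.
class EditLogBufferSketch {
    private byte[] buf;
    private int count = 0;
    private int reallocations = 0;

    EditLogBufferSketch(int initialCapacity) {
        buf = new byte[initialCapacity];
    }

    // Append one transaction record; double the buffer whenever it is full.
    void logTransaction(byte[] record) {
        while (count + record.length > buf.length) {
            // Arrays.copyOf allocates a new contiguous array of twice the size.
            buf = Arrays.copyOf(buf, buf.length * 2);
            reallocations++;
        }
        System.arraycopy(record, 0, buf, count, record.length);
        count += record.length;
    }

    // Stand-in for an edit log sync: buffered bytes would go to disk here.
    void sync() {
        count = 0;
    }

    int capacity() {
        return buf.length;
    }

    int reallocations() {
        return reallocations;
    }

    public static void main(String[] args) {
        byte[] record = new byte[64]; // assumed size of one access-time transaction
        EditLogBufferSketch neverSynced = new EditLogBufferSketch(1024);
        EditLogBufferSketch syncedOften = new EditLogBufferSketch(1024);
        for (int i = 0; i < 100_000; i++) {
            neverSynced.logTransaction(record);
            syncedOften.logTransaction(record);
            if (i % 10 == 9) {
                syncedOften.sync(); // other activity triggering a sync
            }
        }
        System.out.println("never synced: capacity=" + neverSynced.capacity());
        System.out.println("synced often: capacity=" + syncedOften.capacity());
    }
}
```

With these assumed sizes, the buffer that is never synced has to reallocate repeatedly and ends up orders of magnitude larger, while the one that is synced every ten records never grows past its initial capacity.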

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

