hadoop-hdfs-issues mailing list archives

From "Hairong Kuang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1112) Edit log buffer should not grow unboundedly
Date Fri, 30 Apr 2010 22:29:55 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862909#action_12862909 ]

Hairong Kuang commented on HDFS-1112:

Here is the proposal:
# Add a method getUnflushedDataLen() to EditLogOutputStream that returns the length of buffered
edit logs that need to be flushed.
# After each edit log entry is written (FSEditLog#logEdit), check the length of the unflushed
edit logs. If it is bigger than or equal to the initial buffer size, which is 512K for now,
all edit log streams are automatically flushed and synced to disks.
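
A minimal sketch of the two steps above, in Java. The classes SketchEditLogStream and SketchEditLog are hypothetical stand-ins for EditLogOutputStream and FSEditLog, not the real HDFS classes; only the names getUnflushedDataLen() and logEdit and the 512K threshold come from the proposal. The "flush and sync" here just discards the in-memory bytes and counts the call, so it shows the trigger logic only, not real disk I/O.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for EditLogOutputStream (assumption, not HDFS code). */
class SketchEditLogStream {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private int flushCount = 0;

    /** Appends one serialized edit log entry to the in-memory buffer. */
    void write(byte[] entry) {
        buffer.write(entry, 0, entry.length);
    }

    /** Step 1 of the proposal: length of buffered edits not yet flushed. */
    long getUnflushedDataLen() {
        return buffer.size();
    }

    /** Stands in for flush + sync to disk; here it only empties the buffer. */
    void flushAndSync() {
        buffer.reset();
        flushCount++;
    }

    int getFlushCount() {
        return flushCount;
    }
}

/** Hypothetical stand-in for FSEditLog (assumption, not HDFS code). */
class SketchEditLog {
    /** Initial buffer size named in the proposal: 512K. */
    static final int INITIAL_BUFFER_SIZE = 512 * 1024;

    private final List<SketchEditLogStream> streams = new ArrayList<>();

    void addStream(SketchEditLogStream stream) {
        streams.add(stream);
    }

    /** Step 2 of the proposal: write the entry, then check the unflushed length. */
    void logEdit(byte[] entry) {
        boolean thresholdReached = false;
        for (SketchEditLogStream stream : streams) {
            stream.write(entry);
            if (stream.getUnflushedDataLen() >= INITIAL_BUFFER_SIZE) {
                thresholdReached = true;
            }
        }
        if (thresholdReached) {
            // Flush and sync all streams, not only the one that crossed the threshold.
            for (SketchEditLogStream stream : streams) {
                stream.flushAndSync();
            }
        }
    }
}
```

Because the check runs after the entry is written, the buffer can briefly hold more than the initial size, which is the growth discussed in the next paragraph.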

This proposal does allow the edit log buffer to grow beyond the initial buffer size, but the max
buffer size is effectively bounded by the max length of an edit log entry. In most cases, I believe
the buffer will grow to at most 1M bytes. The advantage of not shrinking the edit log buffer
is that it avoids frequent buffer allocations and deallocations, and therefore frequent
GCs, if a large number of open operations hit the NameNode in a short time.

> Edit log buffer should not grow unboundedly
> -------------------------------------------
>                 Key: HDFS-1112
>                 URL: https://issues.apache.org/jira/browse/HDFS-1112
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.22.0
>            Reporter: Hairong Kuang
>             Fix For: 0.22.0
> Currently HDFS does not impose an upper limit on the edit log buffer. If a large number of
> open operations come in with access time update on, the buffer can grow to a large size,
> since open does not call sync automatically. This can cause a memory leak and full GC in
> extreme cases, as described in HDFS-1104.
> The edit log buffer should be automatically flushed when the buffer becomes full.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
