hadoop-yarn-issues mailing list archives

From "Kihwal Lee (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-202) Log Aggregation generates a storm of fsync() for namenode
Date Tue, 06 Nov 2012 00:44:12 GMT

     [ https://issues.apache.org/jira/browse/YARN-202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kihwal Lee updated YARN-202:

    Attachment: yarn-202.patch

The patch removes the hflush() call. I think this is okay, but I would appreciate other
people's thoughts on this.
> Log Aggregation generates a storm of fsync() for namenode
> ---------------------------------------------------------
>                 Key: YARN-202
>                 URL: https://issues.apache.org/jira/browse/YARN-202
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.0.2-alpha, 0.23.4
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>         Attachments: yarn-202.patch
> When log aggregation is on, each write to an aggregated container log causes hflush()
> to be called. For large clusters, this can create a lot of fsync() calls for the namenode.
> We have seen a 6-7x increase in the average number of fsync operations compared to 1.0.x
> on a large busy cluster. Over 99% of fsync ops were for log aggregation writing to tmp files.
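
To illustrate why flushing on every write scales so badly, here is a minimal
sketch using only java.io. It is not the YARN log aggregation code and not the
attached patch: FlushCountDemo, CountingStream, and countFlushes are
hypothetical names, and a plain flush() on an in-memory stream stands in for
hflush() on an HDFS output stream. It only counts how many flush calls each
policy issues for the same number of log records.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class FlushCountDemo {

    /** Hypothetical stand-in for a stream whose flush is expensive; it counts flush() calls. */
    static final class CountingStream extends FilterOutputStream {
        int flushes = 0;

        CountingStream(OutputStream out) {
            super(out);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            // Write the whole buffer at once instead of FilterOutputStream's per-byte loop.
            out.write(b, off, len);
        }

        @Override
        public void flush() throws IOException {
            flushes++;
            super.flush();
        }
    }

    /** Writes the given number of records and returns how many flush() calls were made. */
    static int countFlushes(int records, boolean flushEveryWrite) throws IOException {
        CountingStream stream = new CountingStream(new ByteArrayOutputStream());
        byte[] record = "one log line\n".getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < records; i++) {
            stream.write(record, 0, record.length);
            if (flushEveryWrite) {
                stream.flush(); // the per-write flush pattern described in the issue
            }
        }
        stream.close(); // FilterOutputStream.close() flushes once before closing
        return stream.flushes;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("flush on every write: " + countFlushes(1000, true));
        System.out.println("flush on close only:  " + countFlushes(1000, false));
    }
}
```

With 1000 records, the per-write policy issues 1001 flushes (one per record plus
the one from close()), while the close-only policy issues 1. In this sketch the
flush is free; on a real cluster each one has a cost, which is why the count
matters.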

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
