hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8178) QJM doesn't move aside stale inprogress edits files
Date Thu, 30 Apr 2015 06:55:06 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521002#comment-14521002 ]

Zhe Zhang commented on HDFS-8178:

Thanks ATM for the helpful review! After looking at HDFS-5919 more closely, I see that we are
actually trying to solve a different problem here. The objective of HDFS-5919 is solely to save
disk space (since FJM doesn't try to process those corrupt/empty files anyway). It's a safe cleanup
that makes sure the tx ID of an empty / corrupt file is old enough before purging it. So I think we
should do the same in QJM.

Our main target here is _stale_ in-progress edit log files, which are not necessarily empty/corrupt
(so they won't be marked as such). As the updated description states, we want to properly take
care of those files so QJM doesn't try to process them. I like your proposal of renaming / moving
aside those files and removing them when they are older than {{minTxIdToKeep}}. I'll update
the patch based on this idea.
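
To make the rename-then-purge idea concrete, here is a rough standalone sketch. It is not the patch and does not use any Hadoop classes. The class name StaleEditsSketch, the ".stale" suffix, and the rule that an in-progress file is stale when its start tx ID is below that of the segment currently being written are all assumptions made for illustration. The only detail taken from existing behavior is the edits_inprogress_<startTxId> file naming.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: hypothetical helper, not part of HDFS. */
public class StaleEditsSketch {
    // In-progress segments are named edits_inprogress_<startTxId>.
    private static final Pattern IN_PROGRESS =
        Pattern.compile("edits_inprogress_(\\d+)");
    // Assumed marker for files that have been moved aside.
    private static final String STALE_SUFFIX = ".stale";
    private static final Pattern STALE =
        Pattern.compile("edits_inprogress_(\\d+)\\.stale");

    /** Snapshot the directory listing so renames do not disturb iteration. */
    private static List<Path> list(Path dir) throws IOException {
        List<Path> out = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                out.add(p);
            }
        }
        return out;
    }

    /**
     * Rename in-progress files that start before the current segment
     * (assumed definition of "stale") so they are no longer picked up.
     */
    public static int moveAsideStale(Path dir, long currentStartTxId)
            throws IOException {
        int moved = 0;
        for (Path f : list(dir)) {
            String name = f.getFileName().toString();
            Matcher m = IN_PROGRESS.matcher(name);
            if (m.matches() && Long.parseLong(m.group(1)) < currentStartTxId) {
                Files.move(f, f.resolveSibling(name + STALE_SUFFIX));
                moved++;
            }
        }
        return moved;
    }

    /** Delete moved-aside files whose start tx ID is below minTxIdToKeep. */
    public static int purgeStale(Path dir, long minTxIdToKeep)
            throws IOException {
        int purged = 0;
        for (Path f : list(dir)) {
            Matcher m = STALE.matcher(f.getFileName().toString());
            if (m.matches() && Long.parseLong(m.group(1)) < minTxIdToKeep) {
                Files.delete(f);
                purged++;
            }
        }
        return purged;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("journal-sketch");
        // One leftover segment from before a crash, one currently being written.
        Files.createFile(dir.resolve(String.format("edits_inprogress_%019d", 101L)));
        Files.createFile(dir.resolve(String.format("edits_inprogress_%019d", 201L)));

        System.out.println("moved aside: " + moveAsideStale(dir, 201L));
        System.out.println("purged: " + purgeStale(dir, 150L));
    }
}
```

Keeping the two steps separate means a moved-aside file stays on disk until the retention point passes it, which mirrors the "safe cleanup" property of HDFS-5919 described above.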

I also propose we do the same for corrupt / empty files, for both FJM and QJM. 

> QJM doesn't move aside stale inprogress edits files
> ---------------------------------------------------
>                 Key: HDFS-8178
>                 URL: https://issues.apache.org/jira/browse/HDFS-8178
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: qjm
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-8178.000.patch
> When a QJM crashes, the in-progress edit log file at that time remains in the file system.
When the node comes back, it will accept new edit logs and those stale in-progress files are
never cleaned up. QJM treats them as regular in-progress edit log files and tries to finalize
them, which potentially causes high memory usage. This JIRA aims to move aside those stale
edit log files to avoid this scenario.

