hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8178) QJM doesn't move aside stale inprogress edits files
Date Thu, 30 Apr 2015 06:55:06 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14521002#comment-14521002
] 

Zhe Zhang commented on HDFS-8178:
---------------------------------

Thanks ATM for the helpful review! After looking at HDFS-5919 more closely, I see that we are
actually trying to solve a different problem here. The objective of HDFS-5919 is solely to save
disk space (since FJM doesn't try to process those corrupt/empty files anyway). It's a safe cleanup
that makes sure the tx IDs of empty / corrupt files are old enough before purging them. So I think
we should do the same in QJM.

Our main target here is _stale_ in-progress edit log files, which are not necessarily empty/corrupt
(so they won't be marked as such). As the updated description states, we want to properly take
care of those files so QJM doesn't try to process them. I like your proposal of renaming / moving
aside those files and removing them once they are older than {{minTxIdToKeep}}. I'll update
the patch based on this idea.
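
To make the rename-then-purge idea concrete, here is a minimal standalone sketch. It is not the patch and does not use any Hadoop classes. The class name {{StaleEditsSketch}}, both method names, the {{.stale}} suffix, and the rule "an in-progress file is stale if its start tx ID is below that of the segment currently being written" are all assumptions made for illustration. The {{edits_inprogress_<txid>}} file name pattern is also assumed.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; every name in this class is hypothetical.
final class StaleEditsSketch {
    // Assumed suffix marking a file that has been moved aside.
    static final String STALE_SUFFIX = ".stale";

    // Assumed naming: edits_inprogress_<start tx ID>, optionally followed by the suffix.
    private static final Pattern INPROGRESS =
        Pattern.compile("edits_inprogress_(\\d+)");
    private static final Pattern MOVED_ASIDE =
        Pattern.compile("edits_inprogress_(\\d+)\\.stale");

    // Snapshot the directory first so that renames and deletes do not
    // interfere with an open directory stream.
    private static List<Path> list(Path dir) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                files.add(p);
            }
        }
        return files;
    }

    /**
     * Renames every in-progress file whose start tx ID is below the start
     * tx ID of the current segment, so later scans no longer treat it as a
     * live in-progress segment. Returns the new paths.
     */
    static List<Path> moveAsideStale(Path dir, long currentSegmentTxId)
            throws IOException {
        List<Path> moved = new ArrayList<>();
        for (Path p : list(dir)) {
            Matcher m = INPROGRESS.matcher(p.getFileName().toString());
            if (m.matches() && Long.parseLong(m.group(1)) < currentSegmentTxId) {
                Path target = p.resolveSibling(p.getFileName() + STALE_SUFFIX);
                Files.move(p, target);
                moved.add(target);
            }
        }
        return moved;
    }

    /**
     * Deletes moved-aside files whose start tx ID is below minTxIdToKeep.
     * Newer moved-aside files are left in place. Returns the deleted paths.
     */
    static List<Path> purgeMovedAside(Path dir, long minTxIdToKeep)
            throws IOException {
        List<Path> purged = new ArrayList<>();
        for (Path p : list(dir)) {
            Matcher m = MOVED_ASIDE.matcher(p.getFileName().toString());
            if (m.matches() && Long.parseLong(m.group(1)) < minTxIdToKeep) {
                Files.delete(p);
                purged.add(p);
            }
        }
        return purged;
    }
}
```

The two steps are kept separate on purpose: moving aside takes effect immediately, so the stale file is no longer processed, while deletion waits until the tx ID falls below {{minTxIdToKeep}}, which mirrors the "safe cleanup" condition discussed above.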

I also propose we do the same for corrupt / empty files, for both FJM and QJM. 

> QJM doesn't move aside stale inprogress edits files
> ---------------------------------------------------
>
>                 Key: HDFS-8178
>                 URL: https://issues.apache.org/jira/browse/HDFS-8178
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: qjm
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-8178.000.patch
>
>
> When a QJM crashes, the in-progress edit log file at that time remains in the file system.
When the node comes back, it will accept new edit logs and those stale in-progress files are
never cleaned up. QJM treats them as regular in-progress edit log files and tries to finalize
them, which potentially causes high memory usage. This JIRA aims to move aside those stale
edit log files to avoid this scenario.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
