hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8965) Harden edit log reading code against out of memory errors
Date Thu, 05 Nov 2015 19:18:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14992293#comment-14992293 ]

Colin Patrick McCabe commented on HDFS-8965:

bq. 1. What do we do with the corrupted edit log entry? Skip it?

We would only attempt to skip corrupted edit log entries if recovery mode were turned on.
Otherwise, an exception would be thrown, which would ordinarily lead to us trying to read
a different copy of the edit log.
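For illustration, a minimal sketch of that skip-vs-throw behavior (all names here are hypothetical, not the actual HDFS edit log reader classes): skip the bad entry only in recovery mode, otherwise rethrow so the caller can fall back to a different copy of the log.

{code:java}
import java.io.IOException;

// Hypothetical sketch of the behavior described above; none of these
// names come from the real HDFS code.
abstract class EditOpReaderSketch {
  private final boolean recoveryMode;

  EditOpReaderSketch(boolean recoveryMode) {
    this.recoveryMode = recoveryMode;
  }

  // Decode the next op; throws IOException on a checksum mismatch.
  abstract Object decodeNextOp() throws IOException;

  // Advance the stream past the corrupt entry to the next op boundary.
  abstract void skipToNextOpBoundary() throws IOException;

  Object readNext() throws IOException {
    while (true) {
      try {
        return decodeNextOp();
      } catch (IOException e) {
        if (!recoveryMode) {
          // Normal operation: propagate the exception so the caller
          // can try reading a different copy of the edit log.
          throw e;
        }
        // Recovery mode: skip the corrupted entry and keep going.
        skipToNextOpBoundary();
      }
    }
  }
}
{code}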

bq. 2. Do we have any idea whether it is a bug that corrupted the edit log, or a disk/network
problem that corrupted it?

In general, I don't see how we could know the answer to that question without more information.

> Harden edit log reading code against out of memory errors
> ---------------------------------------------------------
>                 Key: HDFS-8965
>                 URL: https://issues.apache.org/jira/browse/HDFS-8965
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.0.0-alpha
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>             Fix For: 2.8.0
>         Attachments: HDFS-8965.001.patch, HDFS-8965.002.patch, HDFS-8965.003.patch, HDFS-8965.004.patch, HDFS-8965.005.patch, HDFS-8965.006.patch, HDFS-8965.007.patch
> We should harden the edit log reading code against out of memory errors.  Now that each op has a length prefix and a checksum, we can validate the checksum before trying to load the Op data.  This should avoid out of memory errors when trying to load garbage data as Op data.
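To make the description concrete, here is a hypothetical sketch of that hardening (the field layout, checksum type, and the 50 MB cap are assumptions, not the actual edit log format): the length prefix is bounds-checked before any buffer is allocated, and the checksum is verified before the bytes are parsed as an Op.

{code:java}
import java.io.DataInputStream;
import java.io.IOException;
import java.util.zip.CRC32;

// Hypothetical sketch only; field order, checksum type, and the size
// cap are assumptions and do not match the real HDFS on-disk format.
final class SafeOpLoaderSketch {
  private static final int MAX_OP_SIZE = 50 * 1024 * 1024; // assumed cap

  static byte[] readOpBody(DataInputStream in) throws IOException {
    int length = in.readInt();              // length prefix
    if (length < 0 || length > MAX_OP_SIZE) {
      // Garbage length prefixes are rejected before we allocate,
      // so corrupt data cannot trigger a huge allocation and an OOM.
      throw new IOException("Unreasonable op length " + length);
    }
    byte[] body = new byte[length];         // bounded allocation
    in.readFully(body);
    long expected = in.readLong();          // stored checksum (assumed layout)
    CRC32 crc = new CRC32();
    crc.update(body, 0, body.length);
    if (crc.getValue() != expected) {
      // Bad checksum: fail here, before trying to deserialize the
      // bytes as an Op, matching the idea in the description above.
      throw new IOException("Checksum mismatch on edit log op");
    }
    return body;
  }
}
{code}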

