hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-8965) Harden edit log reading code against out of memory errors
Date Fri, 28 Aug 2015 00:53:46 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717849#comment-14717849 ]

Colin Patrick McCabe edited comment on HDFS-8965 at 8/28/15 12:52 AM:
----------------------------------------------------------------------

Added checksumming for scanOp.  I added a unit test checking that scanOp now verifies checksums,
and verified that it failed on trunk but passed with the patch.  It's a true unit test which
doesn't start a MiniDFSCluster.  Fixed the typo in EDITS_CHEKSUM.
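
For readers following along, here is a minimal sketch of the idea, with invented names and an
assumed on-disk layout (this is not the actual patch): during a scan, the length-prefixed op
bytes are read and their checksum is verified without ever decoding the op.  The exception type
and the use of java.util.zip.CRC32 (HDFS has its own CRC code) are assumptions for illustration.

    import java.io.DataInputStream;
    import java.io.IOException;
    import java.util.zip.CRC32;

    /** Hypothetical sketch: verify an op's checksum while scanning, without decoding it. */
    class OpScanner {
      /**
       * Reads one length-prefixed op from the stream, verifies the stored checksum
       * against a recomputed one, and returns the number of bytes consumed.
       */
      long scanOp(DataInputStream in) throws IOException {
        int length = in.readInt();        // length prefix written ahead of the op body;
                                          // assumed to have been sanity-checked already
        byte[] body = new byte[length];
        in.readFully(body);
        long stored = in.readInt() & 0xffffffffL;   // trailing checksum (layout assumed)

        CRC32 crc = new CRC32();          // stand-in for the CRC implementation HDFS uses
        crc.update(body, 0, body.length);
        if (crc.getValue() != stored) {
          throw new IOException("Checksum mismatch while scanning edit log op");
        }
        return 4L + length + 4L;          // prefix + body + checksum
      }
    }

A true unit test in this spirit could feed the scanner a byte stream with one flipped bit in the
body and assert that the exception is thrown, with no MiniDFSCluster involved.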

I agree that there is a very good chance that a large array would be allocated on the stack.
But there's also a chance that it wouldn't.  Since the difference in verbosity is negligible
(it's a single extra line), it seems like we should just allocate it inside the Reader.
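
A minimal sketch of that allocation point, with hypothetical names (EditLogOpReader and the
buffer size are not taken from the patch): the scratch buffer is a field of the Reader, so
callers never have to supply one.

    /** Hypothetical sketch of the design point: the Reader owns its scratch buffer. */
    class EditLogOpReader {
      // The "single extra line": the buffer is allocated inside the Reader rather than
      // being handed in by every caller.
      private byte[] scratch = new byte[4096];   // initial size is an assumption

      /** Grows the scratch buffer on demand so one Reader instance is reusable across ops. */
      private byte[] scratchFor(int length) {
        if (scratch.length < length) {
          scratch = new byte[length];
        }
        return scratch;
      }
    }

Keeping the buffer inside the Reader also sidesteps the uncertainty raised above about whether
the JIT would actually stack-allocate a large local array.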


was (Author: cmccabe):
Added checksumming for scanOp.  I added a unit test checking that scanOp now verifies checksums,
and verified that it failed on trunk but passed with the patch.  It's a true unit test which
doesn't start a MiniDFSCluster.  Fixed the typo in EDITS_CHEKSUM.

I agree that there is a very good chance that a large array would be allocated on the stack.
But there's also a chance that it wouldn't.  Since the difference in verbosity is negligible
(it's a single extra line), it seems like we should just allocate it inside the Reader.  Startup
time is one area where we are weak right now, and we should be trying to optimize it.

> Harden edit log reading code against out of memory errors
> ---------------------------------------------------------
>
>                 Key: HDFS-8965
>                 URL: https://issues.apache.org/jira/browse/HDFS-8965
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.0.0-alpha
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-8965.001.patch, HDFS-8965.002.patch, HDFS-8965.003.patch, HDFS-8965.004.patch
>
>
> We should harden the edit log reading code against out of memory errors.  Now that each
> op has a length prefix and a checksum, we can validate the checksum before trying to load
> the Op data.  This should avoid out of memory errors when trying to load garbage data as
> Op data.
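
The out-of-memory case the description is worried about is a garbage length prefix that looks
like an enormous op.  A hedged sketch of that guard, with an invented cap and method name:

    import java.io.DataInputStream;
    import java.io.IOException;

    /** Hypothetical guard: reject an implausible length prefix before allocating for it. */
    class OpLengthGuard {
      // Cap chosen purely for illustration; the real limit would come from the edit log format.
      private static final int MAX_OP_SIZE = 50 * 1024 * 1024;

      static int readCheckedLength(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0 || length > MAX_OP_SIZE) {
          // Garbage data: fail fast instead of attempting a huge allocation.
          throw new IOException("Implausible edit log op length " + length);
        }
        return length;
      }
    }

Together with a checksum check like the one sketched earlier in the message, garbage bytes are
rejected before any attempt to materialize them as an Op.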



