hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7979) Initialize block report IDs with a random number
Date Tue, 24 Mar 2015 19:24:53 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14378415#comment-14378415 ]

Colin Patrick McCabe commented on HDFS-7979:

I am not sure about this patch for a few reasons:
* When using the monotonic time, two block reports cannot get the same ID since the monotonic
time is always increasing.  We don't have the same guarantee here.  Admittedly, the chances
of a repeat are extremely low.  But previously they were effectively 0, and now they're nonzero.
* If the datanode is taken down and restarted, the monotonic time will still be higher than
before.  And so the current behavior makes it easy to see from the logs that block report
N+1 came after block report N, even if there was a datanode restart in between.  We don't
have this behavior with a random number generated on datanode start.
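
To make the two points above concrete, here is a minimal Java sketch of the two ID schemes being compared. It is not the datanode code or the attached patch: the class names (BlockReportIdSketch, ClockBasedIds, RandomStartIds) and the simulated clock are hypothetical, and the sketch assumes, as the second bullet does, that the monotonic clock keeps advancing across a datanode restart.

```java
import java.util.Random;
import java.util.function.LongSupplier;

public class BlockReportIdSketch {

    /** Hypothetical clock-based scheme: IDs follow a monotonic clock and never repeat. */
    static final class ClockBasedIds {
        private final LongSupplier clock;
        private long prev = Long.MIN_VALUE;

        ClockBasedIds(LongSupplier clock) {
            this.clock = clock;
        }

        long next() {
            long id = clock.getAsLong();
            if (id <= prev) {
                id = prev + 1; // clock did not advance; bump to stay unique
            }
            prev = id;
            return id;
        }
    }

    /** Hypothetical random-start scheme: pick a random base once, then count up. */
    static final class RandomStartIds {
        private long prev;

        RandomStartIds(Random rng) {
            this.prev = rng.nextLong();
        }

        long next() {
            return ++prev;
        }
    }

    public static void main(String[] args) {
        // Simulated monotonic clock that survives "restarts" (an assumption of this sketch).
        long[] tick = {1_000L};
        LongSupplier simulatedClock = () -> tick[0] += 10;

        // A restart is modeled as constructing a fresh generator.
        long beforeClock = new ClockBasedIds(simulatedClock).next();
        long afterClock = new ClockBasedIds(simulatedClock).next();
        System.out.println("clock-based, later report has larger ID: "
                + (afterClock > beforeClock));

        // With a random base per start, the comparison can go either way.
        long beforeRandom = new RandomStartIds(new Random()).next();
        long afterRandom = new RandomStartIds(new Random()).next();
        System.out.println("random-start, later report has larger ID: "
                + (afterRandom > beforeRandom));
    }
}
```

In the clock-based scheme the ordering across the simulated restart comes from the clock itself, so the fresh generator needs no saved state. In the random-start scheme each start draws an unrelated base, so IDs from before and after a restart carry no ordering information, and a repeat is unlikely but not ruled out.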

I also don't think a non-random block report ID is a security concern.  If block reports need
to be secured, the correct way to do it is to use wire encryption via SASL.  If SASL
is not in use, any evildoer can submit a fake full block report that says that everything
is deleted, or that lists bogus blocks that don't really exist on the datanode.  Indeed, even
after this patch is applied, it would be easy for a black hat to submit a new block report
with a new random ID and cause the NN to delete all the storages on that DN.  So, in my
opinion, the motivation for this patch is not valid.

> Initialize block report IDs with a random number
> ------------------------------------------------
>                 Key: HDFS-7979
>                 URL: https://issues.apache.org/jira/browse/HDFS-7979
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.7.0
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>            Priority: Minor
>         Attachments: HDFS-7979.001.patch
> Right now block report IDs use system nanotime. This isn't that random, so let's start
> it at a random number for some more safety.

This message was sent by Atlassian JIRA
