hadoop-hdfs-issues mailing list archives

From "Harsh J (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2541) For a sufficiently large value of blocks, the DN Scanner may request a random number with a negative seed value.
Date Sun, 20 Nov 2011 03:45:51 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153669#comment-13153669 ]

Harsh J commented on HDFS-2541:

Thanks Eli!

Would it also make sense to log a warning when the number of blocks surpasses a particularly large value?
I could open a new ticket for this if you think it's worth doing.
> For a sufficiently large value of blocks, the DN Scanner may request a random number with a negative seed value.
> ----------------------------------------------------------------------------------------------------------------
>                 Key: HDFS-2541
>                 URL: https://issues.apache.org/jira/browse/HDFS-2541
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.20.1
>            Reporter: Harsh J
>            Assignee: Harsh J
>             Fix For: 0.23.1
>         Attachments: BSBugTest.java, HDFS-2541.patch
> Running off 0.20-security, I noticed that one could get the following exception when scanners are used:
> {code}
> DataXceiver
> java.lang.IllegalArgumentException: n must be positive
> at java.util.Random.nextInt(Random.java:250)
> at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.getNewBlockScanTime(DataBlockScanner.java:251)
> at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.addBlock(DataBlockScanner.java:268)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:432)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:122)
> {code}
> This is because the period, determined in the DataBlockScanner (0.20+) or BlockPoolSliceScanner (0.23+), is cast to an integer before it is sent to a Random.nextInt(...) call. For sufficiently large values of the long 'period', the cast integer may be negative. This is not accounted for. I'll attach a sample test that shows this possibility with the numbers.
> We should ensure we do a Math.abs(...) before we send it to the Random.nextInt(...) call to avoid this.
> With this bug, the maximum # of blocks a scanner may hold in its blocksMap without opening up the chance of hitting this exception (intermittent, as blocks continue to grow) would be 3582718.
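
For illustration, the failure mode described above can be reproduced outside Hadoop with a minimal sketch. The class and method names below (PeriodCastSketch, naiveBound, clampedBound) and the sample period values are hypothetical, and the clamping guard is only one possible approach, not necessarily what the attached patch does. It also shows that Math.abs(...) on its own still leaves edge cases when the cast lands on Integer.MIN_VALUE or on 0.

```java
import java.util.Random;

// Hypothetical demo class; none of these names come from the Hadoop source.
final class PeriodCastSketch {

    // Mirrors the problem described above: a plain narrowing cast keeps only
    // the low 32 bits of the long, so the resulting int can be negative.
    static int naiveBound(long period) {
        return (int) period;
    }

    // One possible guard (an assumption, not the attached patch):
    // clamp the value into [1, Integer.MAX_VALUE] before narrowing.
    static int clampedBound(long period) {
        return (int) Math.max(1L, Math.min(period, (long) Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        Random random = new Random();
        long period = 3_000_000_000L; // arbitrary value above Integer.MAX_VALUE

        // The naive cast wraps around to a negative int.
        System.out.println("naive bound: " + naiveBound(period));
        try {
            random.nextInt(naiveBound(period));
        } catch (IllegalArgumentException e) {
            System.out.println("nextInt rejected the bound: " + e.getMessage());
        }

        // Math.abs(Integer.MIN_VALUE) is still Integer.MIN_VALUE (negative),
        // and a cast that lands on 0 is rejected by nextInt as well.
        System.out.println("abs edge case: " + Math.abs(naiveBound(2_147_483_648L)));
        System.out.println("zero edge case: " + Math.abs(naiveBound(4_294_967_296L)));

        // The clamped bound is always accepted by nextInt.
        System.out.println("clamped draw ok: " + (random.nextInt(clampedBound(period)) >= 0));
    }
}
```

Clamping rather than taking the absolute value keeps the bound monotonic in the period: once the period exceeds Integer.MAX_VALUE, the bound simply saturates instead of wrapping to an unrelated smaller number.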
