hadoop-hdfs-issues mailing list archives

From "Joshua Harlow (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-708) A stress-test tool for HDFS.
Date Tue, 13 Apr 2010 17:08:08 GMT

    [ https://issues.apache.org/jira/browse/HDFS-708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856500#action_12856500 ]

Joshua Harlow commented on HDFS-708:

For the distributions, I was thinking the following could work.
The input is expected to be in [0,1], and the output is expected to be in [0,1] as well.
The mapper would take the current time and divide it by the maximum time (both known) to
get a fraction x. On each iteration of the mapper's inner loop (the one producing & running
operations), it would evaluate the simple formulas below at x for each operation type and
its given distribution. This gives a list of numbers in [0,1], one per operation type. Each
number is then multiplied by a new config variable (slive.ops.per.iteration) and by that
operation's ratio (percentage) to determine how many operations of that type should occur
in that iteration. If the total number of operations after an iteration reaches slive.map.ops,
or the current time reaches the maximum time, the loop would stop and the results would be
sent to the reducer.
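
A rough sketch of the loop described above, in Python purely for illustration (the tool itself would presumably be Java). The function name run_mapper, its parameters, the injectable clock, and the rounding to whole operations are all assumptions made for this sketch; only slive.ops.per.iteration and slive.map.ops come from the proposal.

```python
import time


def run_mapper(weights, ratios, ops_per_iteration, max_ops, max_time,
               clock=time.monotonic):
    """Simulate the mapper's inner loop and return operation counts per type.

    weights: operation type -> function mapping x in [0,1] to a value in [0,1]
    ratios:  operation type -> operations ratio, as a fraction in [0,1]
    ops_per_iteration: stands in for slive.ops.per.iteration
    max_ops:           stands in for slive.map.ops
    max_time:          maximum run time, in the clock's units
    """
    start = clock()
    totals = {op: 0 for op in ratios}
    while True:
        # Fraction of the maximum time used so far, clamped to [0,1].
        x = min((clock() - start) / max_time, 1.0)
        for op, ratio in ratios.items():
            # Rounding to a whole number of operations is an assumption.
            count = round(weights[op](x) * ops_per_iteration * ratio)
            totals[op] += count  # a real mapper would run `count` operations here
        # Both stop conditions are checked after each pass over the operation types.
        if sum(totals.values()) >= max_ops or x >= 1.0:
            break
    return totals
```

Passing a fake clock (any zero-argument callable returning increasing numbers) makes the loop deterministic, which is handy for checking that it stops on either the operation cap or the time limit.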

Here are possible equations to be used:
Beg would be defined by x^2 (approaching 1 at the end)
End would be defined by (x-1)^2 (approaching 0 at the end)
Mid would be defined by -2*(x-1/2)^2+1/2 (a bell-shaped curve peaking at 1/2 when x = 1/2)
Uniform would just return 1/3 (the above equations each have an area of 1/3 over [0,1], so
this seems to make sense)
Suggestions are welcome.
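
As a quick sanity check of the proposed curves, here is a small Python sketch (illustration only; the function names and the midpoint-rule helper are assumptions, not part of the tool). It defines the four formulas and numerically integrates each over [0,1] to confirm that the areas agree at 1/3.

```python
def beg(x):
    # x^2: 0 at x = 0, approaching 1 at x = 1
    return x ** 2


def end(x):
    # (x-1)^2: 1 at x = 0, approaching 0 at x = 1
    return (x - 1) ** 2


def mid(x):
    # -2*(x-1/2)^2 + 1/2: 0 at both ends, peaking at 1/2 when x = 1/2
    return -2 * (x - 0.5) ** 2 + 0.5


def uniform(x):
    # Constant 1/3, matching the area under the other three curves
    return 1 / 3


def area(f, n=100_000):
    # Midpoint-rule approximation of the integral of f over [0,1]
    h = 1 / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h


if __name__ == "__main__":
    for f in (beg, end, mid, uniform):
        print(f.__name__, round(area(f), 6))
```

All four outputs stay within [0,1] on the input range, so they can be used directly as multipliers.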

> A stress-test tool for HDFS.
> ----------------------------
>                 Key: HDFS-708
>                 URL: https://issues.apache.org/jira/browse/HDFS-708
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: test, tools
>    Affects Versions: 0.22.0
>            Reporter: Konstantin Shvachko
>             Fix For: 0.22.0
>         Attachments: SLiveTest.pdf
> It would be good to have a tool for automatic stress testing HDFS, which would provide
> IO-intensive load on HDFS cluster.
> The idea is to start the tool, let it run overnight, and then be able to analyze possible


