hadoop-hdfs-user mailing list archives

From Shady Xu <shad...@gmail.com>
Subject Re: Surprisingly, RAID 0 provides the best IO performance whereas no RAID the worst
Date Sun, 31 Jul 2016 03:12:11 GMT
Thanks Andrew, I know about the disk failure risk and that it's one of the
reasons why we should use JBOD. But JBOD provides worse performance than
RAID 0. And take into account that HDFS keeps replicas of every block on
other DataNodes anyway, and that it re-replicates the lost blocks to
another DataNode when a disk fails. So why should we sacrifice performance
to prevent a data loss that HDFS recovers from naturally?
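
To put rough numbers on that re-replication (a back-of-the-envelope
sketch; the 4 TB disk size is a made-up figure, and I assume
dfs.datanode.failed.volumes.tolerated is set high enough that a DataNode
survives one bad volume):

# How much data HDFS must re-replicate after a single disk failure.
DISK_TB = 4.0          # hypothetical disk size, not a measured value
DISKS_PER_NODE = 12

# JBOD: only the failed disk's blocks are lost
# (assuming dfs.datanode.failed.volumes.tolerated > 0).
jbod_loss_tb = DISK_TB

# RAID 0 across all 12 disks: one dead disk takes out the whole array,
# so every block the node held must be re-replicated elsewhere.
raid0_loss_tb = DISK_TB * DISKS_PER_NODE

print(f"JBOD   re-replication: {jbod_loss_tb:.0f} TB")   # 4 TB
print(f"RAID 0 re-replication: {raid0_loss_tb:.0f} TB")  # 48 TB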

2016-07-31 0:36 GMT+08:00 Andrew Wright <agwlists@gmail.com>:

> Yes you are.
>
> If you lose any one of your disks with a RAID 0 spanning all drives, you
> will lose all the data in that directory.
>
> And disks do die.
>
> Yes, you get better single-threaded performance, but you are putting that
> entire directory/data set at higher risk.
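>
> Back-of-the-envelope (the 3% annual per-disk failure rate below is an
> illustrative assumption, not a measured number):
>
> # Chance that a 12-disk RAID 0 array dies within a year: any single
> # disk failure kills the whole array.
> p_disk = 0.03                    # assumed annual failure rate per disk
> n_disks = 12
> p_array = 1 - (1 - p_disk) ** n_disks
> print(f"{p_array:.1%}")          # roughly 30.6%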
>
> Cheers
>
>
> On Saturday, July 30, 2016, Shady Xu <shadyxu@gmail.com> wrote:
>
>> Hi,
>>
>> It's widely known that we should mount disks to different directories
>> without any RAID configuration because that provides the best IO
>> performance.
>>
>> However, lately I have done some tests with three different
>> configurations and found this may not be true. Below are the
>> configurations and the statistics reported by the command 'iostat -x'.
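>>
>> One note on method: a bare 'iostat -x' prints averages since boot, so to
>> sample a specific window you pass an interval and a count. A minimal
>> sketch (the 5-second interval and two-report count are arbitrary):
>>
>> import subprocess
>> # Two extended-stats reports, 5 seconds apart: the first shows averages
>> # since boot, the second covers only the 5-second interval.
>> out = subprocess.run(["iostat", "-x", "5", "2"],
>>                      capture_output=True, text=True, check=True).stdout
>> print(out)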
>>
>> Configuration A: all 12 disks in one RAID 0 array, mounted as one directory
>> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>> sdb               0.01     0.59  112.02   65.92 15040.07 15856.86   347.27     0.32    1.81    2.36    0.86   0.93  16.49
>>
>>
>> ------------------------------------------------------------------------------
>>
>> Configuration B: no RAID at all (JBOD, each disk mounted as its own directory)
>> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>> sdc               0.01     0.12    2.88    5.23   364.54  1247.10   397.52     0.76   93.80    9.05  140.42   2.44   1.98
>> sdg               0.01     0.07    2.39    5.27   328.72  1246.51   410.93     0.75   97.88   10.93  137.33   2.63   2.02
>> sdl               0.01     0.07    2.59    5.46   340.61  1299.00   407.00     0.82  102.18    9.64  146.09   2.55   2.05
>> sdf               0.01     0.11    2.28    5.02   291.48  1197.00   407.99     0.72   99.23    9.15  140.12   2.62   1.91
>> sdb               0.01     0.07    2.69    5.23   334.19  1238.20   396.99     0.74   93.84    8.10  137.98   2.41   1.91
>> sde               0.01     0.11    2.81    5.27   376.54  1262.25   405.56     0.79   97.62   10.96  143.84   2.58   2.08
>> sdk               0.01     0.12    3.02    5.20   371.92  1244.48   392.93     0.79   96.07    8.63  146.85   2.48   2.04
>> sda               0.00     0.07    2.82    5.33   370.06  1260.68   400.52     0.78   96.09    9.72  141.74   2.49   2.03
>> sdi               0.01     0.11    3.09    5.30   378.19  1269.98   392.63     0.78   92.47    5.98  142.88   2.31   1.94
>> sdj               0.01     0.07    3.04    5.02   365.32  1185.24   385.01     0.74   92.22    6.31  144.29   2.40   1.93
>> sdh               0.01     0.07    2.74    5.34   356.22  1264.28   401.06     0.78   96.81   11.36  140.75   2.55   2.06
>> sdd               0.01     0.11    2.47    5.39   343.22  1292.23   416.20     0.76   96.48   10.26  135.96   2.54   1.99
>>
>>
>> ------------------------------------------------------------------------------
>>
>> Configuration C: each disk as its own single-disk RAID 0 array, mounted as 12 different directories
>> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>> sdd               0.00     0.10    8.88    7.42  1067.65  1761.12   346.94     0.13    7.94    3.64   13.09   0.46   0.75
>> sdb               0.00     0.09    8.83    7.52  1066.16  1784.79   348.65     0.13    8.02    3.75   13.02   0.47   0.76
>> sdc               0.00     0.10    8.82    7.48  1073.74  1776.02   349.61     0.13    8.09    3.76   13.19   0.47   0.76
>> sde               0.00     0.10    8.74    7.46  1060.79  1771.46   349.63     0.13    7.80    3.53   12.81   0.45   0.73
>> sdg               0.00     0.10    8.93    7.46  1101.14  1772.73   350.64     0.13    7.81    3.70   12.71   0.47   0.77
>> sdf               0.00     0.09    8.75    7.46  1062.06  1772.08   349.73     0.13    8.03    3.78   13.00   0.46   0.75
>> sdh               0.00     0.10    9.09    7.45  1114.94  1770.07   348.76     0.13    7.83    3.69   12.89   0.47   0.77
>> sdi               0.00     0.10    8.91    7.43  1086.85  1761.30   348.48     0.13    7.93    3.64   13.07   0.46   0.75
>> sdj               0.00     0.10    9.04    7.46  1111.32  1768.79   349.15     0.13    7.79    3.64   12.82   0.46   0.76
>> sdk               0.00     0.10    9.12    7.51  1122.00  1783.41   349.49     0.13    7.82    3.72   12.80   0.48   0.79
>> sdl               0.00     0.10    8.91    7.49  1087.98  1777.77   349.49     0.13    7.89    3.69   12.89   0.46   0.75
>> sdm               0.00     0.09    8.97    7.52  1098.82  1787.10   349.95     0.13    7.96    3.79   12.94   0.47   0.78
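>>
>> Summing the per-disk rkB/s and wkB/s gives a rough aggregate throughput
>> for each configuration; a quick sketch, with the values copied from the
>> tables above:
>>
>> # Aggregate throughput per configuration (kB/s), summed over devices.
>> config_a_r, config_a_w = 15040.07, 15856.86    # the single 12-disk array
>> config_b_r = [364.54, 328.72, 340.61, 291.48, 334.19, 376.54,
>>               371.92, 370.06, 378.19, 365.32, 356.22, 343.22]
>> config_b_w = [1247.10, 1246.51, 1299.00, 1197.00, 1238.20, 1262.25,
>>               1244.48, 1260.68, 1269.98, 1185.24, 1264.28, 1292.23]
>> config_c_r = [1067.65, 1066.16, 1073.74, 1060.79, 1101.14, 1062.06,
>>               1114.94, 1086.85, 1111.32, 1122.00, 1087.98, 1098.82]
>> config_c_w = [1761.12, 1784.79, 1776.02, 1771.46, 1772.73, 1772.08,
>>               1770.07, 1761.30, 1768.79, 1783.41, 1777.77, 1787.10]
>> print(f"A: {config_a_r:8.0f} read, {config_a_w:8.0f} write")
>> print(f"B: {sum(config_b_r):8.0f} read, {sum(config_b_w):8.0f} write")
>> print(f"C: {sum(config_c_r):8.0f} read, {sum(config_c_w):8.0f} write")
>>
>> The per-request latency gap is even clearer: await is about 1.8 ms in A,
>> around 8 ms in C, and over 90 ms in B.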
>>
>> It seems the RAID 0 all-disks-in-one-directory configuration provides the
>> best disk performance, and the no-RAID configuration provides the worst.
>> That's the exact opposite of the widely known claim. Am I doing anything
>> wrong?
>>
>
