hadoop-hdfs-issues mailing list archives

From "Brahma Reddy Battula (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7896) HDFS Slow disk detection
Date Tue, 27 Oct 2015 06:40:28 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975816#comment-14975816 ]

Brahma Reddy Battula commented on HDFS-7896:

A slow disk checker script should be well designed. 

First, it should take storage types into consideration. And should we measure read/write
throughput, or IOPS?

Cache and free memory affect the results.
What if the disk is currently under heavy load?
What's more, a script may not run in every environment.

Anyway, providing a default script is not a good idea. Reinventing the wheel is not a good
idea either.

Some benchmark tools already exist.

These tools give you a result number. But what is the threshold for a "slow" disk?
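
To make the threshold question concrete, here is a minimal sketch of one possible answer: call
a disk slow when its throughput falls well below the median of its peers of the same storage
type. This is only an illustration, not something proposed in the patch. The function name,
the 0.5 ratio, the minimum peer count, and the input layout are all assumptions.

```python
from collections import defaultdict
from statistics import median


def find_slow_disks(results, ratio=0.5, min_peers=3):
    """Return disks whose throughput is below ratio * median of same-type peers.

    results maps a disk path to (storage_type, throughput in MB/s).
    The ratio and min_peers defaults are arbitrary, hypothetical values.
    """
    # Group by storage type so an HDD is never compared against an SSD.
    by_type = defaultdict(dict)
    for disk, (storage_type, mb_per_s) in results.items():
        by_type[storage_type][disk] = mb_per_s

    slow = []
    for peers in by_type.values():
        if len(peers) < min_peers:
            continue  # too few peers for a meaningful comparison
        baseline = median(peers.values())
        slow.extend(d for d, v in peers.items() if v < ratio * baseline)
    return sorted(slow)
```

A relative threshold like this avoids hard-coding an absolute number, but it cannot notice
the case where every disk of one type is equally slow.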

What I'm thinking is that we don't write the benchmark script ourselves. The script is not
used to run a benchmark. Instead, the script fetches the result from somewhere; the result
must be prepared in advance by some other daemon.
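
As a rough sketch of the "fetch, don't benchmark" idea: the script only reads results that
something else has already written, and ignores entries that are too old. The JSON layout
(disk path mapped to "mb_per_s" and "measured_at"), the function names, and the one-hour
freshness limit are assumptions made for illustration only.

```python
import json
import time


def fresh_results(entries, max_age_s=3600, now=None):
    """Keep only entries measured within the last max_age_s seconds.

    entries maps disk path -> {"mb_per_s": float, "measured_at": epoch seconds}.
    This layout is hypothetical; no such format is defined by HDFS.
    """
    now = time.time() if now is None else now
    return {
        disk: entry
        for disk, entry in entries.items()
        if now - entry["measured_at"] <= max_age_s
    }


def load_prepared_results(path, max_age_s=3600):
    """Read results that another daemon prepared in advance; run no benchmark."""
    with open(path) as f:
        return fresh_results(json.load(f), max_age_s=max_age_s)
```

The script stays cheap and portable this way, since all the expensive and
environment-specific work lives in whatever produces the file.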

Running a benchmark at startup could take a long time, even if we print interactive
feedback. I would prefer to detect slow disks periodically. Some other daemon can
periodically refresh the benchmark results and feed them to HDFS. The daemon can benchmark a
disk when that disk is lightly loaded. Other information, such as bad sector counts, can be
checked more often.
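
A minimal sketch of one refresh pass of such a daemon, assuming the caller supplies the load
probe and the benchmark itself. The names refresh_results and publish, the 0.2 utilization
limit, and the JSON output are hypothetical, not part of HDFS or of the attached patch.

```python
import json
import os
import tempfile
import time


def refresh_results(disks, results, get_utilization, run_benchmark,
                    max_util=0.2, now=None):
    """Benchmark only lightly loaded disks; keep the old result for busy ones.

    get_utilization(disk) returns a fraction in 0..1 and run_benchmark(disk)
    returns MB/s. Both are caller-supplied placeholders.
    """
    now = time.time() if now is None else now
    for disk in disks:
        if get_utilization(disk) > max_util:
            continue  # disk is busy: a benchmark now would be skewed
        results[disk] = {"mb_per_s": run_benchmark(disk), "measured_at": now}
    return results


def publish(results, path):
    """Write to a temp file, then rename, so readers never see a partial file."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    with os.fdopen(fd, "w") as f:
        json.dump(results, f)
    os.replace(tmp_path, path)
```

A real daemon would call refresh_results on a timer and could run cheaper checks, such as
reading bad sector counts, on a shorter interval than the benchmark.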

What do you think?

> HDFS Slow disk detection
> ------------------------
>                 Key: HDFS-7896
>                 URL: https://issues.apache.org/jira/browse/HDFS-7896
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Arpit Agarwal
>         Attachments: HDFS-7896.00.patch
> HDFS should detect slow disks. To start with we can flag this information via the NameNode
> web UI. Alternatively DNs can avoid using slow disks for writes.

This message was sent by Atlassian JIRA
