hadoop-hdfs-issues mailing list archives

From "Gruust (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-12649) handling of corrupt blocks not suitable for commodity hardware
Date Thu, 12 Oct 2017 20:50:01 GMT

     [ https://issues.apache.org/jira/browse/HDFS-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gruust updated HDFS-12649:
--------------------------
    Description: 
Hadoop's documentation tells me it's suitable for commodity hardware in the sense that hardware
failures are expected to happen frequently. However, there is currently no automatic handling
of corrupted blocks, which seems a bit contradictory to me.

See: https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files
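
(For reference, the manual recovery the linked answer describes boils down to running fsck by
hand. The sketch below is illustrative only and not part of this ticket; it assumes the `hdfs`
binary is on PATH and simply wraps the existing `hdfs fsck` options from Python.)

    # Illustrative sketch (not part of the ticket): the manual workflow from the
    # linked answer, wrapping `hdfs fsck` from Python. Assumes the `hdfs` binary
    # is on PATH; the path argument is an example.
    import subprocess

    def list_corrupt_file_blocks(path="/"):
        """Return fsck's listing of the files/blocks HDFS currently considers corrupt."""
        out = subprocess.run(["hdfs", "fsck", path, "-list-corruptfileblocks"],
                             capture_output=True, text=True)
        return out.stdout

    # Cleanup remains manual: `hdfs fsck / -delete` drops the affected files,
    # while `hdfs fsck / -move` parks them under /lost+found for inspection.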

This is even a problem for data integrity, as the redundancy is not kept at the desired level
without manual intervention and is therefore not restored in a timely manner. If there is a
corrupted block, I would at least expect the namenode to force the creation of an additional
good replica to maintain the redundancy level, i.e. the replica count should never include
corrupted data... which it currently does:

    "UnderReplicatedBlocks" : 0,
    "CorruptBlocks" : 2,

(namenode /jmx http dump)
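
(The counters above can be read programmatically from the same /jmx servlet. A minimal sketch,
again illustrative only, assuming the NameNode web UI at localhost:50070, the 2.x default:)

    # Minimal sketch: read CorruptBlocks / UnderReplicatedBlocks from the NameNode
    # /jmx servlet. Host/port are assumptions (localhost:50070, the 2.x default).
    import json
    import urllib.request

    JMX_URL = ("http://localhost:50070/jmx"
               "?qry=Hadoop:service=NameNode,name=FSNamesystem")

    def block_counters(url=JMX_URL):
        with urllib.request.urlopen(url) as resp:
            bean = json.load(resp)["beans"][0]
        return bean["CorruptBlocks"], bean["UnderReplicatedBlocks"]

    corrupt, under = block_counters()
    # The behaviour complained about: corrupt > 0 while under-replicated stays 0,
    # i.e. corrupt replicas still count toward the replication target.
    print("CorruptBlocks=%d UnderReplicatedBlocks=%d" % (corrupt, under))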

  was:
Hadoop's documentation tells me it's suitable for commodity hardware in the sense that hardware
failures are expected to happen frequently. However, there is currently no automatic handling
of corrupted blocks, which seems a bit contradictory to me.

See: https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files

This is even a problem for data integrity, as the redundancy is not kept at the desired level
without manual intervention and is therefore not restored in a timely manner. If there is a
corrupted block, I would at least expect the namenode to force the creation of an additional
good replica to maintain the redundancy level.


> handling of corrupt blocks not suitable for commodity hardware
> --------------------------------------------------------------
>
>                 Key: HDFS-12649
>                 URL: https://issues.apache.org/jira/browse/HDFS-12649
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.8.1
>            Reporter: Gruust
>            Priority: Minor
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

