hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HDFS-11552) Erasure Coding: Support Parity Blocks placement onto same nodes hosting Data Blocks when DataNodes are insufficient
Date Wed, 12 Apr 2017 00:47:41 GMT

     [ https://issues.apache.org/jira/browse/HDFS-11552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang resolved HDFS-11552.
--------------------------------
    Resolution: Not A Problem

> Erasure Coding: Support Parity Blocks placement onto same nodes hosting Data Blocks when DataNodes are insufficient
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-11552
>                 URL: https://issues.apache.org/jira/browse/HDFS-11552
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>              Labels: hdfs-ec-3.0-nice-to-have
>
> Currently, {{DFSStripedOutputStream}} only verifies that the allocated block locations number at least numDataBlocks. That is, for the EC policy RS-6-3-64K, even though a full EC Block Group needs 9 DNs in total, clients can successfully create a DFSStripedOutputStream with just 6 DNs. Moreover, an output stream created with fewer DNs skips writing the Parity Blocks entirely.
> {code}
> [Thread-5] WARN  hdfs.DFSOutputStream (DFSStripedOutputStream.java:allocateNewBlock(497)) - Failed to get block location for parity block, index=6
> [Thread-5] WARN  hdfs.DFSOutputStream (DFSStripedOutputStream.java:allocateNewBlock(497)) - Failed to get block location for parity block, index=7
> [Thread-5] WARN  hdfs.DFSOutputStream (DFSStripedOutputStream.java:allocateNewBlock(497)) - Failed to get block location for parity block, index=8
> {code}
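>
> To make the described check concrete, here is a minimal, self-contained sketch of the allocation behavior (the class, method, and constants are illustrative only, not the actual DFSStripedOutputStream code): stream creation succeeds as long as at least numDataBlocks locations come back, and parity indices beyond the allocated count are merely logged and skipped.
> {code}
> // Illustrative sketch only -- not the real DFSStripedOutputStream code.
> public class AllocationSketch {
>   static final int NUM_DATA_BLOCKS = 6;   // data blocks in RS-6-3-64K
>   static final int NUM_PARITY_BLOCKS = 3; // parity blocks in RS-6-3-64K
>
>   // Mirrors the reported behavior: fail only when fewer than numDataBlocks
>   // locations are allocated; parity indices past the allocated count are
>   // logged and silently skipped instead of failing the stream.
>   static void allocateNewBlock(int allocatedLocations) {
>     if (allocatedLocations < NUM_DATA_BLOCKS) {
>       throw new IllegalStateException("Not enough DataNodes for data blocks");
>     }
>     for (int i = allocatedLocations; i < NUM_DATA_BLOCKS + NUM_PARITY_BLOCKS; i++) {
>       System.out.println("WARN Failed to get block location for parity block, index=" + i);
>     }
>   }
>
>   public static void main(String[] args) {
>     allocateNewBlock(6); // a 6-DN cluster: succeeds, but indices 6..8 are never written
>   }
> }
> {code}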
> So, upon closing the file stream, we get the following warning messages (though they are not entirely accurate), because the parity blocks were never written out.
> {code}
> INFO  namenode.FSNamesystem (FSNamesystem.java:checkBlocksComplete(2726)) - BLOCK* blk_-9223372036854775792_1002 is COMMITTED but not COMPLETE(numNodes= 0 <  minimum = 6) in file /ec/test1
> INFO  hdfs.StateChange (FSNamesystem.java:completeFile(2679)) - DIR* completeFile: /ec/test1 is closed by DFSClient_NONMAPREDUCE_-1900076771_17
> WARN  hdfs.DFSOutputStream (DFSStripedOutputStream.java:logCorruptBlocks(1117)) - Block group <1> has 3 corrupt blocks. It's at high risk of losing data.
> {code}
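>
> For reference, the threshold the first log line refers to can be shown with a tiny standalone snippet (hypothetical values, not the real FSNamesystem#checkBlocksComplete code): a striped block group stays COMMITTED rather than COMPLETE until at least numDataBlocks internal blocks have been reported by DataNodes.
> {code}
> // Hypothetical standalone illustration of the COMMITTED-vs-COMPLETE threshold.
> public class CompletenessSketch {
>   public static void main(String[] args) {
>     int numNodes = 0; // internal blocks actually reported by DataNodes
>     int minimum = 6;  // numDataBlocks for RS-6-3-64K
>     if (numNodes < minimum) {
>       System.out.println("BLOCK* blk_x is COMMITTED but not COMPLETE(numNodes= "
>           + numNodes + " < minimum = " + minimum + ")");
>     }
>   }
> }
> {code}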
> I am not sure whether there are any practical limitations to placing multiple blocks of a Block Group onto the same node. At the very least, we could allow parity blocks to co-exist with data blocks whenever there are insufficient DNs in the cluster. Later, upon the addition of more DataNodes, the Block Placement Policy can detect the improper placement for such Block Groups and trigger EC reconstruction. A rough sketch of this fallback follows.
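>
> Sketching the proposed fallback (all names here are hypothetical, not an existing HDFS API): when there are fewer live DataNodes than the block group width, wrap around and reuse nodes already hosting data blocks for the parity blocks, rather than leaving the parity blocks unallocated.
> {code}
> // Hypothetical sketch of the proposed fallback placement, not real HDFS code.
> import java.util.ArrayList;
> import java.util.List;
>
> public class FallbackPlacementSketch {
>   static List<String> placeBlockGroup(List<String> liveNodes, int numData, int numParity) {
>     int width = numData + numParity;
>     List<String> chosen = new ArrayList<>();
>     // First pass: one distinct node per internal block while nodes remain.
>     for (int i = 0; i < width && i < liveNodes.size(); i++) {
>       chosen.add(liveNodes.get(i));
>     }
>     // Fallback: wrap around onto nodes already hosting data blocks, rather
>     // than dropping the parity blocks as today.
>     for (int i = chosen.size(); i < width; i++) {
>       chosen.add(liveNodes.get(i % liveNodes.size()));
>     }
>     return chosen;
>   }
>
>   public static void main(String[] args) {
>     List<String> dns = List.of("dn1", "dn2", "dn3", "dn4", "dn5", "dn6");
>     System.out.println(placeBlockGroup(dns, 6, 3));
>     // -> [dn1, dn2, dn3, dn4, dn5, dn6, dn1, dn2, dn3]
>   }
> }
> {code}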



