hadoop-hdfs-issues mailing list archives

From "Tsz Wo Nicholas Sze (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9818) Correctly handle EC reconstruction work caused by not enough racks
Date Thu, 18 Feb 2016 22:25:18 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15153241#comment-15153241 ]

Tsz Wo Nicholas Sze commented on HDFS-9818:
-------------------------------------------

- We may also make chooseSource4SimpleReplication static and slightly shorter.
{code}
  private static int chooseSource4SimpleReplication(DatanodeDescriptor[] dds) {
    // Group the indices of the given datanodes by rack.
    final Map<String, List<Integer>> map = new HashMap<>();
    for (int i = 0; i < dds.length; i++) {
      final String rack = dds[i].getNetworkLocation();
      List<Integer> list = map.get(rack);
      if (list == null) {
        list = new ArrayList<>();
        map.put(rack, list);
      }
      list.add(i);
    }

    // Find the rack holding the most datanodes.
    List<Integer> max = null;
    for (Map.Entry<String, List<Integer>> entry : map.entrySet()) {
      if (max == null || entry.getValue().size() > max.size()) {
        max = entry.getValue();
      }
    }
    // Return the first index on that rack; assumes dds is non-empty.
    return max.get(0);
  }
{code}
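
For trying the grouping logic outside of HDFS, here is a minimal standalone sketch. It is only an illustration, not part of the patch: the class name {{RackSourceChooser}} and the method {{chooseSource}} are hypothetical, and plain rack strings stand in for {{dds[i].getNetworkLocation()}}. It also swaps in a {{LinkedHashMap}} so that ties between racks resolve to the rack seen first, and rejects an empty input explicitly; neither choice comes from the snippet above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RackSourceChooser {
  // Hypothetical stand-in: racks[i] plays the role of dds[i].getNetworkLocation().
  static int chooseSource(String[] racks) {
    if (racks.length == 0) {
      throw new IllegalArgumentException("no source datanodes");
    }
    // Assumption for this sketch: LinkedHashMap keeps insertion order,
    // so a tie between racks resolves to the rack that was seen first.
    final Map<String, List<Integer>> byRack = new LinkedHashMap<>();
    for (int i = 0; i < racks.length; i++) {
      byRack.computeIfAbsent(racks[i], k -> new ArrayList<>()).add(i);
    }

    // Pick the rack with the most entries (strictly greater, so first wins ties).
    List<Integer> max = null;
    for (List<Integer> indices : byRack.values()) {
      if (max == null || indices.size() > max.size()) {
        max = indices;
      }
    }
    // Return the first index that landed on the most crowded rack.
    return max.get(0);
  }

  public static void main(String[] args) {
    // Rack /r2 holds two nodes (indices 1 and 2), so index 1 is chosen.
    System.out.println(chooseSource(new String[] {"/r1", "/r2", "/r2", "/r3"}));
    // Tie between /r1 and /r2: the first rack seen wins, so index 0 is chosen.
    System.out.println(chooseSource(new String[] {"/r1", "/r2"}));
  }
}
```

With a plain {{HashMap}}, as in the snippet above, the result for tied racks depends on the map's iteration order, which may or may not matter for choosing a source.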

> Correctly handle EC reconstruction work caused by not enough racks
> ------------------------------------------------------------------
>
>                 Key: HDFS-9818
>                 URL: https://issues.apache.org/jira/browse/HDFS-9818
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>    Affects Versions: 3.0.0
>            Reporter: Takuya Fukudome
>            Assignee: Jing Zhao
>         Attachments: HDFS-9818.000.patch, HDFS-9818.001.patch
>
>
> This is reported by [~tfukudom]:
> In a system test where 1 of 7 datanode racks was stopped, {{HadoopIllegalArgumentException}} was seen on the DataNode side while reconstructing missing EC blocks:
> {code}
> 2016-02-16 11:09:06,672 WARN  datanode.DataNode (ErasureCodingWorker.java:run(482)) - Failed to recover striped block: BP-480558282-172.29.4.13-1453805190696:blk_-9223372036850962784_278270
> org.apache.hadoop.HadoopIllegalArgumentException: Inputs not fully corresponding to erasedIndexes in null places. erasedOrNotToReadIndexes: [1, 2, 6], erasedIndexes: [3]
> 	at org.apache.hadoop.io.erasurecode.rawcoder.RSRawDecoder.doDecode(RSRawDecoder.java:166)
> 	at org.apache.hadoop.io.erasurecode.rawcoder.AbstractRawErasureDecoder.decode(AbstractRawErasureDecoder.java:84)
> 	at org.apache.hadoop.io.erasurecode.rawcoder.RSRawDecoder.decode(RSRawDecoder.java:89)
> 	at org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.recoverTargets(ErasureCodingWorker.java:683)
> 	at org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:465)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
