hadoop-hdfs-issues mailing list archives

From "Daniel Pol (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8198) Erasure Coding: system test of TeraSort
Date Sat, 04 Nov 2017 13:19:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16238998#comment-16238998 ]

Daniel Pol commented on HDFS-8198:

Terasort doesn't seem to work on my system with EC in beta1. Here's a small script to reproduce
the issue:

sudo -u hdfs bin/hdfs dfs -rm -r -skipTrash /ectest
sudo -u hdfs bin/hdfs dfs -mkdir /ectest
#sudo -u hdfs bin/hdfs ec -setPolicy -path /ectest -policy RS-3-2-1024k
sleep 5
sudo -u hdfs bin/yarn jar /ec/hadoop-3.0.0-beta1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-beta1.jar \
    teragen 100000000 /ectest/Input
sleep 30
sudo -u hdfs bin/yarn jar /ec/hadoop-3.0.0-beta1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-beta1.jar \
    teravalidate /ectest/Input /ectest/Validate
sleep 30
sudo -u hdfs bin/yarn jar /ec/hadoop-3.0.0-beta1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-beta1.jar \
    terasort /ectest/Input /ectest/Output

It works fine like this (with the EC policy line commented out), but it fails when you uncomment
the setPolicy line. Interestingly, it fails only at the Terasort step, when reading
the input files; Teravalidate, which runs before it, reads the same files and doesn't
fail. Fsck shows everything is fine, and checking the nodes individually, all the files are there.
I've tried all the default codecs and policies (native and Java); they all give me the same error:
missing blocks. The error shows up only when the amount of data becomes big enough, so make sure
you use the number of records I have in my script or higher.
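
To compare the two cases side by side, the script could be wrapped so the EC policy
is a switch instead of a commented-out line. The following is only a sketch under
assumptions: the USE_EC, DRY_RUN and RECORDS variables and the run/repro_steps helpers
are hypothetical names, the fsck step is added here for convenience, and the paths are
copied from the script above. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: the same reproduction steps, with the EC policy as a toggle.
# USE_EC, DRY_RUN, RECORDS, run and repro_steps are hypothetical names.
set -u

JAR=/ec/hadoop-3.0.0-beta1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-beta1.jar
RECORDS=${RECORDS:-100000000}  # record count from the original script
USE_EC=${USE_EC:-1}            # 1 = set RS-3-2-1024k on /ectest, 0 = skip it
DRY_RUN=${DRY_RUN:-1}          # 1 = print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

repro_steps() {
  run sudo -u hdfs bin/hdfs dfs -rm -r -skipTrash /ectest
  run sudo -u hdfs bin/hdfs dfs -mkdir /ectest
  if [ "$USE_EC" = 1 ]; then
    run sudo -u hdfs bin/hdfs ec -setPolicy -path /ectest -policy RS-3-2-1024k
  fi
  run sleep 5
  run sudo -u hdfs bin/yarn jar "$JAR" teragen "$RECORDS" /ectest/Input
  # Added step (not in the original script): block report for the generated input.
  run sudo -u hdfs bin/hdfs fsck /ectest/Input -files -blocks
  run sleep 30
  run sudo -u hdfs bin/yarn jar "$JAR" teravalidate /ectest/Input /ectest/Validate
  run sleep 30
  run sudo -u hdfs bin/yarn jar "$JAR" terasort /ectest/Input /ectest/Output
}

repro_steps
```

Running it once with DRY_RUN=0 USE_EC=0 and once with DRY_RUN=0 USE_EC=1 from the Hadoop
install directory should give the passing and failing runs described above.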

> Erasure Coding: system test of TeraSort
> ---------------------------------------
>                 Key: HDFS-8198
>                 URL: https://issues.apache.org/jira/browse/HDFS-8198
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7285
>            Reporter: Kai Sasaki
>            Priority: Major
> Functional system test of TeraSort on EC files.

