hadoop-hdfs-user mailing list archives

From Terance Dias <terance.d...@gmail.com>
Subject Shuffle Error after enabling Kerberos authentication
Date Sat, 19 Apr 2014 12:32:20 GMT
Hi,

I'm using Apache Hadoop 2.1.0-beta. I was able to set up a basic multi-node
cluster and run MapReduce jobs. But when I enable Kerberos authentication,
the reduce task fails with the following error:

Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:121)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:380)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
    at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:311)
    at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:243)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:347)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)

I searched and found that people have generally seen this error when
their network configuration is incorrect, so the data nodes are not
able to communicate with each other to shuffle the data. I don't think that
is the problem in my case, because everything works fine when Kerberos
authentication is disabled. Any idea what the problem could be?
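
For anyone who wants to rule out the network explanation directly, a
minimal sketch of a connectivity check follows. It only tests whether a
TCP connection to the shuffle port can be opened from the node it runs on;
it does not exercise Kerberos at all. The host names are hypothetical
placeholders, and port 13562 is assumed because it is the default value of
mapreduce.shuffle.port; a cluster that overrides that setting would need
the configured value instead.

```python
import socket

# Hypothetical host names: replace with the NodeManager hosts of the cluster.
NODES = ["node1.example.com", "node2.example.com"]

# Assumed port: 13562 is the default for mapreduce.shuffle.port.
SHUFFLE_PORT = 13562


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_nodes(nodes, port):
    """Map each host name to whether its port accepted a connection."""
    return {host: can_connect(host, port) for host in nodes}


if __name__ == "__main__":
    for host, ok in check_nodes(NODES, SHUFFLE_PORT).items():
        status = "reachable" if ok else "NOT reachable"
        print(f"{host}:{SHUFFLE_PORT} {status}")
```

If every node reports the port as reachable from every other node, plain
connectivity is unlikely to be the cause, which would be consistent with
the job succeeding when Kerberos is disabled.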

Thanks,
Terance.
