hadoop-user mailing list archives

From "leiwangouc@gmail.com" <leiwang...@gmail.com>
Subject Pig: java.lang.String cannot be cast to org.apache.pig.data.DataBag in specified map task
Date Tue, 15 Apr 2014 04:23:51 GMT

Hi, 

   I am using Cloudera and running a MapReduce job written in Pig Latin. I hit the following
exception in a map task:
014-04-15 11:30:39,532 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.DataBag
	at org.apache.pig.builtin.Distinct.getDistinctFromNestedBags(Distinct.java:140)
	at org.apache.pig.builtin.Distinct.access$100(Distinct.java:39)
	at org.apache.pig.builtin.Distinct$Intermediate.exec(Distinct.java:101)
	at org.apache.pig.builtin.Distinct$Intermediate.exec(Distinct.java:94)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:337)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:376)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:354)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:372)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:297)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:308)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:263)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODemux.runPipeline(PODemux.java:220)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODemux.getNext(PODemux.java:210)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.processOnePackageOutput(PigCombiner.java:185)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:163)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:51)
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:164)
	at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1477)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1587)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1199)
	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:609)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:675)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
	at org.apache.hadoop.mapred.Child.main(Child.java:262)
Looking at the stack trace, I think the exception is thrown here:
http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/org.apache.pig/pig/0.11.0-cdh4.3.1/org/apache/pig/builtin/Distinct.java
 line 140
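
To illustrate the kind of failure I mean, here is a minimal sketch. It does not use or copy the Pig source: Bag, ListBag and countNested are hypothetical stand-ins I made up, and I am only assuming that the code around that line casts each field to a bag without checking its type first. If one field turns out to be a String instead, the cast fails the same way:

```java
import java.util.ArrayList;
import java.util.List;

public class CastSketch {
    // Hypothetical stand-in for a bag type; NOT org.apache.pig.data.DataBag.
    interface Bag {
        int size();
    }

    // Hypothetical minimal bag backed by a list.
    static class ListBag implements Bag {
        private final List<Object> items = new ArrayList<>();

        void add(Object o) {
            items.add(o);
        }

        @Override
        public int size() {
            return items.size();
        }
    }

    // Assumes every field is a Bag and casts without an instanceof check.
    static int countNested(List<Object> fields) {
        int total = 0;
        for (Object field : fields) {
            Bag bag = (Bag) field; // ClassCastException if field is a String
            total += bag.size();
        }
        return total;
    }

    // Returns true if the unchecked cast fails for the given fields.
    static boolean failsWithClassCast(List<Object> fields) {
        try {
            countNested(fields);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ListBag bag = new ListBag();
        bag.add("a");

        List<Object> allBags = new ArrayList<>();
        allBags.add(bag);

        List<Object> mixed = new ArrayList<>();
        mixed.add(bag);
        mixed.add("a plain string where a bag was expected");

        System.out.println("all bags fails: " + failsWithClassCast(allBags));
        System.out.println("mixed fails: " + failsWithClassCast(mixed));
    }
}
```

This only shows how such a cast can fail. It does not explain why a String would reach that code in one attempt and not in the next.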

However, the second attempt of this map task succeeded. Both attempts used exactly the same data
and the same code. This really confuses me.

Any insight about this?

Thanks,
Lei
 


leiwangouc@gmail.com