crunch-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Josh Wills <josh.wi...@gmail.com>
Subject Re: org.apache.avro.UnresolvedUnionException
Date Mon, 13 Apr 2015 23:24:35 GMT
Hey Lucy,

I don't grok the last MapFn before the lgr gets written out; it looks like
it's defined over an Iterable<ABCData>, but the map() function defined
inside the class is over Iterable<LingPipeData>. I assume that's the source
of the problem-- the value that is getting printed out is the string form
of a LingPipeData object, which isn't what the system expects to see.

J

On Mon, Apr 13, 2015 at 7:12 PM, Lucy Chen <lucychen2014fall@gmail.com>
wrote:

> Hi,
>
>      I have an exception of org.apache.avro.UnresolvedUnionException
> thrown out by the following codes:
>
>
> PType<ABCData> ABCDataType = Avros.records(ABCData.class);
>
> PTable<String, ABCData> ABC = input.mapValues(new
> ConvertToABCData(feat_index_mapping, addIntercept), ABCDataType);
>
>
> *******************************************************************************************************
>
>
> PTable<String, String> lgr = ABC.groupByKey().
>
>                        mapValues(new MapFn<Iterable<ABCData>, String> {
>
>                          @Override
>
> public String map(Iterable<LingPipeData> input)
>
> {
>
> Iterator<LingPipeData> ite1 = input.iterator();
>
> int counter=0;
>
> while(ite1.hasNext())
>
> {
>
> counter++;
>
> }
>
>  return Integer.toString(counter);
>
>
>                        }
>
>                           }, Avros.strings());
>
> lgr.write(At.textFile(output_path), WriteMode.OVERWRITE);
>
>
> ****************************************************************************************************************
>
>
> public class ConvertToABCData extends MapFn<InputType, ABCData>{
>
>
> private FeatIndexMapping feat_index_mapping;
>
> private boolean addIntercept;
>
>  public ConvertToABCData(FeatIndexMapping feat_index_mapping, boolean
> addIntercept)
>
> {
>
> this.feat_index_mapping = feat_index_mapping;
>
> this.addIntercept = addIntercept;
>
> }
>
>  @Override
>
> public  ABCData map(InputType input)
>
> {
>
> return new ABCData(input, feat_index_mapping, addIntercept);
>
>  }
>
>
> }
>
>
> public class ABCData implements java.io.Serializable, Cloneable{
>
>
> private int label;
>
> private Vector feature;
>
> private int dim;
>
> private final static Logger logger = Logger
>
>        .getLogger(ABCData.class.getName());
>
>         ......
>
> }
>
>
> Here Vector is defined from third party: com.aliasi.matrix.Vector; The
> codes can run well until the line of star. But when the codes include
> ABC.groupByKey().mapValues(), the following exception will be caught. Can
> any one tell me how to solve the problem?
>
>
> Thanks.
>
>
> Lucy
>
>
>  The logs look like:
>
>
> org.apache.crunch.CrunchRuntimeException:
> org.apache.avro.file.DataFileWriter$AppendWriteException:
> org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at
> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:45)
>
> at org.apache.crunch.MapFn.process(MapFn.java:34)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at org.apache.crunch.MapFn.process(MapFn.java:34)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at
> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:14)
>
> at
> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:10)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at org.apache.crunch.lib.join.InnerJoinFn.join(InnerJoinFn.java:66)
>
> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:79)
>
> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:32)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at org.apache.crunch.MapFn.process(MapFn.java:34)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)
>
> at
> org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)
>
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>
> at java.lang.Thread.run(Thread.java:744)
>
> Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException:
> org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)
>
> at
> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:87)
>
> at
> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:84)
>
> at org.apache.crunch.io.CrunchOutputs.write(CrunchOutputs.java:133)
>
> at
> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:41)
>
> ... 28 more
>
> Caused by: org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:561)
>
> at
> org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:144)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>
> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)
>
> ... 32 more
>
> 2015-04-13 15:49:16,876 INFO  [Thread-500] mapred.LocalJobRunner
> (LocalJobRunner.java:runTasks(456)) - reduce task executor complete.
>
> 2015-04-13 15:49:16,879 WARN  [Thread-500] mapred.LocalJobRunner
> (LocalJobRunner.java:run(560)) - job_local918028004_0008
>
> java.lang.Exception: org.apache.crunch.CrunchRuntimeException:
> org.apache.avro.file.DataFileWriter$AppendWriteException:
> org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
>
> Caused by: org.apache.crunch.CrunchRuntimeException:
> org.apache.avro.file.DataFileWriter$AppendWriteException:
> org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at
> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:45)
>
> at org.apache.crunch.MapFn.process(MapFn.java:34)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at org.apache.crunch.MapFn.process(MapFn.java:34)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at
> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:14)
>
> at
> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:10)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at org.apache.crunch.lib.join.InnerJoinFn.join(InnerJoinFn.java:66)
>
> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:79)
>
> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:32)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at org.apache.crunch.MapFn.process(MapFn.java:34)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)
>
> at
> org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)
>
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>
> at java.lang.Thread.run(Thread.java:744)
>
> Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException:
> org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)
>
> at
> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:87)
>
> at
> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:84)
>
> at org.apache.crunch.io.CrunchOutputs.write(CrunchOutputs.java:133)
>
> at
> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:41)
>
> ... 28 more
>
> Caused by: org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:561)
>
> at
> org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:144)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>
> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)
>
> ... 32 more
>
> 2 job failure(s) occurred:
>
> (5): Depending job with jobID 1 failed.
>
> com.apple.rsp.CrossValidation.CrossValidationDriver:
> [[Text(/Users/luren/Lu/Project/model_testing/training_set... ID=1 (5/6)(1):
> Job failed!
>
>

Mime
View raw message