From: Josh Wills
Date: Mon, 13 Apr 2015 21:37:29 -0400
Subject: Re: org.apache.avro.UnresolvedUnionException
To: user@crunch.apache.org

Oh, okay. Would you humor me and try to use parallelDo(...) instead of
mapValues(...) after the groupByKey() call and see if that works? I have
this weird feeling that mapValues is doing something it shouldn't be
doing to the Iterable.
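Untested, but the parallelDo version I have in mind would look something
like this -- the Avros.tableOf(...) table type is my assumption, to keep
lgr a PTable<String, String>:

    PTable<String, String> lgr = ABC.groupByKey().parallelDo(
        new DoFn<Pair<String, Iterable<ABCData>>, Pair<String, String>>() {
          @Override
          public void process(Pair<String, Iterable<ABCData>> input,
              Emitter<Pair<String, String>> emitter) {
            // Count the grouped values and emit one (key, count) pair per key.
            int counter = 0;
            for (ABCData d : input.second()) {
              counter++;
            }
            emitter.emit(Pair.of(input.first(), Integer.toString(counter)));
          }
        }, Avros.tableOf(Avros.strings(), Avros.strings()));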
J

On Mon, Apr 13, 2015 at 9:34 PM, Lucy Chen <lucychen2014fall@gmail.com> wrote:

> Hi Josh,
>
>      Thanks for your quick response. The code should be as follows: I
> renamed LingPipeData when I copied the code into the email and forgot to
> change it in a couple of places, so I have simply changed LingPipeData to
> ABCData to make it easier to follow. I use the LingPipe package inside my
> Crunch jobs, and I suspect that the Vector included in ABCData causes
> trouble when it is serialized as an Avro type. When the code excludes the
> parts after "*******" and just writes ABC as an output, it works fine;
> but after adding ABC.groupByKey().mapValues(...), it throws the exception.
>
>      Sorry about the typos in my last email.
>
>      Thanks!
>
> Lucy
>
> PType<ABCData> ABCDataType = Avros.records(ABCData.class);
>
> PTable<String, ABCData> ABC = input.mapValues(new
> ConvertToABCData(feat_index_mapping, addIntercept), ABCDataType);
>
> *******************************************************************************************************
>
> PTable<String, String> lgr = ABC.groupByKey().
>     mapValues(new MapFn<Iterable<ABCData>, String>() {
>       @Override
>       public String map(Iterable<ABCData> input) {
>         // Count the values grouped under each key.
>         Iterator<ABCData> ite1 = input.iterator();
>         int counter = 0;
>         while (ite1.hasNext()) {
>           ite1.next(); // advance the iterator so the loop terminates
>           counter++;
>         }
>         return Integer.toString(counter);
>       }
>     }, Avros.strings());
>
> lgr.write(At.textFile(output_path), WriteMode.OVERWRITE);
>
> ****************************************************************************************************************
>
> public class ConvertToABCData extends MapFn<InputType, ABCData> {
>
>   private FeatIndexMapping feat_index_mapping;
>   private boolean addIntercept;
>
>   public ConvertToABCData(FeatIndexMapping feat_index_mapping, boolean addIntercept) {
>     this.feat_index_mapping = feat_index_mapping;
>     this.addIntercept = addIntercept;
>   }
>
>   @Override
>   public ABCData map(InputType input) {
>     return new ABCData(input, feat_index_mapping, addIntercept);
>   }
> }
>
> public class ABCData implements java.io.Serializable, Cloneable {
>
>   private int label;
>   private Vector feature;
>   private int dim;
>   private final static Logger logger = Logger.getLogger(ABCData.class.getName());
>
>   ......
> }
>
> On Mon, Apr 13, 2015 at 4:24 PM, Josh Wills <josh.wills@gmail.com> wrote:
>
>> Hey Lucy,
>>
>> I don't grok the last MapFn before the lgr gets written out; it looks
>> like it's defined over an Iterable<ABCData>, but the map() function
>> defined inside the class is over Iterable<LingPipeData>. I assume that's
>> the source of the problem-- the value that is getting printed out is the
>> string form of a LingPipeData object, which isn't what the system expects
>> to see.
>>
>> J
>>
>> On Mon, Apr 13, 2015 at 7:12 PM, Lucy Chen <lucychen2014fall@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I have an org.apache.avro.UnresolvedUnionException thrown by the
>>> following code:
>>>
>>> PType<ABCData> ABCDataType = Avros.records(ABCData.class);
>>>
>>> PTable<String, ABCData> ABC = input.mapValues(new
>>> ConvertToABCData(feat_index_mapping, addIntercept), ABCDataType);
>>>
>>> *******************************************************************************************************
>>>
>>> PTable<String, String> lgr = ABC.groupByKey().
>>>     mapValues(new MapFn<Iterable<ABCData>, String>() {
>>>       @Override
>>>       public String map(Iterable<LingPipeData> input) {
>>>         Iterator<LingPipeData> ite1 = input.iterator();
>>>         int counter = 0;
>>>         while (ite1.hasNext()) {
>>>           ite1.next();
>>>           counter++;
>>>         }
>>>         return Integer.toString(counter);
>>>       }
>>>     }, Avros.strings());
>>>
>>> lgr.write(At.textFile(output_path), WriteMode.OVERWRITE);
>>>
>>> ****************************************************************************************************************
>>>
>>> public class ConvertToABCData extends MapFn<InputType, ABCData> {
>>>
>>>   private FeatIndexMapping feat_index_mapping;
>>>   private boolean addIntercept;
>>>
>>>   public ConvertToABCData(FeatIndexMapping feat_index_mapping, boolean addIntercept) {
>>>     this.feat_index_mapping = feat_index_mapping;
>>>     this.addIntercept = addIntercept;
>>>   }
>>>
>>>   @Override
>>>   public ABCData map(InputType input) {
>>>     return new ABCData(input, feat_index_mapping, addIntercept);
>>>   }
>>> }
>>>
>>> public class ABCData implements java.io.Serializable, Cloneable {
>>>
>>>   private int label;
>>>   private Vector feature;
>>>   private int dim;
>>>   private final static Logger logger = Logger.getLogger(ABCData.class.getName());
>>>
>>>   ......
>>> }
>>>
>>> Here Vector comes from a third-party library: com.aliasi.matrix.Vector.
>>> The code runs fine up to the line of stars, but when it includes
>>> ABC.groupByKey().mapValues(), the following exception is thrown. Can
>>> anyone tell me how to solve the problem?
>>>
>>> Thanks.
>>>
>>> Lucy
>>>
>>> The logs look like:
>>>
>>> org.apache.crunch.CrunchRuntimeException: org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]: 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>   at org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:45)
>>>   at org.apache.crunch.MapFn.process(MapFn.java:34)
>>>   at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>   at org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>   at org.apache.crunch.MapFn.process(MapFn.java:34)
>>>   at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>   at org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>   at com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:14)
>>>   at com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:10)
>>>   at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>   at org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>   at org.apache.crunch.lib.join.InnerJoinFn.join(InnerJoinFn.java:66)
>>>   at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:79)
>>>   at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:32)
>>>   at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>   at org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>   at org.apache.crunch.MapFn.process(MapFn.java:34)
>>>   at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>   at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)
>>>   at org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)
>>>   at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>>>   at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>>>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>>>   at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>>>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>   at java.lang.Thread.run(Thread.java:744)
>>> Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]: 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>   at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)
>>>   at org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:87)
>>>   at org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:84)
>>>   at org.apache.crunch.io.CrunchOutputs.write(CrunchOutputs.java:133)
>>>   at org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:41)
>>>   ... 28 more
>>> Caused by: org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]: 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>   at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:561)
>>>   at org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:144)
>>>   at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
>>>   at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>   at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>>>   at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>>>   at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>   at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>>>   at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>   at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>>>   at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>>>   at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>   at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>>>   at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)
>>>   ... 32 more
>>>
>>> 2015-04-13 15:49:16,876 INFO  [Thread-500] mapred.LocalJobRunner (LocalJobRunner.java:runTasks(456)) - reduce task executor complete.
>>> 2015-04-13 15:49:16,879 WARN  [Thread-500] mapred.LocalJobRunner (LocalJobRunner.java:run(560)) - job_local918028004_0008
>>> java.lang.Exception: org.apache.crunch.CrunchRuntimeException: org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]: 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>   at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>>>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
>>> Caused by: org.apache.crunch.CrunchRuntimeException: org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]: 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>   at org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:45)
>>>   at org.apache.crunch.MapFn.process(MapFn.java:34)
>>>   at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>   at org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>   at org.apache.crunch.MapFn.process(MapFn.java:34)
>>>   at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>   at org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>   at com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:14)
>>>   at com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:10)
>>>   at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>   at org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>   at org.apache.crunch.lib.join.InnerJoinFn.join(InnerJoinFn.java:66)
>>>   at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:79)
>>>   at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:32)
>>>   at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>   at org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>   at org.apache.crunch.MapFn.process(MapFn.java:34)
>>>   at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>   at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)
>>>   at org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)
>>>   at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>>>   at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>>>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>>>   at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>>>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>   at java.lang.Thread.run(Thread.java:744)
>>> Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]: 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>   at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)
>>>   at org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:87)
>>>   at org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:84)
>>>   at org.apache.crunch.io.CrunchOutputs.write(CrunchOutputs.java:133)
>>>   at org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:41)
>>>   ... 28 more
>>> Caused by: org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]: 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>   at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:561)
>>>   at org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:144)
>>>   at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
>>>   at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>   at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>>>   at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>>>   at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>   at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>>>   at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>   at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>>>   at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>>>   at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>   at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>>>   at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)
>>>   ... 32 more
>>>
>>> 2 job failure(s) occurred:
>>>
>>> (5): Depending job with jobID 1 failed.
>>>
>>> com.apple.rsp.CrossValidation.CrossValidationDriver: [[Text(/Users/luren/Lu/Project/model_testing/training_set... ID=1 (5/6)(1): Job failed!
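P.S. One more thought on the root cause: the union in the error message is
["null", Vector], and the Vector record schema has no fields ("fields":[]).
That is what Avro reflection produces when it can't find any serializable
fields on com.aliasi.matrix.Vector, so no concrete vector instance can ever
match the schema. If parallelDo doesn't change anything, a workaround may be
to keep the vector's contents in an Avro-friendly field of ABCData and
rebuild the LingPipe Vector on demand. A rough, untested sketch -- the
double[] field and the DenseVector rebuild are my guesses at what would fit
your data:

    public class ABCData implements java.io.Serializable, Cloneable {

      private int label;
      private int dim;
      // Store the raw values instead of the third-party Vector so that
      // Avro reflection has concrete fields to serialize.
      private double[] values;

      // Rebuild the LingPipe vector only when it is needed.
      public com.aliasi.matrix.Vector feature() {
        return new com.aliasi.matrix.DenseVector(values);
      }
    }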
--
Director of Data Science
Cloudera
Twitter: @josh_wills