crunch-dev mailing list archives

From "Micah Whitacre (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-363) Cogroup using Protobufs with WritableTypeFamily throws Proto Exception
Date Wed, 12 Mar 2014 23:25:45 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932643#comment-13932643 ]

Micah Whitacre commented on CRUNCH-363:
---------------------------------------

I haven't pinned down the exact cause of the problem, but I think one contributing factor is
that we are wrapping the protos in two layers of BytesWritable and only removing one.
Specifically, at this code[1], "w" is already a BytesWritable. I tweaked the code
to confirm my suspicion, so it now looks like this:

      Writable w = (Writable) fns.get(index).map(input.getValue());
      if (w instanceof BytesWritable) {
          // Already a BytesWritable: pass it through without wrapping it again.
          return new UnionWritable(index, (BytesWritable) w);
      } else {
          return new UnionWritable(index, new BytesWritable(WritableUtils.toByteArray(w)));
      }

(horrible code, I know)

The MR job now completes without issue, but the same exception is then thrown during the materialize.
So I think the issue is actually in the deserialization process rather than in serialization.
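
As a rough illustration of why a leftover wrapper layer would show up as this particular exception (a sketch, not a confirmed diagnosis): BytesWritable serializes itself as a 4-byte big-endian length followed by the payload. If a payload is framed twice and only one layer is stripped, the bytes handed to the protobuf parser start with that length prefix, so for any small message the first byte is 0x00, which protobuf rejects as tag zero. The class below mimics that framing with plain java.io so it runs without Hadoop; DoubleWrapSketch, frame, unframe, and the sample message bytes are hypothetical names and data made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class DoubleWrapSketch {

    // Mimics the framing BytesWritable.write() uses: a 4-byte big-endian
    // length followed by the raw payload bytes.
    static byte[] frame(byte[] payload) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(payload.length);
        out.write(payload, 0, payload.length);
        out.flush();
        return buf.toByteArray();
    }

    // Strips exactly one layer of framing, as a reader that expects a
    // single wrapper would.
    static byte[] unframe(byte[] framed) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(framed));
        int length = in.readInt();
        byte[] payload = new byte[length];
        in.readFully(payload);
        return payload;
    }

    public static void main(String[] args) throws IOException {
        // Made-up protobuf bytes: field 1, wire type 2 (tag 0x0A), length 3, "Bob".
        byte[] proto = {0x0A, 0x03, 'B', 'o', 'b'};

        byte[] wrappedOnce = unframe(frame(proto));
        byte[] wrappedTwice = unframe(frame(frame(proto)));

        // One layer in, one layer out: the parser sees the real tag byte.
        System.out.printf("wrapped once, first byte:  0x%02X%n", wrappedOnce[0]);
        // Two layers in, one layer out: the parser sees the inner length prefix.
        System.out.printf("wrapped twice, first byte: 0x%02X%n", wrappedTwice[0]);
    }
}
```

In the second case the parser would be looking at 00 00 00 05 0A 03 ..., which matches the "invalid tag (zero)" message. The same symptom would appear if the extra layer were introduced on the read side, so this does not by itself say which side is at fault.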

[1] - https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/types/writable/Writables.java#L655

> Cogroup using Protobufs with WritableTypeFamily throws Proto Exception
> ----------------------------------------------------------------------
>
>                 Key: CRUNCH-363
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-363
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.8.2
>            Reporter: Micah Whitacre
>            Assignee: Josh Wills
>         Attachments: CRUNCH-363-test.patch
>
>
> If you have code like the following:
> {code}
> PTable<String, Proto> t1 = ...
> PTable<String, Proto> t2 = ...
> t1.cogroup(t2);
> {code}
> Where the PType of each table was created using:
> ptf.tableOf(ptf.strings(), PTypes.protos(Person.class, ptf));
> and "ptf" is an instance of WritableTypeFamily.
> You will get an exception like the following.
> {quote}
> org.apache.crunch.CrunchRuntimeException: com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
> 	at org.apache.crunch.types.PTypes$ProtoInputMapFn.map(PTypes.java:191)
> 	at org.apache.crunch.types.PTypes$ProtoInputMapFn.map(PTypes.java:160)
> 	at org.apache.crunch.fn.CompositeMapFn.map(CompositeMapFn.java:63)
> 	at org.apache.crunch.types.writable.Writables$UWInputFn.map(Writables.java:611)
> 	at org.apache.crunch.types.writable.Writables$UWInputFn.map(Writables.java:573)
> 	at org.apache.crunch.types.PGroupedTableType$HoldLastIterator.next(PGroupedTableType.java:84)
> 	at org.apache.crunch.lib.Cogroup$PostGroupFn.map(Cogroup.java:275)
> 	at org.apache.crunch.lib.Cogroup$PostGroupFn.map(Cogroup.java:250)
> 	at org.apache.crunch.fn.PairMapFn.map(PairMapFn.java:62)
> 	at org.apache.crunch.fn.PairMapFn.map(PairMapFn.java:26)
> 	at org.apache.crunch.MapFn.process(MapFn.java:34)
> 	at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
> 	at org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
> 	at org.apache.crunch.MapFn.process(MapFn.java:34)
> 	at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
> 	at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)
> 	at org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)
> 	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
> 	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:650)
> 	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:262)
> Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
> 	at com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89)
> 	at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108)
> 	at org.apache.crunch.lib.PersonProtos$Person.<init>(PersonProtos.java:101)
> 	at org.apache.crunch.lib.PersonProtos$Person.<init>(PersonProtos.java:65)
> 	at org.apache.crunch.lib.PersonProtos$Person$1.parsePartialFrom(PersonProtos.java:153)
> 	at org.apache.crunch.lib.PersonProtos$Person$1.parsePartialFrom(PersonProtos.java:148)
> 	at org.apache.crunch.lib.PersonProtos$Person$Builder.mergeFrom(PersonProtos.java:484)
> 	at org.apache.crunch.lib.PersonProtos$Person$Builder.mergeFrom(PersonProtos.java:369)
> 	at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:196)
> 	at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:898)
> 	at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:267)
> 	at org.apache.crunch.types.PTypes$ProtoInputMapFn.map(PTypes.java:189)
> 	... 20 more
> {quote}
> Note you don't get the same exception for AvroTypeFamily.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
