crunch-dev mailing list archives

From "Gabriel Reid (JIRA)" <>
Subject [jira] [Commented] (CRUNCH-338) TupleDeepCopier throws java.lang.ClassCastException: java.util.ArrayList cannot be cast to org.apache.avro.generic.IndexedRecord
Date Wed, 05 Feb 2014 08:02:10 GMT


Gabriel Reid commented on CRUNCH-338:

Thanks for the stack trace. I'm still having a hard time reproducing this issue in my own
code, but there is one thing that would be good to rule out (something a colleague here ran into recently).

The constructor (and static factory methods) for TupleN take a varargs parameter, and the
issue I saw someone else hit recently was that they were passing a List to
TupleN.of() instead of an array. If something is done with the TupleN before it needs to be
serialized, this won't throw an exception, and it's one of the places where the compiler won't
catch anything either. Having two branches coming off of the same PTable causes the deep copier
to be used (which implicitly uses serialization to perform deep copies), and so it could cause
this error to be thrown.
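To illustrate the mechanism in isolation (a minimal, self-contained sketch, not Crunch's actual classes: IndexedRecordLike and deepCopy below are hypothetical stand-ins for Avro's IndexedRecord and AvroDeepCopier.deepCopy):

```java
import java.util.ArrayList;

public class DeepCopyCast {
    // Hypothetical stand-in for Avro's IndexedRecord: the deep copier
    // assumes every value it receives implements this interface.
    interface IndexedRecordLike {
        IndexedRecordLike copy();
    }

    // Analogous to a serialization-based deep copy: the cast succeeds
    // only when the runtime type matches what the schema promised.
    static IndexedRecordLike deepCopy(Object value) {
        return ((IndexedRecordLike) value).copy();
    }

    public static void main(String[] args) {
        // A List smuggled in where a record was expected fails only here,
        // at deep-copy time, not when the value was originally constructed.
        Object wrongValue = new ArrayList<String>();
        try {
            deepCopy(wrongValue);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException at deep-copy time");
        }
    }
}
```

This is why the error only shows up once a second branch forces the deep copier to run.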

I'm able to reproduce the same stack trace by "forgetting" to pass an array (instead of a list)
of elements when constructing a TupleN, so I'd like to rule that option out. Can you double-check
how the TupleN is being constructed in your case? And if that looks fine, could you post
a small example that demonstrates the error? I'll post the example code that I've put together
to try to replicate this issue.
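For what it's worth, the varargs pitfall is easy to demonstrate outside Crunch. This is a hedged sketch with a hypothetical of() helper standing in for TupleN.of(), not the real Crunch API:

```java
import java.util.Arrays;
import java.util.List;

public class VarargsPitfall {
    // Hypothetical stand-in for TupleN.of(Object... values): varargs
    // collects the arguments into an Object[].
    static Object[] of(Object... values) {
        return values;
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("a", "b", "c");

        // Intended usage: pass an array, so each element becomes a tuple field.
        Object[] correct = of(fields.toArray());
        System.out.println(correct.length); // 3

        // The mistake: passing the List itself makes it a single varargs
        // argument, so the "tuple" has one field whose value is the List.
        Object[] wrong = of(fields);
        System.out.println(wrong.length);             // 1
        System.out.println(wrong[0] instanceof List); // true
    }
}
```

The compiler happily accepts both calls, which is why nothing fails until the value needs to be serialized or deep-copied later in the pipeline.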

> TupleDeepCopier throws java.lang.ClassCastException: java.util.ArrayList cannot be cast
to org.apache.avro.generic.IndexedRecord
> --------------------------------------------------------------------------------------------------------------------------------
>                 Key: CRUNCH-338
>                 URL:
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.8.2
>            Reporter: Laxmikanth Samudrala
>            Assignee: Josh Wills
>         Attachments: stack-trace.log
> Using a PTable<String, TupleN> twice and performing parallelDo causes a java.lang.ClassCastException;
when the same PTable<String, TupleN> is used once for parallelDo, the exception does not occur.
Converting the PTable<String, TupleN> to PTable<String, Pair<Tuple4.Collect<?, ?,
?, ?>, Tuple3.Collect<?, ?, ?>>> and using the paired PTable twice for parallelDo
also does not cause any exception.
> Note: The root cause seems to be whether the items passed to the TupleN are treated as a collection
or as a single instance. Surprisingly, performing a parallelDo operation once on the PTable<String,
TupleN> works with no exceptions, while performing parallelDo twice seems to
invoke TupleDeepCopier.deepCopy, which triggers the exception.
> Template of Code :
> Failure case :
> PTable<String, TupleN> entityData = .....;
> entityData.parallelDo(.....);
> entityData.parallelDo(.....);
> Success Case :
> PTable<String, TupleN> entityData = .....;
> entityData.parallelDo(.....);
> Another success case :
> PTable<String, Pair<Tuple4.Collect<?, ?, ?, ?>, Tuple3.Collect<?, ?, ?>>>
> entityData.parallelDo(.....);
> entityData.parallelDo(.....);
> stack trace for reference :
> org.apache.crunch.CrunchRuntimeException: Error while deep copying avro value [.........]
> 	at org.apache.crunch.types.avro.AvroDeepCopier.deepCopy(
> 	at org.apache.crunch.types.avro.AvroDeepCopier$AvroSpecificDeepCopier.deepCopy(
> 	at org.apache.crunch.types.avro.AvroType.getDetachedValue(
> 	at org.apache.crunch.types.TupleDeepCopier.deepCopy(
> 	at org.apache.crunch.types.TupleDeepCopier.deepCopy(
> 	at org.apache.crunch.types.avro.AvroType.getDetachedValue(
> 	at org.apache.crunch.lib.PTables.getDetachedValue(
> 	at org.apache.crunch.types.avro.AvroTableType.getDetachedValue(
> 	at org.apache.crunch.types.avro.AvroTableType.getDetachedValue(
> 	at
> 	at org.apache.crunch.MapFn.process(
> 	at
> 	at
> 	at org.apache.crunch.MapFn.process(
> 	at
> 	at
> 	at
> 	at
> 	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(
> 	at
> 	at org.apache.hadoop.mapred.LocalJobRunner$

This message was sent by Atlassian JIRA
