crunch-user mailing list archives

From Lucy Chen <lucychen2014f...@gmail.com>
Subject Exception from Set.difference
Date Thu, 02 Apr 2015 21:00:59 GMT
Hi,

I am trying to compute a set difference as follows:

PCollection<MyClass> C = Set.difference(A, B);

Here both A and B are of type PCollection<MyClass>.


MyClass is defined as follows:


public class MyClass implements java.io.Serializable, Cloneable {

    private String a;
    private String b;
    private int c;
    private Map<String, Double> d;
    private int e;

    public MyClass() {
        this(null, null, 0, new HashMap<String, Double>());
    }

    public MyClass(String labelID, String sampleID, Integer pos_neg_ind,
                   HashMap<String, Double> feat_val_pair) {
        ......
    }

    public MyClass(String input) {
        .....
    }

    .....
}


Running the set difference produces the error below. Is it because MyClass
has a Map member (d)? If so, is there another way to compute the set
difference for these inputs?


      Thanks!


Lucy


java.lang.Exception: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: Error while doing final merge
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: Error while doing final merge
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:160)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.avro.AvroRuntimeException: Can't compare maps!
    at org.apache.avro.io.BinaryData.compare(BinaryData.java:134)
    at org.apache.avro.io.BinaryData.compare(BinaryData.java:139)
    at org.apache.avro.io.BinaryData.compare(BinaryData.java:92)
    at org.apache.avro.io.BinaryData.compare(BinaryData.java:72)
    at org.apache.avro.mapred.AvroKeyComparator.compare(AvroKeyComparator.java:43)
    at org.apache.hadoop.mapred.Merger$MergeQueue.lessThan(Merger.java:578)
    at org.apache.hadoop.util.PriorityQueue.downHeap(PriorityQueue.java:144)
    at org.apache.hadoop.util.PriorityQueue.adjustTop(PriorityQueue.java:108)
    at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:524)
    at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:539)
    at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:209)
    at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:731)
    at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.close(MergeManagerImpl.java:370)
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:158)
    ... 7 more
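
The innermost cause in the trace is Avro's binary comparator refusing to order a record that contains a map field. One possible direction, sketched here only as an assumption and not as a confirmed Crunch fix, is to compare records through a map-free key: a string built from the scalar fields plus the map entries in sorted order. The plain-Java sketch below illustrates that idea on in-memory lists. `Record`, `canonicalKey`, and `difference` are hypothetical names introduced for illustration; `Record` is a stand-in for MyClass, and no Crunch or Avro API is used.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class CanonicalKeyDifference {

    // Hypothetical stand-in for MyClass: a few scalar fields plus a map.
    static final class Record {
        final String a;
        final String b;
        final int c;
        final Map<String, Double> d;

        Record(String a, String b, int c, Map<String, Double> d) {
            this.a = a;
            this.b = b;
            this.c = c;
            this.d = d;
        }
    }

    // Builds a deterministic string key. Copying the map into a TreeMap
    // sorts the entries by key, so two records with equal maps always
    // produce the same string regardless of HashMap iteration order.
    // Assumption: field values and map keys contain no tab or '=' characters.
    static String canonicalKey(Record r) {
        StringBuilder sb = new StringBuilder();
        sb.append(r.a).append('\t').append(r.b).append('\t').append(r.c);
        for (Map.Entry<String, Double> e : new TreeMap<>(r.d).entrySet()) {
            sb.append('\t').append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    // Returns the distinct records of 'left' whose key does not occur in 'right'.
    static List<Record> difference(List<Record> left, List<Record> right) {
        Set<String> rightKeys = new HashSet<>();
        for (Record r : right) {
            rightKeys.add(canonicalKey(r));
        }
        Set<String> seen = new HashSet<>();
        List<Record> result = new ArrayList<>();
        for (Record r : left) {
            String key = canonicalKey(r);
            if (!rightKeys.contains(key) && seen.add(key)) {
                result.add(r);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> features = new HashMap<>();
        features.put("x", 1.0);
        features.put("y", 2.0);

        List<Record> left = new ArrayList<>();
        left.add(new Record("l1", "s1", 1, features));
        left.add(new Record("l2", "s2", 0, new HashMap<String, Double>()));

        List<Record> right = new ArrayList<>();
        right.add(new Record("l1", "s1", 1, new HashMap<>(features)));

        for (Record r : difference(left, right)) {
            System.out.println(canonicalKey(r));
        }
    }
}
```

In a Crunch pipeline the analogous step would presumably be to key each record by such a string before taking the difference, so that only strings are compared during the shuffle; that part is untested here.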
