kudu-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Spark-kudu: java.lang.IllegalArgumentException: Got out-of-order primary key column
Date Mon, 18 Apr 2016 15:06:45 GMT
Hi Darren,

That particular error means that the schema was created with key columns
specified after non-key columns, which is a current limitation in Kudu. It
seems like Spark is internally creating a schema laid out that way?
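The ordering rule behind that exception can be sketched without the Kudu client: once the column list contains a non-key column, any key column that follows it is rejected. This is a minimal, self-contained illustration of that constraint (not the actual Kudu `Schema` code; the column names are taken from the error above, the helper is hypothetical):

```java
import java.util.Arrays;
import java.util.List;

public class KeyOrderCheck {
    // Mimics the validation that org.kududb.Schema performs on construction:
    // after the first non-key column is seen, any later key column is
    // reported as "out-of-order".
    static void validateKeyOrder(List<String> names, List<Boolean> isKey) {
        boolean seenNonKey = false;
        for (int i = 0; i < names.size(); i++) {
            if (isKey.get(i)) {
                if (seenNonKey) {
                    throw new IllegalArgumentException(
                        "Got out-of-order primary key column: " + names.get(i));
                }
            } else {
                seenNonKey = true;
            }
        }
    }

    public static void main(String[] args) {
        // Key columns first: accepted.
        validateKeyOrder(Arrays.asList("cid", "value"),
                         Arrays.asList(true, false));
        // Key column after a non-key column: throws, like the error above.
        try {
            validateKeyOrder(Arrays.asList("value", "cid"),
                             Arrays.asList(false, true));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

So whatever builds the schema on the Spark side needs to list every primary-key column before any non-key column.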

J-D

On Mon, Apr 18, 2016 at 7:20 AM, Darren Hoo <darren.hoo@gmail.com> wrote:

> What does this exception mean?
>
> I just do an inner join of two Kudu tables, and I get this:
>
> Exception in thread "main" org.apache.spark.SparkException: Job aborted
> due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent
> failure: Lost task 0.3 in stage 2.0 (TID 22, slave12):
> java.lang.IllegalArgumentException: Got out-of-order primary key column:
> Column name: cid, type: int64
>         at org.kududb.Schema.<init>(Schema.java:110)
>         at org.kududb.Schema.<init>(Schema.java:74)
>         at org.kududb.client.AsyncKuduScanner.<init>(AsyncKuduScanner.java:313)
>         at org.kududb.client.KuduScanner$KuduScannerBuilder.build(KuduScanner.java:131)
>         at org.kududb.mapreduce.KuduTableInputFormat$TableRecordReader.initialize(KuduTableInputFormat.java:386)
>         at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:158)
>         at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:129)
>         at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:64)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>         at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>         at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>         at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>         at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>         at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>         at org.apache.spark.scheduler.Task.run(Task.scala:89)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
>
>
> The same SQL runs fine on Impala, and the same code runs fine on Spark 1.5
> (CDH 5.5), but fails with Spark 1.6 (CDH 5.7).
>
> What could I possibly be doing wrong?
