hive-user mailing list archives

From yuemeng1 <yueme...@huawei.com>
Subject Re: Job aborted due to stage failure
Date Tue, 02 Dec 2014 12:13:48 GMT
Hi, Xuefu,
I checked out a Spark branch from the Spark GitHub repository (tag: v1.2.0-snapshot0) and compared that branch's pom.xml with spark-parent-1.2.0-SNAPSHOT.pom (fetched from
http://ec2-50-18-79-139.us-west-1.compute.amazonaws.com/data/spark_2.10-1.2-SNAPSHOT/org/apache/spark/spark-parent/1.2.0-SNAPSHOT/).
The only difference is the following:
In spark-parent-1.2.0-SNAPSHOT.pom:
    <artifactId>spark-parent</artifactId>
    <version>1.2.0-SNAPSHOT</version>
and in v1.2.0-snapshot0:
    <artifactId>spark-parent</artifactId>
    <version>1.2.0</version>
I don't think that is an essential difference, so I built v1.2.0-snapshot0 and deployed it to my Spark cluster.
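For reference, the build and deploy steps were roughly as follows (a sketch of what I ran; the Maven profiles and paths are my assumptions, with the Hadoop version taken from the assembly jar name below):

    git clone https://github.com/apache/spark.git
    cd spark
    git checkout v1.2.0-snapshot0
    # Build WITHOUT the Hive profile: Hive on Spark expects a Spark
    # assembly that does not bundle its own Hive classes.
    mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
    # Copy the resulting assembly jar into Hive's lib directory.
    cp assembly/target/scala-2.10/spark-assembly-*.jar $HIVE_HOME/lib/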
However, when I run a query that joins two tables, it still fails with the same error I showed you earlier:

Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure:
Lost task 0.3 in stage 1.0 (TID 7, datasight18): java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:255)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:437)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:430)
	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587)
	at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:233)
	at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:210)
	at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:99)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
	at org.apache.spark.scheduler.Task.run(Task.scala:56)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	at java.lang.Thread.run(Thread.java:722)

Driver stacktrace:

I don't think there is any problem with my Spark cluster itself, so why do I keep getting this error?
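One more thing I plan to check, since Xuefu points out below that the wrong HiveInputFormat class is being loaded: if the Spark assembly was built with the -Phive profile, it bundles its own (older) copy of the Hive classes, which can shadow the ones shipped with Hive. A quick way to test whether the assembly carries Hive classes (jar name assumed from my build above):

    # If this prints any entries, the assembly bundles Hive classes
    # and should be rebuilt without -Phive.
    unzip -l spark-assembly-1.2.0-SNAPSHOT-hadoop2.4.0.jar | grep 'org/apache/hadoop/hive'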

On 2014/12/2 13:39, Xuefu Zhang wrote:
> You need to build your Spark assembly from the Spark 1.2 branch. This
> should give you both a Spark build and the spark-assembly jar, which
> you need to copy to Hive's lib directory. A snapshot is fine;
> Spark 1.2 hasn't been released yet.
>
> --Xuefu
>
> On Mon, Dec 1, 2014 at 7:41 PM, yuemeng1 <yuemeng1@huawei.com 
> <mailto:yuemeng1@huawei.com>> wrote:
>
>
>
>     Hi, Xuefu,
>     Thanks a lot for the information, but as far as I know, the
>     latest Spark version on GitHub is spark-snapshot-1.3; there is
>     no spark-1.2 tag, only a branch-1.2 with spark-snapshot-1.2.
>     Can you tell me which Spark version I should build? For now, my
>     spark-assembly-1.2.0-SNAPSHOT-hadoop2.4.0.jar produces the error
>     shown below.
>
>
>     On 2014/12/2 11:03, Xuefu Zhang wrote:
>>     It seems that the wrong class, HiveInputFormat, is being
>>     loaded; the stacktrace is way off the current Hive code. You
>>     need to build Spark 1.2 and copy the spark-assembly jar to
>>     Hive's lib directory, and that's it.
>>
>>     --Xuefu
>>
>>     On Mon, Dec 1, 2014 at 6:22 PM, yuemeng1 <yuemeng1@huawei.com
>>     <mailto:yuemeng1@huawei.com>> wrote:
>>
>>         Hi, I built a Hive on Spark package, and my Spark assembly
>>         jar is spark-assembly-1.2.0-SNAPSHOT-hadoop2.4.0.jar. When
>>         I run a query in the Hive shell, I first set all the
>>         properties Hive needs for Spark, and then execute a join
>>         query:
>>         select distinct st.sno, sname from student st join score sc
>>         on (st.sno = sc.sno) where sc.cno IN (11,12,13) and st.sage > 28;
>>         But it fails with the following error in the Spark web UI:
>>
>>         Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure:
>>         Lost task 0.3 in stage 1.0 (TID 7, datasight18): java.lang.NullPointerException
>>         	at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:255)
>>         	at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:437)
>>         	at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:430)
>>         	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587)
>>         	at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:233)
>>         	at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:210)
>>         	at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:99)
>>         	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
>>         	at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
>>         	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>>         	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
>>         	at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
>>         	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>>         	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>>         	at org.apache.spark.scheduler.Task.run(Task.scala:56)
>>         	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>>         	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>         	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>         	at java.lang.Thread.run(Thread.java:722)
>>
>>         Driver stacktrace:
>>
>>         Can you help me deal with this problem? I believe my build
>>         succeeded.
>>
>>
>
>