hive-user mailing list archives

From Sofia <sofia.panagiot...@taiger.com>
Subject Re: troubleshooting: "unread block data" error
Date Wed, 23 Dec 2015 16:52:41 GMT
Hi Xuefu Zhang,

I just tried Hive on Spark again after a long time.
Queries that do not touch HBase work fine in cluster mode. The problem occurs only when I run a query that involves HBase (through Hive, obviously).
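To be concrete, here is the kind of session that reproduces it for me (the table names are just placeholders):

    set hive.execution.engine=spark;
    set spark.master=spark://spark-master:7077;

    -- a query on a plain Hive table works fine
    SELECT count(*) FROM plain_hive_table;

    -- the same kind of query on an HBase-backed table fails with "unread block data"
    SELECT count(*) FROM hbase_backed_table;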



> On 19 Nov 2015, at 20:54, Xuefu Zhang <xzhang@cloudera.com> wrote:
> 
> Are you able to run queries that do not touch HBase? This problem was seen before but has since been fixed.
> 
> On Tue, Nov 17, 2015 at 3:37 AM, Sofia <sofia.panagiotidi@taiger.com> wrote:
> Hello,
> 
> I have configured Hive to work with Spark.
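> Concretely, this means hive.execution.engine is set to spark in hive-site.xml, roughly as below; I override spark.master per session at the CLI while testing (the values here are from my setup):
> 
>     <property>
>       <name>hive.execution.engine</name>
>       <value>spark</value>
>     </property>
>     <property>
>       <name>spark.master</name>
>       <value>spark://spark-master:7077</value>
>     </property>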
> 
> I have been trying to run a query, from the Hive CLI, on a Hive table that manages an HBase table (created via the HBaseStorageHandler).
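> For reference, the table was created with a statement along these lines (the table name and column mapping are placeholders):
> 
>     CREATE TABLE hbase_backed_table (key string, value string)
>     STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>     WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:value")
>     TBLPROPERTIES ("hbase.table.name" = "hbase_backed_table");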
> 
> When spark.master is "local" it works just fine, but when I set it to my Spark master (spark://spark-master:7077) I get the following error:
> 
> 
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/17 10:49:30 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[1] at mapPartitionsToPair at MapTran.java:31)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/17 10:49:30 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/17 10:49:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 192.168.1.64, ANY, 1688 bytes)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/17 10:49:30 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 192.168.1.64): java.lang.IllegalStateException: unread block data
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 	at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2428)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 15/11/17 10:49:30 [stderr-redir-1]: INFO client.SparkClientImpl: 	at java.lang.Thread.run(Thread.java:745)
> 
> 
> I read something about a missing guava jar, but I am not sure how to fix it.
> I am using Spark 1.4.1, HBase 1.1.2 and Hive 1.2.1.
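> If it is indeed a missing-jar/classpath problem, I imagine the fix is something along these lines, though I have not managed to verify it (the jar paths below are just where they happen to live on my machines, and must exist on the worker nodes too):
> 
>     -- at the Hive CLI, before running the query: make the HBase client jars
>     -- (and a single guava) visible on the executors' classpath
>     set spark.executor.extraClassPath=/opt/hbase/lib/hbase-client-1.1.2.jar:/opt/hbase/lib/hbase-common-1.1.2.jar:/opt/hbase/lib/hbase-protocol-1.1.2.jar:/opt/hbase/lib/guava-12.0.1.jar;
> 
> or perhaps starting the CLI with hive --auxpath pointing at those jars, so they are added to the session classpath.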
> Any help is more than appreciated.
> 
> Sofia
> 

