hadoop-hive-dev mailing list archives

From Prasad Chakka <pcha...@facebook.com>
Subject Re: Problem with Thrift Server Concurrency
Date Tue, 28 Jul 2009 17:57:53 GMT
This is a known issue in Hive Server. The same metastore client is being used
to issue both queries, and JDBC does not like that. We should use thread-specific or
session-specific metastore clients, but I don't think Hive Server is doing that right now.
HIVE-584 is supposed to fix this issue.
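
To illustrate what "thread-specific" clients could look like, here is a minimal sketch. It is not the Hive Server code or the HIVE-584 patch: FakeMetastoreClient, client() and distinctClients() are hypothetical names made up for this example, and the sketch assumes a Java version that has ThreadLocal.withInitial (Java 8 or later). Each worker thread lazily gets its own client instead of sharing one.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

class PerThreadClientSketch {

    // Hypothetical stand-in for a metastore client; NOT the real Hive class.
    static final class FakeMetastoreClient {
        private final String owner = Thread.currentThread().getName();

        String owner() {
            return owner;
        }
    }

    // Every thread that calls CLIENT.get() lazily receives its own instance.
    private static final ThreadLocal<FakeMetastoreClient> CLIENT =
            ThreadLocal.withInitial(FakeMetastoreClient::new);

    static FakeMetastoreClient client() {
        return CLIENT.get();
    }

    // Starts `threads` workers at about the same time and returns how many
    // distinct client instances they observed.
    static int distinctClients(int threads) {
        Set<FakeMetastoreClient> seen = ConcurrentHashMap.newKeySet();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // hold all workers until released together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                seen.add(client());
            }, "query-" + i);
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct clients: " + distinctClients(2));
    }
}
```

With two workers this prints "distinct clients: 2", since neither thread sees the other's client. A session-specific variant would key the client on a session identifier instead of the thread.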

________________________________
From: Matt Pestritto <matt@pestritto.com>
Reply-To: <hive-dev@hadoop.apache.org>
Date: Tue, 28 Jul 2009 10:48:24 -0700
To: <hive-dev@hadoop.apache.org>
Subject: Problem with Thrift Server Concurrency

Hi all

Does the Thrift server support concurrency? I'm having a problem that only
happens if I fire off multiple (2+) DML queries at the same time.
Randomly, one of the queries will succeed but the other will fail with the
following error, which I pulled from the hiveserver output:

java.io.IOException: cannot find dir = hdfs://mustique:9000/user/hadoop/mantis-output/mantis-job/20090601 in partToPartitionInfo!
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getTableDescFromPath(HiveInputFormat.java:311)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.validateInput(HiveInputFormat.java:288)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:735)
    at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:388)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:357)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:263)
    at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:108)
    at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.process(ThriftHive.java:302)
    at org.apache.hadoop.hive.service.ThriftHive$Processor.process(ThriftHive.java:290)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)

If I execute the queries via Thrift a few seconds apart from each other, both
succeed.  It only seems to fail if the queries start at about the same
time.

When I run the same two queries using *hive -e "query 1" & hive -e "query 2"*
it also works fine.

Any ideas?

Thanks
-Matt

