hadoop-hive-dev mailing list archives

From Ashish Thusoo <athu...@facebook.com>
Subject RE: quick fix for thread-safe connection pool
Date Fri, 28 Aug 2009 22:07:09 GMT
Hi Cliff,

Can you post the patch to the JIRA?

Ashish 

-----Original Message-----
From: Cliff Resnick [mailto:cresnick@proclivitysystems.com] 
Sent: Friday, August 28, 2009 8:53 AM
To: hive-dev@hadoop.apache.org
Subject: quick fix for thread-safe connection pool

We're stepping up our Hive integration, and after confronting the HiveServer thread-safety
issue, I implemented a simple co-located Hive connection pool. Since we use our own Java-only
network communication code, the service does not use Thrift; instead it's just a lightweight
network service that manages a pool of HiveConnections.

Of course, after implementing this I found that it was still not thread-safe. I'm not very
familiar with the Hive code, but I did find a quick fix that was surprisingly easy. In
org.apache.hadoop.hive.ql.exec.Utilities there is a static field holding a mapredWork instance.
I changed it to a ThreadLocal and suddenly I have a thread-safe connection pool.
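
For readers who want to see the shape of the change, here is a minimal sketch of swapping a shared static field for a ThreadLocal. It is not the attached patch and not Hive's actual code: PlanSketch, UtilitiesSketch, setPlan, getPlan and clearPlan are hypothetical stand-ins for mapredWork and the Utilities class, chosen only for illustration.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for the plan object; not Hive's real mapredWork class.
final class PlanSketch {
    final String query;
    PlanSketch(String query) { this.query = query; }
}

// Hypothetical stand-in for a utility class that holds the current plan.
final class UtilitiesSketch {
    // Before: private static PlanSketch plan;  (one slot shared by every thread)
    // After: each thread reads and writes its own slot.
    private static final ThreadLocal<PlanSketch> PLAN = new ThreadLocal<>();

    static void setPlan(PlanSketch p) { PLAN.set(p); }
    static PlanSketch getPlan() { return PLAN.get(); }
    // Pooled threads are reused, so clear the slot when a query finishes.
    static void clearPlan() { PLAN.remove(); }
}

public class ThreadLocalPlanDemo {
    // Two threads each set a plan, wait until both have set one, then read it back.
    static String[] runTwoQueries() {
        String[] seen = new String[2];
        CountDownLatch bothSet = new CountDownLatch(2);
        Thread[] workers = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                UtilitiesSketch.setPlan(new PlanSketch("query-" + id));
                bothSet.countDown();
                try {
                    bothSet.await();
                    seen[id] = UtilitiesSketch.getPlan().query;
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    UtilitiesSketch.clearPlan();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = runTwoQueries();
        System.out.println(seen[0] + " " + seen[1]);
    }
}
```

With a plain static field, the second setPlan call could overwrite the first before either thread reads it back; with the ThreadLocal, each thread reads the plan it set itself.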

Now, I do understand that this is just a quick fix. The call stack from HiveStatement is synchronous,
but the ThreadLocal solution would need to be revisited if any async processing is introduced.
Also, I can't help wondering: why is a Utilities class holding state, let alone the state
of the entire execution plan? Finally, I doubt such a simple solution solves HIVE-80, so I
imagine other threading issues are in play when Thrift is involved. In the meantime, however,
I'm hoping the fix can be made. Patch attached.
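
The async caveat mentioned above can be illustrated with a small sketch: a value stored in a ThreadLocal on the calling thread is not visible to a task that runs on another thread. AsyncHandoffDemo, PLAN and readPlanFromWorker are hypothetical names for illustration only and do not come from Hive.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncHandoffDemo {
    // Hypothetical per-thread plan slot, standing in for the ThreadLocal fix.
    private static final ThreadLocal<String> PLAN = new ThreadLocal<>();

    // Sets the plan on the calling thread, then reads it from a worker thread.
    static String readPlanFromWorker() {
        PLAN.set("plan-for-caller");
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // The worker thread has its own, still empty, slot.
            Callable<String> task = () -> PLAN.get();
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
            PLAN.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println(readPlanFromWorker()); // prints "null"
    }
}
```

If execution were ever handed off to another thread like this, the plan would have to be passed explicitly to the task rather than looked up from thread-local state.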

Cliff
