hadoop-hive-dev mailing list archives

From Cliff Resnick <cresn...@proclivitysystems.com>
Subject quick fix for thread-safe connection pool
Date Fri, 28 Aug 2009 15:52:43 GMT
We're stepping up our Hive integration, and after running into the 
HiveServer thread-safety issue, I implemented a simple co-located Hive 
connection pool. Since we use our own Java-only network communication 
code, the service does not use Thrift; instead it's just a lightweight 
network service that manages a pool of HiveConnections.

Of course, after implementing this I found that it was still not 
thread-safe. I'm not very familiar with the Hive code, but the quick 
fix turned out to be surprisingly easy. In 
org.apache.hadoop.hive.ql.exec.Utilities there is a static field 
that holds a mapredWork instance. I changed it to a ThreadLocal, and 
suddenly I have a thread-safe connection pool.
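
For readers who want to see the shape of the change, here is a minimal sketch of swapping a shared static field for a ThreadLocal. It is not the attached patch and not the actual Hive code: PlanHolder, setPlan, getPlan, clearPlan and the String plan type are hypothetical stand-ins for the Utilities field described above.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for a utility class that holds a per-query plan.
public class PlanHolder {
    // Before (not thread-safe): one slot shared by every thread.
    // private static String plan;

    // After: each thread gets its own independent slot.
    private static final ThreadLocal<String> plan = new ThreadLocal<>();

    public static void setPlan(String p) { plan.set(p); }
    public static String getPlan() { return plan.get(); }
    // Pooled threads are reused, so clear the slot when a query finishes.
    public static void clearPlan() { plan.remove(); }

    // Two threads each set a plan, wait until both have done so,
    // then read back. Returns what each thread saw.
    public static String[] demo() {
        final String[] seen = new String[2];
        final CountDownLatch bothSet = new CountDownLatch(2);
        Thread[] workers = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                setPlan("plan-" + id);
                bothSet.countDown();
                try {
                    bothSet.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                seen[id] = getPlan();
                clearPlan();
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = demo();
        System.out.println(seen[0] + " " + seen[1]); // prints "plan-0 plan-1"
    }
}
```

With the commented-out plain static field, both threads would read whichever plan was written last. With the ThreadLocal, each thread reads back its own.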

Now, I do understand that this is just a quick fix. The call stack from 
HiveStatement is synchronous, but the ThreadLocal solution would need to 
be revisited if any async processing is introduced. Also, I can't help 
wondering: why is a Utilities class holding state, let alone the state 
of the entire execution plan? Finally, I doubt such a simple solution 
solves HIVE-80, so I imagine other threading issues are in play when 
Thrift is involved. In the meantime, however, I'm hoping the fix can be 
made; a patch is attached.
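
The async caveat above can be illustrated with a small sketch, again using hypothetical names (AsyncCaveat, readFromWorker) rather than anything from Hive. A value stored in a ThreadLocal by the calling thread is not visible to a task that runs on a different thread, which is why the fix only holds while the call stack stays synchronous.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical example: a ThreadLocal value does not follow work
// that is handed off to another thread.
public class AsyncCaveat {
    private static final ThreadLocal<String> plan = new ThreadLocal<>();

    // Sets the plan on the calling thread, then reads it from a worker thread.
    public static String readFromWorker() {
        plan.set("caller-plan");
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = () -> plan.get();
            // The worker thread has its own empty slot, so this yields null.
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
            plan.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println(readFromWorker()); // prints "null"
    }
}
```

If execution ever moved onto worker threads like this, the plan would have to be passed to the task explicitly instead of being read from thread-local state.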

Cliff
