incubator-oozie-users mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject a hive thrift alternative
Date Mon, 30 Apr 2012 17:03:32 GMT
HaHa. I never rejoined the list after it moved from Yahoo.

I would not describe hive-thrift as horrible but there is some unpleasantness.

Near future:
https://issues.apache.org/jira/browse/HIVE-2935

In any case I am willing to accept the issues. I run multiple
hive-thrift servers behind HAProxy:

http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/running_a_hive_thrift_cluster

This cuts down on concurrency-type problems. It's Hive, so I'm not sure
how much concurrency is needed there.

Our group just decided to part ways with programming over the CLI. Too
much stuff like this:

hive -S -e "select x,y from $TABLE WHERE $STUFF" | awk whatever
or:
mylist=`hadoop dfs -ls /bla`

That was not unit testable and just really ugly. Even if it fails
1 time in 1000, we have try/catch, and we have built things that can
bring up the entire stack end to end in an IDE now.
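
To make the try/catch and unit-testing point concrete, here is a minimal sketch in Python. It assumes a hypothetical client object with an `execute(query)` method; that is a stand-in for whatever Thrift client wrapper you use, not the real Hive Thrift API. `run_with_retry`, `QueryFailed`, and `FlakyFakeClient` are illustrative names only.

```python
class QueryFailed(Exception):
    """Raised when a query still fails after all retry attempts."""


def run_with_retry(client, query, attempts=3):
    """Run `query` through `client.execute`, retrying on ConnectionError.

    `client` is a hypothetical object with an execute(query) method;
    it is an assumption for this sketch, not a real Hive class.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return client.execute(query)
        except ConnectionError as err:
            last_error = err  # treat as transient and try again
    raise QueryFailed(f"gave up after {attempts} attempts") from last_error


class FlakyFakeClient:
    """Test double: fails the first `failures` calls, then returns `rows`."""

    def __init__(self, failures, rows):
        self.failures = failures
        self.rows = rows
        self.calls = 0

    def execute(self, query):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("simulated transient failure")
        return self.rows
```

Because the client is passed in, a unit test can hand `run_with_retry` a `FlakyFakeClient` and check the retry behavior without a running server, which is not practical when the query goes through a shell pipeline.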

Layering on top of the CLI is a bad idea in the long run; it's like
expect-scripting an ssh session. Not that it was a bad design choice
for Oozie at the time, but it is certainly not the ideal way to handle
it.
