hive-user mailing list archives

From Mark Grover <mgro...@oanda.com>
Subject Amazon EMR Best Practices for Hive metastore
Date Wed, 07 Mar 2012 02:54:20 GMT
Hi all,
I am trying to get an idea of what people do to set up the Hive metastore when using Amazon
EMR.

For those of you using Amazon EMR:

1) Do you have a dedicated RDS instance external to your EMR Hive+Hadoop cluster that you
use as a persistent metastore for all your cluster instantiations?

2) Do you use the MySQL DB that comes pre-installed on the master node, exporting its data
to something like S3 on cluster teardown and importing it from S3 on cluster bring-up?

3) Do you use a local installation of Hive (instead of the one on EMR) so that you can make
use of an in-house dedicated metastore while using the Hadoop cluster on EMR? (i.e., local
Hive + EMR Hadoop)

4) Do you do something really simple and naive like scripting up all your "create external
table" commands and running them every time you bring up a cluster?

Or do you do something else not mentioned above? :-)
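
To make option 4 concrete, here is a rough sketch of what I have in mind: a small Python
script that generates the "create external table" statements to replay on each new cluster.
The table name, columns, and S3 location are made-up placeholders, not a real schema, and
this is only one way such a script might look.

```python
# Sketch of option 4: regenerate external table DDL on every cluster bring-up.
# All table names, columns, and S3 paths below are hypothetical placeholders.

TABLES = [
    {
        "name": "page_views",
        "columns": [("user_id", "STRING"), ("url", "STRING"), ("ts", "BIGINT")],
        "location": "s3://example-bucket/page_views/",
    },
]


def create_external_table_ddl(table):
    """Return one CREATE EXTERNAL TABLE statement for a table description."""
    cols = ", ".join(f"{name} {dtype}" for name, dtype in table["columns"])
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table['name']} ({cols}) "
        f"LOCATION '{table['location']}';"
    )


def build_bootstrap_script(tables):
    """Join the statements for all tables into a single HiveQL script."""
    return "\n".join(create_external_table_ddl(t) for t in tables) + "\n"


if __name__ == "__main__":
    # Redirect the output to a file and run it with `hive -f <file>`.
    print(build_bootstrap_script(TABLES), end="")
```

Because the data itself stays in S3, replaying these statements only recreates the metadata,
which is why this approach works at all without a persistent metastore.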

Thank you in advance for sharing!

Mark

Mark Grover, Business Intelligence Analyst
OANDA Corporation 

www: oanda.com www: fxtrade.com 

"Best Trading Platform" - World Finance's Forex Awards 2009. 
"The One to Watch" - Treasury Today's Adam Smith Awards 2009. 