hive-user mailing list archives

From jiang licht <>
Subject Re: Hive in EC2
Date Wed, 31 Aug 2011 04:34:21 GMT
That's true. If that's not acceptable, one can replace Hive on the EMR cluster with the latest
version and then reuse it, or build an image from the latest Hadoop and Hive ...


From: Igor Tatarinov <>
To:; jiang licht <>
Sent: Tuesday, August 30, 2011 8:26 PM
Subject: Re: Hive in EC2

The only caveat is that you are at Amazon's mercy in terms of the latest version of Hive.
Also, they have their own versioning so EMR Hive's latest version 0.7.1 could be Apache Hive's
0.6.5 - I am not even sure. Basically, don't expect the latest Hive features to be available.


On Tue, Aug 30, 2011 at 6:25 PM, jiang licht <> wrote:

I recommend Amazon Elastic MapReduce (EMR). Otherwise, it costs you time to prepare and set up
the Hadoop and Hive packages for running on EC2. EMR does the heavy lifting for you and still
gives you the option to customize Hadoop and Hive by pointing to their XML property files (e.g.
in S3). EMR also allows your Hive job to run in batch mode (through the EMR command-line client
tools or the Amazon console) or in interactive mode for testing and debugging. Another benefit
of using EMR Hive is that it has enhanced features not otherwise available, such as passing
parameters from the command line and loading partitions automatically from S3 instead of
loading them individually. See the EMR FAQ, in particular the answer to "Are there new features
in Hive specific to Amazon Elastic MapReduce?"
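
For illustration, the difference between loading partitions individually and loading them
automatically can be sketched roughly as below. This is a minimal Python sketch, not something
from the thread: the table name, the S3 path, the `dt` partition column, and both helper
functions are hypothetical. The script only builds HiveQL text and does not talk to a cluster.
`ADD PARTITION` is standard Hive DDL, while `RECOVER PARTITIONS` is the EMR-specific statement.

```python
def add_partition_statements(table, s3_prefix, dates):
    """Build one ALTER TABLE ... ADD PARTITION statement per date.

    Assumes a hypothetical layout of <s3_prefix>/dt=<date>/ in S3.
    """
    return [
        f"ALTER TABLE {table} ADD PARTITION (dt='{d}') "
        f"LOCATION '{s3_prefix}/dt={d}/';"
        for d in dates
    ]


def recover_partitions_statement(table):
    """Build the single EMR Hive statement that discovers partitions
    from the table's S3 location instead of adding them one by one."""
    return f"ALTER TABLE {table} RECOVER PARTITIONS;"


if __name__ == "__main__":
    # Hypothetical table and bucket, for illustration only.
    for stmt in add_partition_statements(
        "logs", "s3://example-bucket/logs", ["2011-08-29", "2011-08-30"]
    ):
        print(stmt)
    print(recover_partitions_statement("logs"))
```

With many partitions, the first approach grows by one statement per partition, while the
second stays a single statement regardless of how many directories exist under the table.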
>From: "Aggarwal, Vaibhav" <>
>To: "" <>; "" <>
>Sent: Tuesday, August 30, 2011 11:51 AM
>Subject: RE: Hive in EC2
>You could also choose to look at Amazon ElasticMapReduce.
>It allows you to provision an EC2 cluster of your choice preinstalled with Hive and Hadoop.
>-----Original Message-----
>From: MIS [] 
>Sent: Monday, August 29, 2011 11:03 PM
>To:; hive
>Subject: Hive in EC2
>Can somebody point me to a production-level setup of Hive in EC2? The intent is to learn
>the setup best practices being employed.