hive-user mailing list archives

From Kyle Mulka <mu...@umich.edu>
Subject Re: Doubts related to Amazon EMR
Date Tue, 24 Apr 2012 04:22:27 GMT
Just wrote up an article on how to install Sqoop on Amazon EMR:
http://blog.kylemulka.com/2012/04/how-to-install-sqoop-on-amazon-elastic-map-reduce-emr/
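
For reference, the round trip described in the quoted message below (import into HDFS with Sqoop, process, then export back out with Sqoop) can be sketched roughly as follows. This is only an illustration and not the scripts from the article: the JDBC URL, user name, table names, and HDFS paths are hypothetical placeholders, and it only builds the command lines, using the standard Sqoop 1 options --connect, --username, --table, --target-dir, --export-dir, and --num-mappers. Credential handling is deliberately left out.

```python
import shlex


def sqoop_import_cmd(jdbc_url, username, table, target_dir, mappers=4):
    """Build a `sqoop import` command that copies one table into HDFS."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,        # JDBC URL of the source database (placeholder)
        "--username", username,       # password handling is intentionally omitted
        "--table", table,             # source table to read
        "--target-dir", target_dir,   # HDFS directory to write into
        "--num-mappers", str(mappers),
    ]


def sqoop_export_cmd(jdbc_url, username, table, export_dir, mappers=4):
    """Build a `sqoop export` command that pushes HDFS files into a table."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "--table", table,             # destination table (must already exist)
        "--export-dir", export_dir,   # HDFS directory holding the results
        "--num-mappers", str(mappers),
    ]


def as_shell(cmd):
    """Render an argument list as a single shell-quoted line."""
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    # All values below are made-up examples, not real hosts or tables.
    url = "jdbc:mysql://db.example.com/sales"
    print(as_shell(sqoop_import_cmd(url, "etl", "orders", "/user/hadoop/orders")))
    print(as_shell(sqoop_export_cmd(url, "etl", "order_results", "/user/hadoop/results")))
```

The printed lines could be run by hand on the master node or passed to subprocess.run once Sqoop is installed; the processing step in between (Hive or a custom JAR) is not shown.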

--
Kyle Mulka
mulka@umich.edu
206 883 5352
http://www.kylemulka.com


On Mon, Apr 23, 2012 at 10:55 AM, Kyle Mulka <kyle.mulka@gmail.com> wrote:

> It is possible to install Sqoop on AWS EMR. I've got some scripts I can
> publish later. You are not required to use S3 to store files and can use
> the local (temporary) HDFS instead. After you have Sqoop installed, you can
> import your data with it into HDFS, run your calculations on the data in
> HDFS, then export the results back out using Sqoop again.
>
> --
> Kyle Mulka
> http://www.kylemulka.com
>
> On Apr 23, 2012, at 8:42 AM, Bhavesh Shah <bhavesh25shah@gmail.com> wrote:
>
>
> Hello all,
> I want to deploy my task on Amazon EMR, but as I am new to Amazon Web
> Services, I am having trouble understanding the concepts.
>
> My Use Case:
>
> I want to import a large data set from EC2 into Hive using Sqoop. The
> imported data will be processed in Hive by applying an algorithm, which
> will generate a result (as a table, in Hive only). The generated result
> will then be exported back to EC2, again through Sqoop.
>
> I am new to Amazon Web Services and want to implement this use case with
> the help of AWS EMR. I have already implemented it on my local machine.
>
> I have read some links about AWS EMR covering how to launch an instance,
> what EMR is, and how it works.
> I still have some doubts about EMR:
>
> 1) EMR uses S3 buckets, which hold the input and output data for Hadoop
> processing (in the form of objects). ---> I don't understand how to store
> data in the form of objects on S3 (my data will be files).
>
> 2) As I said, I have implemented the task for my use case in Java. If I
> build a JAR of my program and create the Job Flow with a Custom JAR, will
> that work, or do I need to do something extra?
>
> 3) As I said in my use case, I want to export my result back to EC2
> with the help of Sqoop. Does EMR support Sqoop?
>
>
> If you have any ideas related to AWS, please reply with your answer as
> soon as possible. I want to do this as early as possible.
>
> Many thanks.
>
>
>
> --
> Regards,
> Bhavesh Shah
>
>
