hadoop-user mailing list archives

From Preethi Vinayak Ponangi <vinayakpona...@gmail.com>
Subject Re: Application of Cloudera Hadoop for Dataset analysis
Date Tue, 05 Feb 2013 16:07:47 GMT
It depends on which component of the Hadoop ecosystem you would like to
use.

You can do it in several ways:

1) You could write a basic MapReduce job to do the joins.
This link could help, or a basic Google search will turn up several
more:

http://chamibuddhika.wordpress.com/2012/02/26/joins-with-map-reduce/

2) You could use a higher-level language like Pig to do these joins with
simple Pig scripts.
http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html

3) Simplest of all, you could write SQL-like queries to do this join
using Hive.
http://hive.apache.org/
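
To give a feel for what option 1 involves, here is a minimal sketch of a
reduce-side join written in plain Python. It is not Hadoop code and does not
use any Hadoop API; it only imitates the map, shuffle, and reduce steps in
memory. The sample datasets, the "L"/"R" tags, and all function names are
made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: two datasets sharing a key (a customer id).
customers = [(1, "Asha"), (2, "Ravi"), (3, "Meena")]
orders = [(1, "book"), (1, "pen"), (3, "laptop")]


def map_phase(records, tag):
    """Emit (key, (tag, value)) so the reducer can tell the sources apart."""
    for key, value in records:
        yield key, (tag, value)


def shuffle(pairs):
    """Group tagged values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, tagged in pairs:
        groups[key].append(tagged)
    return groups


def reduce_phase(key, tagged_values):
    """Inner join: pair every left-side value with every right-side value."""
    left = [value for tag, value in tagged_values if tag == "L"]
    right = [value for tag, value in tagged_values if tag == "R"]
    for left_value in left:
        for right_value in right:
            yield key, left_value, right_value


def reduce_side_join(left_records, right_records):
    """Run the three phases in memory and return the joined rows."""
    pairs = list(map_phase(left_records, "L"))
    pairs += list(map_phase(right_records, "R"))
    groups = shuffle(pairs)
    joined = []
    for key in sorted(groups):
        joined.extend(reduce_phase(key, groups[key]))
    return joined


if __name__ == "__main__":
    for row in reduce_side_join(customers, orders):
        print(row)
```

Keys that appear in only one dataset (customer 2 here) produce no output,
which is inner-join behaviour. With Pig or Hive the same join is a single
statement, which is why those two options are usually less work.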

Hope this helps.

Regards,
Vinayak.


On Tue, Feb 5, 2013 at 10:00 AM, Suresh Srinivas <suresh@hortonworks.com> wrote:

> Please take this thread to CDH mailing list.
>
>
> On Tue, Feb 5, 2013 at 2:43 AM, Sharath Chandra Guntuku <
> sharathchandra92@gmail.com> wrote:
>
>> Hi,
>>
>> I am Sharath Chandra, an undergraduate student at BITS-Pilani, India. I
>> would like to get the following clarifications regarding the Cloudera Hadoop
>> distribution. I am using a CDH4 Demo VM for now.
>>
>> 1. After I upload the files into the file browser, if I have to link
>> two-three datasets using a key in those files, what should I do? Do I have
>> to run a query over them?
>>
>> 2. My objective is this: I have some data collected over a few years, and
>> now I would like to link all of it using keys, as in a database, and then
>> run queries over it to find particular patterns. Later I would like
>> to implement some machine learning algorithms on it for predictive
>> analysis. Will this be possible on the demo VM?
>>
>> I am totally new to this. Can I get some help on this? I would be very
>> grateful for the same.
>>
>>
>> ------------------------------------------------------------------------------
>> Thanks and Regards,
>> *Sharath Chandra Guntuku*
>> Undergraduate Student (Final Year)
>> *Computer Science Department*
>> *Email*: f2009149@hyderabad.bits-pilani.ac.in
>>
>> *BITS-Pilani*, Hyderabad Campus
>> Jawahar Nagar, Shameerpet, RR Dist,
>> Hyderabad - 500078, Andhra Pradesh
>>
>
>
>
> --
> http://hortonworks.com/download/
>
