hadoop-hdfs-user mailing list archives

From Preethi Vinayak Ponangi <vinayakpona...@gmail.com>
Subject Re: Application of Cloudera Hadoop for Dataset analysis
Date Tue, 05 Feb 2013 16:07:47 GMT
It depends on which component of the Hadoop ecosystem you would like
to use.

You can do it in several ways:

1) You could write a basic MapReduce job to do the joins.
This link could help, or a basic search on Google would give you
several others.


2) You could use a higher-level language like Pig to do these joins with
simple Pig scripts.

3) Simplest of all, you could write SQL-like queries to do this join
using Hive.
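
To give a rough idea of what option 1 involves, here is a minimal sketch of a
reduce-side join, simulated in plain Python rather than run on Hadoop. The
datasets (customers, orders), the "id" key and all function names are made up
for illustration; a real job would use the Hadoop MapReduce API and read its
input from HDFS.

```python
from collections import defaultdict
from itertools import product

# Hypothetical sample data: two datasets that share the key field "id".
customers = [
    {"id": 1, "name": "A"},
    {"id": 2, "name": "B"},
]
orders = [
    {"id": 1, "year": 2011},
    {"id": 1, "year": 2012},
    {"id": 3, "year": 2012},
]


def map_phase(records, tag, key_field):
    """Emit (join key, (source tag, record)) pairs, as a mapper would."""
    for record in records:
        yield record[key_field], (tag, record)


def shuffle(pairs):
    """Group mapper output by key, standing in for Hadoop's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, left_tag, right_tag):
    """For each key, pair every left record with every right record."""
    for key in sorted(groups):
        left = [rec for tag, rec in groups[key] if tag == left_tag]
        right = [rec for tag, rec in groups[key] if tag == right_tag]
        # Keys missing from either side produce no output (inner join).
        for l_rec, r_rec in product(left, right):
            yield {**l_rec, **r_rec}


def reduce_side_join(left, right, key_field):
    """Run the three simulated phases and return the joined records."""
    pairs = list(map_phase(left, "L", key_field))
    pairs += list(map_phase(right, "R", key_field))
    return list(reduce_phase(shuffle(pairs), "L", "R"))


joined = reduce_side_join(customers, orders, "id")
for row in joined:
    print(row)
```

Only id 1 appears in both datasets here, so the join keeps the two order rows
for that customer and drops ids 2 and 3. Pig and Hive generate this kind of
job for you from a single JOIN statement, which is why they are less work.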

Hope this helps.


On Tue, Feb 5, 2013 at 10:00 AM, Suresh Srinivas <suresh@hortonworks.com> wrote:

> Please take this thread to CDH mailing list.
> On Tue, Feb 5, 2013 at 2:43 AM, Sharath Chandra Guntuku <
> sharathchandra92@gmail.com> wrote:
>> Hi,
>> I am Sharath Chandra, an undergraduate student at BITS-Pilani, India. I
>> would like to get the following clarifications regarding cloudera hadoop
>> distribution. I am using a CDH4 Demo VM for now.
>> 1. After I upload the files into the file browser, if I have to link
>> two-three datasets using a key in those files, what should I do? Do I have
>> to run a query over them?
>> 2. My objective is that I have some data collected over a few years and
>> now, I would like to link all of them, as in a database using keys and then
>> run queries over them to find out particular patterns. Later I would like
>> to implement some Machine learning algorithms on them for predictive
>> analysis. Will this be possible on the demo VM?
>> I am totally new to this. Can I get some help on this? I would be very
>> grateful for the same.
>> ------------------------------------------------------------------------------
>> Thanks and Regards,
>> *Sharath Chandra Guntuku*
>> Undergraduate Student (Final Year)
>> *Computer Science Department*
>> *Email*: f2009149@hyderabad.bits-pilani.ac.in
>> *BITS-Pilani*, Hyderabad Campus
>> Jawahar Nagar, Shameerpet, RR Dist,
>> Hyderabad - 500078, Andhra Pradesh
> --
> http://hortonworks.com/download/
