hadoop-hdfs-user mailing list archives

From Sharath Chandra Guntuku <sharathchandr...@gmail.com>
Subject Application of Cloudera Hadoop for Dataset analysis
Date Tue, 05 Feb 2013 10:43:30 GMT

I am Sharath Chandra, an undergraduate student at BITS-Pilani, India. I
would like some clarifications regarding the Cloudera Hadoop
distribution. I am currently using the CDH4 Demo VM.

1. After I upload the files through the file browser, how do I link
two or three datasets using a key that those files share? Do I have
to run a query over them?

2. I have some data collected over a few years. I would like to link
all of it using keys, as in a database, and then run queries over it
to find particular patterns. Later I would like to apply some machine
learning algorithms to it for predictive analysis. Will this be
possible on the demo VM?
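
To make the question concrete: the key-based linking described in points 1
and 2 amounts to a join. Below is a minimal sketch in plain Python, outside
Hadoop. The column names (station_id, temp_2011, temp_2012), the sample rows,
and the helper names read_rows and inner_join are all assumptions made up for
illustration. On the CDH4 VM the same operation would more likely be written
as a join query in a tool such as Hive.

```python
import csv
import io


def read_rows(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def inner_join(left, right, key):
    """Link two lists of row dicts on a shared key column (inner join)."""
    # Index the right-hand rows by key so each lookup is a dict access.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for row in left:
        # Rows whose key has no match on the right are dropped.
        for match in index.get(row[key], []):
            merged = dict(match)
            merged.update(row)
            joined.append(merged)
    return joined


# Hypothetical sample data standing in for two yearly files.
readings_2011 = read_rows("station_id,temp_2011\nS1,30\nS2,28\n")
readings_2012 = read_rows("station_id,temp_2012\nS1,31\nS3,27\n")

linked = inner_join(readings_2011, readings_2012, "station_id")
print(linked)
```

Only the row whose key appears in both inputs survives the join, which is
the behaviour one would expect from linking tables on a key in a database.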

I am totally new to this and would be very grateful for any help.

Thanks and Regards,
*Sharath Chandra Guntuku*
Undergraduate Student (Final Year)
*Computer Science Department*
*Email*: f2009149@hyderabad.bits-pilani.ac.in

*BITS-Pilani*, Hyderabad Campus
Jawahar Nagar, Shameerpet, RR Dist,
Hyderabad - 500078, Andhra Pradesh
