hadoop-user mailing list archives

From Varad Meru <meru.va...@gmail.com>
Subject Re: Learning hadoop
Date Thu, 23 Aug 2012 17:03:29 GMT
Hi Pravin,

Studying Hadoop or MapReduce can look like a daunting task if you get your hands dirty right at the start.

One prerequisite for learning Hadoop is good experience in Java. Good analytical
skills help a lot as well, and the final secret sauce for being successful is that you need to be
motivated to teach yourself a lot of things in the big data arena.

I followed this schedule:

Start with the very basics of MR with 
	http://code.google.com/edu/parallel/dsd-tutorial.html 
	http://code.google.com/edu/parallel/mapreduce-tutorial.html 
Then go for the first two lectures in 
	http://www.cs.washington.edu/education/courses/cse490h/08au/lectures.htm
This is a very good introductory course on MapReduce and Hadoop.
Read the seminal paper 
	http://labs.google.com/papers/mapreduce.html 
and its improvements in the updated version
	http://www.cs.washington.edu/education/courses/cse490h/08au/readings/communications200801-dl.pdf

Then go for all the other videos in the U. Washington link given above (for more detail on
distributed systems).

Try searching YouTube for the terms MapReduce and Hadoop to find videos by O'Reilly and Google RoundTable
for a good overview of the future of Hadoop and MapReduce.

Then on to the most important videos:
Cloudera Videos
http://www.cloudera.com/resources/?media=Video
and 
Google MiniLecture Series
http://code.google.com/edu/submissions/mapreduce-minilecture/listing.html

Along with all the multimedia above, we need good written material.

Documents:
Architecture diagrams at http://hadooper.blogspot.com are good to have on your wall

Hadoop: The Definitive Guide goes more into the nuts and bolts of the whole system, whereas
Hadoop in Action is a good read with lots of teaching examples for learning the concepts of Hadoop.

Pro Hadoop is good for more advanced stuff such as chaining and Spring-Hadoop.

PDFs of the documentation from the Apache Foundation 
	http://hadoop.apache.org/common/docs/current/ and 
	http://hadoop.apache.org/common/docs/stable/

will help you learn how to model your problem as an MR solution so that you gain the full
advantages of Hadoop.
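
To give a rough idea of what modelling a problem as MR means, here is a minimal word-count sketch. It is plain Python, not the Hadoop API: the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration, and everything runs in one process, whereas Hadoop would spread the map and reduce calls across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key (the framework's job in a real cluster)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: combine all values seen for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["hadoop map reduce", "map reduce map"]))
```

The point to notice is that you only write the mapper and the reducer; the grouping step in between is what the framework does for you.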

The HDFS paper by Yahoo! Research is also a good read for gaining in-depth knowledge of
Hadoop (ACM: http://dl.acm.org/citation.cfm?id=1914427, download at http://storageconference.org/2010/Papers/MSST/Shvachko.pdf).


Try http://developer.yahoo.com/hadoop/tutorial/module1.html for a beginner-to-expert
path to Hadoop (warning: it uses Hadoop 0.19).

Important for setting up Hadoop: here is a more recent tutorial:
http://orzota.com/blog/single-node-hadoop-setup-2/

And here is one on configuring Eclipse for Hadoop development:
http://orzota.com/blog/eclipse-setup-for-hadoop-development/

For any queries, contact Apache, Google, Bing, Yahoo!

Thanks,
Varad

On 23-Aug-2012, at 10:19 PM, emmanuel.csantana@gmail.com wrote:

> Then perhaps try downloading Cloudera tarballs and running some jobs in pseudo-distributed
> mode on your local Linux machine.
> Using Amazon EC2 machines to configure a small cluster will also be a nice experiment.
> 
> Emmanuel
> 
> 2012/8/23 Mohit Anchlia <mohitanchlia@gmail.com>
> Start by reading the MapReduce paper and then look at a Hadoop book.
> 
> On Thu, Aug 23, 2012 at 9:19 AM, Pravin Sinha <pks_chennai@yahoo.com> wrote:
> Hi,
> 
> I am new to Hadoop. What would be the best way to learn Hadoop and the ecosystem around
> it?
> 
> Thanks,
> Pravin
> 
> 
> 
> 
> 
> -- 
> Emmanuel de Castro Santana



