hadoop-general mailing list archives

From Piyush Garg <piyushgar...@gmail.com>
Subject Re: Hadoop basics
Date Sun, 15 Aug 2010 04:42:16 GMT
Hi Rita,

I have just started to learn Hadoop as well, and I know there is a long
way to go.
I found some useful links, which I am sharing with you.

Hadoop Tutorial - YDN
<http://developer.yahoo.com/hadoop/tutorial/index.html> is an excellent,
well-organized tutorial for beginners.
Running Hadoop On Ubuntu Linux (Single-Node Cluster) - Michael G. Noll
The tutorial on the Hadoop wiki
<http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html> is
too much for a beginner.

I do not think you can easily debug a running cluster with a remote
debugger. That is natural: a Hadoop job is not one sequential program
but many tasks running in separate processes, so stepping through it is
difficult.
The most practical way I know of to debug is to use traces (log
statements).
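
As a rough illustration of what I mean by traces, here is a small sketch
in plain Java. It does not use any Hadoop classes: TraceDemo, mapLine and
countWords are hypothetical names that only stand in for a map step and
a reduce step, and the logger name is my own choice. In a real job the
same kind of log calls would sit inside your map and reduce code, and as
far as I understand their output ends up in the logs of each task.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.logging.Logger;

// Hypothetical stand-in for a map/reduce pair; no Hadoop classes are used.
class TraceDemo {
    private static final Logger LOG = Logger.getLogger("TraceDemo");

    // "Map" step: split one input line into words, tracing what was seen.
    static List<String> mapLine(long offset, String line) {
        LOG.info("map: offset=" + offset + " line=\"" + line + "\"");
        List<String> words = new ArrayList<>();
        for (String w : line.trim().split("\\s+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        LOG.info("map: emitted " + words.size() + " words");
        return words;
    }

    // "Reduce" step: count occurrences of each word, tracing the total.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        long offset = 0;
        for (String line : lines) {
            for (String w : mapLine(offset, line)) {
                counts.merge(w, 1, Integer::sum);
            }
            offset += line.length() + 1; // +1 for the newline
        }
        LOG.info("reduce: " + counts.size() + " distinct words");
        return counts;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("hello hadoop", "hello world");
        System.out.println(countWords(input));
    }
}
```

The trace lines go to standard error through java.util.logging, so you
can read them in order afterwards and see which input each step handled.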

I think you can learn how to set up a multi-node cluster later, but for
practice you can use a single-node setup.

Let's see what the experts say.

Thanks and Regards
Piyush Garg

On Sunday 15 August 2010 09:07 AM, Rita Liu wrote:
> Hi!
> I am a total beginner, but I am very interested in hadoop. I've already
> downloaded hadoop 0.19.2 and run on Ubuntu in single-node mode. Now I want
> to do two things:
> 1. Explore how hadoop works internally with one of the example applications
> hadoop provides
> 2. Write an application on my own
> Those two things bring me to the following questions:
> a. debugger?
> I am stuck since I don't know how to "explore" hadoop. I used to trace
> through the code using a debugger, but in this case, I don't know if there
> is a good debugger to use; or -- maybe a debugger is not necessary for
> hadoop? If not, then how do you trace through the code to either debug or
> just gain an understanding about the system? May I know what you,
> experienced experts, do? :)
> b. Where to run hadoop?
> Also -- may I know where you run your hadoop? Do you run on linux, or on VM
> -- in particular, Cloudera? I heard that Cloudera is good for writing
> mapreduce applications with hadoop itself as a blackbox; is it true? If my
> ultimate goal is to understand how hadoop works internally, would it be
> better if I directly run it on linux?
> c. Single-node or multi-node?
> In the beginning (just like my case :p) would it be better to use
> single-node or multi-node? If the latter is true, should I obtain more
> machines, or should I use more virtual machines to create more nodes?
> As a newbie, I am sorry for all those basic (and silly, I know :$)
> questions. If possible, please help me out? Any suggestion or advice will be
> greatly appreciated. Thank you very much!
> Best,
> Rita :)
> P.S. If my questions are not suitable for this mailing-list, please let me
> apologize, and then, could you please direct me to other mailing-lists?
> Sorry, and thanks a lot! :)
