hadoop-general mailing list archives

From Rita Liu <crystaldol...@gmail.com>
Subject Hadoop basics
Date Sun, 15 Aug 2010 03:37:15 GMT

I am a total beginner, but I am very interested in hadoop. I've already
downloaded hadoop 0.19.2 and run it on Ubuntu in single-node mode. Now I want
to do two things:

1. Explore how hadoop works internally with one of the example applications
hadoop provides
2. Write an application on my own

These two goals raise the following questions:

a. Debugger?
I am stuck since I don't know how to "explore" hadoop. I usually trace
through code using a debugger, but in this case, I don't know if there
is a good debugger to use; or -- maybe a debugger is not necessary for
hadoop? If not, then how do you trace through the code to either debug or
just gain an understanding of the system? May I know what you,
experienced experts, do? :)

b. Where to run hadoop?
Also -- may I know where you run hadoop? Do you run it on linux, or on a VM
-- in particular, Cloudera? I heard that Cloudera is good for writing
mapreduce applications with hadoop itself as a black box; is that true? If my
ultimate goal is to understand how hadoop works internally, would it be
better to run it directly on linux?

c. Single-node or multi-node?
In the beginning (just like my case :p) would it be better to use
single-node or multi-node? If the latter is true, should I obtain more
machines, or should I use more virtual machines to create more nodes?

As a newbie, I am sorry for all those basic (and silly, I know :$)
questions. If possible, please help me out? Any suggestion or advice will be
greatly appreciated. Thank you very much!

Rita :)

P.S. If my questions are not suitable for this mailing list, I apologize;
could you please direct me to a more appropriate one?
Sorry, and thanks a lot! :)
