hadoop-general mailing list archives

From Rita Liu <crystaldol...@gmail.com>
Subject Re: Hadoop basics
Date Sun, 15 Aug 2010 05:10:52 GMT
Thank you very much, Piyush! I'll do as you say :DD Thanks a lot!!

Thanks Smith :) hmm ... I see. ok :)

Please give me more guidance and suggestions if possible, dear experts!
-Rita :))

On Sat, Aug 14, 2010 at 10:09 PM, smith jack <thinke365@gmail.com> wrote:

> That means you can only trace through the logs;
> it is not possible to step through hadoop in a debugger, haha.
> Distributed systems always introduce extra complexity and confusing issues.
>
> 2010/8/15 Piyush Garg <piyushgarg80@gmail.com>:
> > Hi Rita,
> >
> > You can put log4j logger debug statements in the code. The log4j library
> > is part of the hadoop framework, there is already a log4j.properties file
> > in the hadoop conf directory, and all the output logs are saved in the
> > hadoop logs directory.
> >
> > Thanks and Regards
> > Piyush Garg
> >
> >
> > On Sunday 15 August 2010 10:20 AM, Rita Liu wrote:
> >> Thank you very much, Piyush! :) May I know more about how to use
> >> "traces"?
> >>
> >> And -- yes, please teach me if possible, experts! :)
> >>
> >> Thanks a lot,
> >> -Rita :))
> >>
> >> On Sat, Aug 14, 2010 at 9:42 PM, Piyush Garg <piyushgarg80@gmail.com> wrote:
> >>
> >>
> >>> Hi Rita,
> >>>
> >>> I have just started to learn hadoop as well, and I know there is a long
> >>> way to go.
> >>> I found some useful links which I am sharing with you.
> >>>
> >>> Hadoop Tutorial - YDN
> >>> <http://developer.yahoo.com/hadoop/tutorial/index.html> excellent
> >>> beginners tutorial and well organized.
> >>> Running Hadoop On Ubuntu Linux (Single-Node Cluster) - Michael G. Noll
> >>> <http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29>
> >>> Running Hadoop On Ubuntu Linux (Multi-Node Cluster)
> >>> <http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Multi-Node_Cluster%29>
> >>> The tutorial on the hadoop wiki
> >>> <http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html> is
> >>> too much for a beginner.
> >>>
> >>> Debugger:
> >>> I do not think you can easily debug with a remote debugger. This is
> >>> natural: hadoop programs are not sequential, so its apps would be very
> >>> difficult to debug.
> >>> The only way to debug is to use traces.
> >>>
> >>> I think you can learn how to set up a multi-node cluster, but for
> >>> practice sessions you can use a single-node setup.
> >>>
> >>> Let's see what the experts say.
> >>>
> >>> Thanks and Regards
> >>> Piyush Garg
> >>>
> >>>
> >>> On Sunday 15 August 2010 09:07 AM, Rita Liu wrote:
> >>>
> >>>> Hi!
> >>>>
> >>>> I am a total beginner, but I am very interested in hadoop. I've already
> >>>> downloaded hadoop 0.19.2 and run it on Ubuntu in single-node mode. Now I
> >>>> want to do two things:
> >>>>
> >>>> 1. Explore how hadoop works internally with one of the example
> >>>> applications hadoop provides
> >>>> 2. Write an application on my own
> >>>>
> >>>> Those two goals bring me to the following questions:
> >>>>
> >>>> a. debugger?
> >>>> I am stuck since I don't know how to "explore" hadoop. I used to trace
> >>>> through the code using a debugger, but in this case, I don't know if
> >>>> there is a good debugger to use; or -- maybe a debugger is not necessary
> >>>> for hadoop? If not, then how do you trace through the code to either
> >>>> debug or just gain an understanding about the system? May I know what
> >>>> you, experienced experts, do? :)
> >>>>
> >>>> b. Where to run hadoop?
> >>>> Also -- may I know where you run your hadoop? Do you run on linux, or on
> >>>> VM -- in particular, Cloudera? I heard that Cloudera is good for writing
> >>>> mapreduce applications with hadoop itself as a blackbox; is it true? If my
> >>>> ultimate goal is to understand how hadoop works internally, would it be
> >>>> better if I directly run it on linux?
> >>>>
> >>>> c. Single-node or multi-node?
> >>>> In the beginning (just like my case :p) would it be better to use
> >>>> single-node or multi-node? If the latter is true, should I obtain more
> >>>> machines, or should I use more virtual machines to create more nodes?
> >>>>
> >>>> As a newbie, I am sorry for all those basic (and silly, I know :$)
> >>>> questions. If possible, please help me out? Any suggestion or advice will
> >>>> be greatly appreciated. Thank you very much!
> >>>>
> >>>> Best,
> >>>> Rita :)
> >>>>
> >>>> P.S. If my questions are not suitable for this mailing-list, please let
> >>>> me apologize, and then, could you please direct me to other mailing-lists?
> >>>> Sorry, and thanks a lot! :)
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >
>
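
To illustrate the trace-by-logging approach suggested in the thread above, here is a minimal sketch in Java. Some assumptions: TraceDemo and countWords are hypothetical names, not Hadoop classes, and the sketch uses java.util.logging from the JDK instead of log4j so that it runs without extra jars. Hadoop's own classes log through Apache Commons Logging backed by log4j, where the rough equivalent of LOG.fine(...) is LOG.debug(...), and the level would normally be raised in conf/log4j.properties rather than in code.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the map step of a word-count job; not Hadoop code.
final class TraceDemo {
    // One static logger per class, named after the class.
    private static final Logger LOG = Logger.getLogger(TraceDemo.class.getName());

    // Counts the words in one line and traces each step at FINE
    // (roughly what DEBUG is in log4j).
    static Map<String, Integer> countWords(String line) {
        LOG.fine("input line: " + line);
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : line.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // a blank line splits into a single empty token
            }
            counts.merge(word, 1, Integer::sum);
            LOG.fine("emit (" + word + ", 1)");
        }
        LOG.fine("result: " + counts);
        return counts;
    }

    public static void main(String[] args) {
        // Raise verbosity in code for this demo only; with log4j this
        // would be a setting in the properties file instead.
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.FINE);
        LOG.addHandler(handler);
        LOG.setLevel(Level.FINE);
        LOG.setUseParentHandlers(false);

        System.out.println(countWords("to be or not to be"));
    }
}
```

Running main writes the trace lines to the console and then prints the final counts. In a real job, similar debug statements inside a mapper or reducer should show up in the files under the hadoop logs directory once the matching logger is set to DEBUG.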
