hadoop-common-dev mailing list archives

From Ronnie Ghose <ronnie.gh...@gmail.com>
Subject Re: How to understand Hadoop source code ?
Date Thu, 18 Apr 2013 17:47:12 GMT
+1 I'm one of those new people :)


On Thu, Apr 18, 2013 at 1:32 PM, Noelle Jakusz (c) <njakusz@vmware.com> wrote:

> +1
>
> There are quite a few new people, so maybe start a collaborative group
> where you can collect notes and steps (videos and articles). I have some
> that I created as I got started, and it would be a great idea to post
> them after some collaboration and review.
>
> Thanks Chris for the detailed reply...
>
> -----Original Message-----
> From: Chris Nauroth [mailto:cnauroth@hortonworks.com]
> Sent: Thursday, April 18, 2013 1:14 PM
> To: common-dev@hadoop.apache.org
> Subject: Re: How to understand Hadoop source code ?
>
> Is there a specific bug fix or feature that you are trying to contribute?
> Specific questions like "how can I help with jira X?" or "what is the
> main entry point when I run the hdfs command?" or "where does the namenode
> serialize metadata to disk?" or "where does the secondary namenode execute
> a checkpoint?" can help focus the conversation.
>
> AFAIK, we don't have a general code walkthrough document focused on
> onboarding new engineers.  This could be a valuable contribution if you
> want to gather notes while you learn.  I think this always works best if
> it's driven by a new engineer with review by an expert.  (If the experts
> write it, then they might accidentally skip something non-obvious that
> they've already internalized.)
>
> Since that document doesn't exist yet, the other option is to do some
> reading of the code, ideally while trying to fix a specific bug that has
> been filed in jira.  Like you said, it's a relatively large codebase, so
> it's impractical to read the whole thing top-to-bottom.  Instead, it's
> important to look for high-level clues that steer you towards the right
> files.  I've found that the Maven module structure and the Java package
> names are usually descriptive enough to steer me in the right direction.
>  If you focus on getting familiar with those, you'll basically build a
> btree inside your brain that helps you index into the right part of the
> codebase and answer your own questions rapidly.  Several examples:
>
> "Where is the main entry point for the datanode daemon?": module
> hadoop-hdfs, package org.apache.hadoop.hdfs.server.datanode
>
> "What is the algorithm for rebalancing an unbalanced cluster?": module
> hadoop-hdfs, package org.apache.hadoop.hdfs.server.balancer
>
> "How does YARN launch a new container process?": module
> hadoop-yarn-server-nodemanager, package
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher
>
> "Multiple daemons publish JMX metrics as a common concern.  Where is that
> implemented?": module hadoop-common, package org.apache.hadoop.metrics2
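
The module-plus-package lookups above can be sketched in code. This is only a minimal illustration, assuming the standard Maven layout (sources under src/main/java inside each module); the class name PackageLocator and the method sourceDir are hypothetical, and the module directory is passed in because its location within a checkout is not stated in this thread.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class PackageLocator {

    /**
     * Maps a Maven module directory and a Java package name to the
     * directory that would hold that package's sources, assuming the
     * standard Maven layout (src/main/java).
     */
    static Path sourceDir(Path moduleDir, String packageName) {
        Path dir = moduleDir.resolve(Paths.get("src", "main", "java"));
        // Each dot-separated package segment becomes one directory level.
        for (String part : packageName.split("\\.")) {
            dir = dir.resolve(part);
        }
        return dir;
    }

    public static void main(String[] args) {
        // Assumption: the first argument is the path to the module in your
        // checkout; "hadoop-hdfs" is only a placeholder default.
        Path module = Paths.get(args.length > 0 ? args[0] : "hadoop-hdfs");
        String pkg = args.length > 1
                ? args[1]
                : "org.apache.hadoop.hdfs.server.datanode";
        System.out.println(sourceDir(module, pkg));
    }
}
```

Run it with a module path and a package name, then list the resulting directory to see the candidate classes for that question.
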
>
> I hope this is helpful to get the process started for you.  We're always
> here to help if you have specific follow-up questions.
>
> Thanks,
> --Chris
>
>
> On Wed, Apr 17, 2013 at 10:33 PM, Prabakaran Krishnan <
> prabakaran_j2ee@yahoo.in> wrote:
>
> > Could you please help me understand MapReduce in Hadoop?
> >
> >
> >
> > ________________________________
> > From: Mohammad Mustaqeem <3m.mustaqeem@gmail.com>
> > To: common-dev <common-dev@hadoop.apache.org>
> > Sent: Thursday, 18 April 2013 10:44 AM
> > Subject: Re: How to understand Hadoop source code ?
> >
> >
> > I am interested in HDFS. Please guide me.
> >
> >
> > On Thu, Apr 18, 2013 at 3:36 AM, Arun C Murthy <acm@hortonworks.com>
> > wrote:
> >
> > > Please don't cross post.
> > >
> > > What parts of Hadoop are you interested in? HDFS? YARN? MapReduce?
> > >
> > > Arun
> > >
> > > On Apr 17, 2013, at 2:50 PM, Mohammad Mustaqeem wrote:
> > >
> > > > Hello everyone,
> > > > I am new to this group. Since the source code of Hadoop is very
> > > > big, I am not able to understand it entirely.
> > > > Is there any document that describes the code?
> > > > Is there any way to understand the functionality of each class
> > > > and its methods?
> > > >
> > > >
> > > > --
> > > > *With regards ---*
> > > > *Mohammad Mustaqeem*,
> > > > M.Tech (CSE)
> > > > MNNIT Allahabad
> > >
> > > --
> > > Arun C. Murthy
> > > Hortonworks Inc.
> > > http://hortonworks.com/
> > >
> > >
> > >
> >
> >
> > --
> > *With regards ---*
> > *Mohammad Mustaqeem*,
> > M.Tech (CSE)
> > MNNIT Allahabad
> > 9026604270
> >
>
