From: Jay Vyas
To: user@hadoop.apache.org
Subject: Re: understanding souce code structure
Date: Mon, 27 May 2013 14:22:42 -0400

Hi! A few weeks ago I had the same question. I tried a first iteration at documenting this by going through the classes, starting with key/value pairs, in the blog post below.

http://jayunit100.blogspot.com/2013/04/the-kv-pair-salmon-run-in-mapreduce-hdfs.html

Note it's not perfect yet, but I think it should provide some insight into things. The linchpin of it all is the DFSOutputStream and DataStreamer classes. Anyway, feel free to borrow the contents and roll your own, comment on it and leave some feedback, or let me know if anything is missing.

It would definitely be awesome to have a rock-solid view of the full write path.

On May 27, 2013, at 2:10 PM, Mahmood Naderan <nt_mahmood@yahoo.com> wrote:

> Hello
>
> I am trying to understand the source code of Hadoop, especially HDFS. I want to know exactly where to look in the source code to see how HDFS distributes the data, and also how the MapReduce engine reads the data.
>
> Any hint regarding the location of those in the source code is appreciated.
>
> Regards,
> Mahmood