From: Mike Wenzel
To: user@hadoop.apache.org
Subject: Looking for documentation/guides on Hadoop 2.7.2
Date: Thu, 9 Jun 2016 09:15:27 +0000

Hey everyone. A few weeks ago I started learning about Hadoop. I was given the task of understanding the Hadoop ecosystem and being able to answer questions about it. I started by reading the O'Reilly book "Hadoop: The Definitive Guide". After reading it I had a first idea of how the components work together, but the book still didn't help me understand what's actually going on: it goes deep into the details of the individual components, which didn't help me see the ecosystem as a whole.

So I started working with it hands-on. I installed a VM (SUSE Leap 42.1) and followed the https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html guide.

After that I started working with files on it. I wrote my first simple mapper and reducer and analyzed my Apache log as a test; that worked well so far.

But let's get to my problems:

1) Everything I know about installing Hadoop right now amounts to: unpack a .tar.gz, run some shell scripts, and everything runs fine. I have no clue at all which components are now installed on the VM, or where they are located.

2) Furthermore, I'm missing all kinds of information about setting those components up. At one point the Apache guide says "Now check that you can ssh to the localhost without a passphrase" and "If you cannot ssh to localhost without a passphrase, execute the following commands:". I'd like to know what I'm actually doing there. WHY do I need ssh to localhost at all, WHY does it have to work without a passphrase, and what other ways are there to configure this?

3) Same with the next step: "The following instructions are to run a MapReduce job locally. If you want to execute a job on YARN, see YARN on Single Node." and "Format the filesystem: $ bin/hdfs namenode -format". I have no clue how HDFS works internally. To me, a filesystem is something I set up on partitions and mount onto directories, so how am I supposed to explain HDFS to someone else?

I understand the basic storage model: files are split into blocks, the blocks are spread across the cluster, and metadata is stored separately. But if someone asks me "How can this be called a filesystem if you install it by unpacking a .tar.gz?", I simply can't answer the question.

So I'm now looking for documentation/guides that cover:
- What requirements do I have?
-- Do I have to use a specific (local) filesystem? Either way: why, and what would you recommend?
-- How should I partition my VM?
-- On which partition should I install which components?
- Setting up a VM with Hadoop
- Configuring Hadoop step by step (a minimal example of what I mean is sketched after this list)
- Setting up all the daemons/nodes manually, explaining where they live, how they work, and how they should be configured

I'm currently reading https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/ClusterSetup.html, but after a first pass this guide tells you what to write into which configuration file, not why you should (or shouldn't) do so. Having just gotten an idea of what Hadoop is, I feel a bit left alone in the dark, and I hope some of you can show me a way back onto the road.

For me it's very important not to just write some configuration somewhere. I need to understand what's going on, because once I have a running cluster I must be sure I can handle all of this before it goes into production use.

Best Regards

Mike