Date: Sun, 10 Aug 2014 19:23:41 +0200
Subject: Yarn, MRv1, MRv2 lots of newbie doubts and questions
From: Sebastiano Di Paola
To: user@hadoop.apache.org

Hi all,

I'm a new Hadoop user, and Hadoop 2.4.1 is my first installation. I'm now struggling with mapred, mapreduce, YARN, MRv1 and MRv2. I tried to read the documentation, but I couldn't find a clear answer. Sometimes it seems that the documentation assumes you already know the whole history of the Hadoop framework... :(

I started with a standalone node, of course, but I have also deployed a cluster of 10 machines.

I started with the example from the documentation. With the cluster installed and DFS running (started with start-dfs.sh), I run:

bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar grep input output 'dfs[a-z.]+'

What am I using here: MRv1 or MRv2? The job executes successfully and I can get the output from the output directory on HDFS.

Then, on the same installation, I start YARN with start-yarn.sh and run the same command again:

bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar grep input output 'dfs[a-z.]+'

So what am I using in this case?

I'm not sure what the difference between MapReduce and YARN is. Is MapReduce running on top of YARN? How does MapReduce interact with YARN? Is it completely transparent?

What's the difference between a MapReduce application and a YARN application? (Forgive me if it's not correct to talk about a "MapReduce application".)

Besides that, when writing a completely new MapReduce application, which API should be used so as not to write deprecated/old Hadoop-style code: mapred or mapreduce?

Thanks a lot.
Kind regards,
Seba