Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Received-SPF: pass (nike.apache.org: domain of yypvsxf19870706@gmail.com
 designates 209.85.210.51 as permitted sender)
Subject: Re: Uber Job!
References: 
 <CAO7hTbPEjOzNMygaWbHaEyzHvRdTFkjy6sKFrytrPOYUDWwOtg@mail.gmail.com>
From: yypvsxf19870706 <yypvsxf19870706@gmail.com>
Content-Type: multipart/alternative;
	boundary=Apple-Mail-1FB6E353-A049-46FE-B60D-6F84CE7A3FFD
In-Reply-To: 
 <CAO7hTbPEjOzNMygaWbHaEyzHvRdTFkjy6sKFrytrPOYUDWwOtg@mail.gmail.com>
Message-Id: <0EFB395D-6BB2-4362-91EF-E1B9CCD23E3D@gmail.com>
Date: Mon, 6 May 2013 23:25:52 +0800
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (1.0)


--Apple-Mail-1FB6E353-A049-46FE-B60D-6F84CE7A3FFD
Content-Type: text/plain;
	charset=GB2312
Content-Transfer-Encoding: quoted-printable

Hi

    Suppose your input is 10 files with a total size of 64 MB. I think
you will get 10 maps, one per file, because a split never spans files.
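
    To make the split arithmetic concrete, here is a rough sketch in
Python (it is not Hadoop code). It assumes the usual FileInputFormat
rule that the split size is max(min_size, min(max_size, block_size)),
and that every file yields at least one split. The names split_size
and count_splits are made up for illustration, and the sketch ignores
the small slop Hadoop allows on the last split of a file.

```python
MB = 1024 * 1024

def split_size(block_size, min_size=1, max_size=2**63 - 1):
    # Assumed rule: max(min_size, min(max_size, block_size)).
    return max(min_size, min(max_size, block_size))

def count_splits(file_sizes, block_size, min_size=1, max_size=2**63 - 1):
    size = split_size(block_size, min_size, max_size)
    # Splits never span files, so each file is counted on its own.
    # -(-n // size) is integer ceiling division; max(1, ...) is the
    # assumption that even a tiny file gets one split.
    return sum(max(1, -(-n // size)) for n in file_sizes)

# Ten small files adding up to about 64 MB: one split (one map) each.
print(count_splits([64 * MB // 10] * 10, block_size=64 * MB))

# A single 64 MB file with the max split size lowered to 8 MB.
print(count_splits([64 * MB], block_size=64 * MB, max_size=8 * MB))
```

    So several maps over 64 MB can come either from many small input
files or from a lowered max split size on one file.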

    By the way, uber mode exists only on YARN. Suppose you actually
have 1 map: YARN will create at least two containers, one for the
application master and one for the map. If uber mode is enabled, YARN
will create only 1 container for both the application master and the
map.
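
    The same idea as a small Python sketch (again not Hadoop code).
The function names are hypothetical. The thresholds are parameters;
the defaults used here (9 maps, 1 reduce, one block of input) are what
I believe the mapreduce.job.ubertask.* settings ship with, so please
check them against your own mapred-default.xml. The real check also
looks at memory and CPU limits, which this sketch leaves out.

```python
MB = 1024 * 1024

def is_uber(num_maps, num_reduces, input_bytes, block_size,
            max_maps=9, max_reduces=1, max_bytes=None):
    # Assumption: the byte limit falls back to one DFS block.
    if max_bytes is None:
        max_bytes = block_size
    return (num_maps <= max_maps
            and num_reduces <= max_reduces
            and input_bytes <= max_bytes)

def containers_needed(num_tasks, uber):
    # Uber: every task runs inside the application master's container.
    # Otherwise: one container for the application master plus at
    # least one per task.
    return 1 if uber else 1 + num_tasks

uber = is_uber(num_maps=1, num_reduces=0, input_bytes=10 * MB,
               block_size=64 * MB)
print(uber, containers_needed(1, uber))    # uber job, one container
print(containers_needed(1, uber=False))    # normal job, two containers
```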

Sent from my iPhone

On 2013-5-6, at 22:45, Rahul Bhattacharjee <rahul.rec.dgp@gmail.com>
wrote:

> Hi,
>
> I was going through the definition of an uber job in Hadoop.
>
> A job is considered uber when it has 10 or fewer maps, one reducer,
> and the complete data is less than one DFS block in size.
>
> I have some doubts here.
>
> Splits are created as per the DFS block size. Creating 10 mappers
> from one block of data is possible by changing some settings (the
> max split size). But I am trying to understand why a job would need
> to run around 10 maps for 64 MB of data.
> One possibility is that the job is immensely CPU intensive. Would
> that be a correct assumption, or is there any other reason for this?
>
> Thanks,
> Rahul
>

--Apple-Mail-1FB6E353-A049-46FE-B60D-6F84CE7A3FFD--