Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Received-SPF: pass (nike.apache.org: domain of sudhakara.st@gmail.com
 designates 209.85.214.172 as permitted sender)
MIME-Version: 1.0
In-Reply-To: <20140127053904.7426edd2@gmail.com>
References: <20140127053904.7426edd2@gmail.com>
Date: Mon, 27 Jan 2014 17:27:39 +0530
Message-ID: 
 <CAPE0Onigqz6rv6aE8hUeC9k-MLoaJbqUhZssdoPSo1iZgkzQCQ@mail.gmail.com>
Subject: Re: Performance in running jobs at the same time
From: sudhakara st <sudhakara.st@gmail.com>
To: user@hadoop.apache.org
Content-Type: multipart/alternative; boundary=001a11c2d3b6186d6604f0f26a0c

--001a11c2d3b6186d6604f0f26a0c
Content-Type: text/plain; charset=ISO-8859-1

1 - I installed Hadoop MRv2 in VirtualMachines. When the jobs are
running, I try to list them with "hadoop jobs -list", but it takes lots
of time for the command being executed. This happens because of the
performance of the VM. I just wonder how it works with big machines.
Does anyone have an idea if it takes long to launch Hadoop commands
while executing jobs?


*>> Getting job information involves communication with the
ResourceManager and the ApplicationMaster. Because the resources (CPU,
memory) available in your VM are too low, the hadoop command may take a
long time to return job information.*

2 - I want to run several jobs at the same time. How can I configure
the maximum number of jobs that I can run at the same time?

*>> Once you submit your job to the RM, the scheduler decides how to run
it, based on which scheduler you use and on the resources available in
your cluster. To control the submission order or the number of jobs
running at any instant, you have to write or customize a scheduler.*
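
If changing the scheduler is more than you need, one client-side
workaround is to cap how many jobs you submit at once. Below is a minimal
Python sketch, assuming each command blocks until its job finishes (as
"hadoop jar" does when the driver calls waitForCompletion). The function
name run_jobs and the jar, class and paths in the comment are hypothetical.
It only throttles your own submissions; it does not limit other users of
the cluster.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_jobs(commands, max_concurrent, runner=subprocess.call):
    """Run each command, keeping at most max_concurrent in flight.

    commands: list of argument lists, one per job.
    runner:   callable that blocks until one job finishes and returns
              its exit code (subprocess.call by default).
    Returns the exit codes in the same order as commands.
    """
    # The pool has max_concurrent worker threads, so no more than that
    # many runner calls (and therefore jobs) are active at any moment.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(runner, commands))


# Example call (hypothetical jar, class and paths):
#   jobs = [["hadoop", "jar", "my-job.jar", "MyJob", "/in/1", "/out/1"],
#           ["hadoop", "jar", "my-job.jar", "MyJob", "/in/2", "/out/2"]]
#   exit_codes = run_jobs(jobs, max_concurrent=1)
```

The remaining jobs wait in the client until a slot frees up, so the
cluster never sees more than max_concurrent of them at a time.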

3 - Is there a calculation of how many jobs I can run at the same time
for specific environment similar to how many reduces should we set in
our jobs?

*>> If you have a clear idea of how much data your jobs are going to
process, how many resources they will use, and how many resources are
available in the cluster in total, then you can work out how many jobs can
run at a given time. This is possible only when you handle a fixed data set
in every cycle; in a real environment it is not possible to calculate
these things for each job in each run. In Hadoop 2 the RM takes care of all
resource management, so you need not take special care of these things. If
you need jobs processed in a particular order, look at a tool such as Oozie
to control the order of MR jobs.*
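
For the fixed-workload case, the estimate can be sketched as simple
arithmetic on memory. This is only a rough illustration: the function name
and all the numbers are made up, it looks at memory alone, and it assumes
every job holds one ApplicationMaster container plus a fixed number of
task containers for its whole lifetime, which real jobs do not.

```python
def max_concurrent_jobs(cluster_mb, am_mb, task_mb, tasks_per_job):
    """Estimate how many jobs fit in the cluster at full parallelism.

    cluster_mb:    total memory YARN can hand out across all nodes
    am_mb:         memory of one ApplicationMaster container
    task_mb:       memory of one map/reduce task container
    tasks_per_job: task containers a job runs at the same time
    """
    per_job_mb = am_mb + task_mb * tasks_per_job
    return cluster_mb // per_job_mb


# Illustrative numbers only: 8 nodes with 8192 MB each, a 1536 MB AM,
# 1024 MB tasks, 10 tasks running at once per job.
# Per job: 1536 + 10 * 1024 = 11776 MB; 65536 // 11776 = 5.
print(max_concurrent_jobs(8 * 8192, 1536, 1024, 10))  # prints 5
```

More jobs than this can still be accepted; they just share the containers
and each one runs with fewer tasks in parallel.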


On Mon, Jan 27, 2014 at 11:09 AM, xeon <xeonmailinglist@gmail.com> wrote:

> Hi,
>
> 1 - I installed Hadoop MRv2 in VirtualMachines. When the jobs are
> running, I try to list them with "hadoop jobs -list", but it takes lots
> of time for the command being executed. This happens because of the
> performance of the VM. I just wonder how it works with big machines.
> Does anyone have an idea if it takes long to launch Hadoop commands
> while executing jobs?
>
>
> 2 - I want to run several jobs at the same time. How can I configure
> the maximum number of jobs that I can run at the same time?
>
>
> 3 - Is there a calculation of how many jobs I can run at the same time
> for specific environment similar to how many reduces should we set in
> our jobs?
>
> Thanks,
>
> --
> Best regards,
>


-- 

Regards,
...Sudhakara.st

--001a11c2d3b6186d6604f0f26a0c--