Subject: Re: Question about how to find which file takes the longest time to process and how to assign more mappers to process that particular file
From: Hemanth Yamijala <hemanty@thoughtworks.com>
To: user@hadoop.apache.org
Date: Fri, 5 Oct 2012 09:51:15 +0530

Hi,

Roughly, this information is available on the 'Hadoop map task list' page of the MapReduce web UI (in Hadoop-1.0, which I am assuming is what you are using). You can reach this page by selecting the running tasks link on the job information page. The page has a table that lists all the tasks, and the status column tells you which part of the input each task is processing. Please note that, depending on the input format chosen, a task may be processing a *part* of a file, and not necessarily a file itself.
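If you want the same information in the task logs rather than the UI, something along the lines of the sketch below should work. This is just a sketch, assuming the new org.apache.hadoop.mapreduce API and a file-based input format such as TextInputFormat; the class name and the key/value types are placeholders for whatever your job actually uses:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

// Hypothetical mapper - key/value types are placeholders for your job's own.
public class SplitLoggingMapper
        extends Mapper<LongWritable, Text, Text, LongWritable> {

    @Override
    protected void setup(Context context)
            throws IOException, InterruptedException {
        // File-based input formats hand each task a FileSplit: one file
        // plus the byte range [start, start + length) it should read.
        FileSplit split = (FileSplit) context.getInputSplit();
        System.err.println("Processing " + split.getPath()
                + " bytes " + split.getStart()
                + "-" + (split.getStart() + split.getLength()));
        // This ends up in the task's stderr log, reachable from the web UI.
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // ... your actual map logic ...
    }
}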
Another good source of information for seeing why these particular tasks are slow is the job's counters. Again, these counters can be accessed from the task list page of the web UI.
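If you'd rather pull the counters programmatically than read them off the web UI, a rough sketch like the one below, using the old mapred JobClient API, should dump every counter for a job given its id (the class name here is made up; pass your actual job id on the command line):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.Counters;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.RunningJob;

// Rough sketch: dump all counters of a job given its id, e.g.
// java JobCounterDump job_201210050921_0001 (the id is made up).
public class JobCounterDump {
    public static void main(String[] args) throws Exception {
        JobClient client = new JobClient(new JobConf(new Configuration()));
        RunningJob job = client.getJob(JobID.forName(args[0]));
        if (job == null) {
            System.err.println("No such job: " + args[0]);
            return;
        }
        Counters counters = job.getCounters();
        for (Counters.Group group : counters) {
            System.out.println(group.getDisplayName());
            for (Counters.Counter counter : group) {
                System.out.println("  " + counter.getDisplayName()
                        + " = " + counter.getValue());
            }
        }
    }
}

Note that these are job-level aggregates; the per-task counters (which are what you'd compare to spot the slow tasks) are easiest to read from the task details page in the UI.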
It would help if you could provide more information - like what job you're trying to run, the input format specified, etc.

Thanks
hemanth

On Fri, Oct 5, 2012 at 3:33 AM, Huanchen Zhang wrote:
> Hello,
>
> I have a question about how to find which file takes the longest time to
> process and how to assign more mappers to process that particular file.
>
> Currently, about three mappers take about five times longer than the rest
> to complete. So, how can I detect which specific files those three mappers
> are processing? If the above is doable, how can I assign more mappers to
> process those specific files?
>
> Thank you!
>
> Best,
> Huanchen