hadoop-common-user mailing list archives

From Ted Dunning <tdunn...@veoh.com>
Subject Re: Slow reduce task
Date Mon, 01 Oct 2007 16:47:10 GMT

You seem to have no data in your cluster.  I wouldn't think that would cause
the hang that you observed, but it does limit how useful the cluster is.
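
As a rough illustration of how the fsck summary quoted below leads to that reading, here is a minimal Python sketch. The function names and the sample text are hypothetical helpers for this note, not part of Hadoop; the sketch only assumes the summary is printed as "Key: value" lines, as in the quoted output.

```python
def parse_fsck_summary(text):
    """Turn 'Key: value' lines from an fsck summary into a dict of strings."""
    summary = {}
    for line in text.splitlines():
        # Drop mail quoting ('> ') and surrounding whitespace, if any.
        line = line.lstrip("> ").strip()
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        summary[key.strip()] = value.strip()
    return summary


def is_empty_filesystem(summary):
    """True when the summary reports no files and no blocks."""
    return summary.get("Total files") == "0" and summary.get("Total blocks") == "0"


# Sample copied from the summary quoted in this thread (truncated).
SAMPLE = """Status: HEALTHY
 Total size:    0 B
 Total blocks:  0
 Total dirs:    6
 Total files:   0
"""

if __name__ == "__main__":
    print(is_empty_filesystem(parse_fsck_summary(SAMPLE)))
```

A filesystem can report HEALTHY and still be empty, which is the situation here: six directories, no files, no blocks.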


On 10/1/07 9:42 AM, "Ming Yang" <minghsien@gmail.com> wrote:

> Below is the output from hadoop fsck / :
> 
> Status: HEALTHY
>  Total size:    0 B
>  Total blocks:  0
>  Total dirs:    6
>  Total files:   0
>  Over-replicated blocks:        0
>  Under-replicated blocks:       0
>  Target replication factor:     2
>  Real replication factor:       0.0
> 
> 
> The filesystem under path '/' is HEALTHY.
> 
> ************************
> 
> I am also wondering about failure handling. According to Google's paper
> about MapReduce, if a node fails or does not respond for a given amount
> of time, the master will reassign the job to the other nodes. Is that also
> true in Hadoop's implementation? I didn't see any job reassignment even
> though a job has been pending for a long time.
> 
> Thanks,
> 
> Ming
> 
> 2007/10/1, Ted Dunning <tdunning@veoh.com>:
>> 
>> What does [hadoop fsck /] show?
>> 
>> 
>> On 10/1/07 5:36 AM, "Ming Yang" <minghsien@gmail.com> wrote:
>> 
>>> Hi,
>>> 
>>> I am using hadoop 0.14.1 on Ubuntu 7.04 (kernel version 2.6.20).
>>> The Java version is 1.5.0.12. There are no failed tasks and no
>>> lost task trackers. What I observed is the machine only finished
>>> part of the reduce tasks and became idle. Could the issue come
>>> from my HDFS since the status showed the transfer rate is so low?
>>> 
>>> Thanks,
>>> 
>>> Ming
>>> 
>>> 
>>> 2007/9/30, Arun C Murthy <arunc@yahoo-inc.com>:
>>>> Ming Yang,
>>>> 
>>>> On Sun, Sep 30, 2007 at 01:13:07PM -0400, Ming Yang wrote:
>>>>> Hi,
>>>>> 
>>>>> I set up a 2-node Hadoop cluster, whose nodes are all in
>>>>> the same network and ran the 'grep' example. The map tasks
>>>>> were distributed among the two machines and ran without any
>>>>> problem. However, the reduce task, which is running at the slave
>>>>> node, doesn't seem to finish and stops at 11%. I checked the
>>>>> reduce task tracker and it shows the following message:
>>>>> 
>>>>> reduce > copy (5 of 15 at 0.00 MB/s) >
>>>>> 
>>>>> Can anyone let me know where the problem comes from?
>>>>> and how to fix it? I really appreciate it!
>>>>> 
>>>> 
>>>> Could you provide details on the hadoop version, platform etc.? Were there
>>>> any failed tasks, lost task-trackers?
>>>> 
>>>> Arun
>>>> 
>>>>> Thank you,
>>>>> 
>>>>> Ming Yang
>>>> 
>> 
>> 
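
The 11% figure in the original report is consistent with the status line "copy (5 of 15 at 0.00 MB/s)" if reduce progress is divided evenly across three phases (copy, sort, reduce). That even split is an assumption about how progress is reported, not something stated in this thread, and the function below is a hypothetical helper rather than a Hadoop API. A short Python sketch of the arithmetic:

```python
def reduce_progress_in_copy_phase(copied, total_outputs, phases=3):
    """Overall reduce progress while the task is still copying map outputs.

    Assumes each of the `phases` stages contributes an equal share of the
    total, so the copy phase alone can reach at most 1/phases.
    """
    if total_outputs <= 0:
        raise ValueError("total_outputs must be positive")
    copy_fraction = copied / total_outputs
    return copy_fraction / phases


if __name__ == "__main__":
    # 5 of 15 map outputs copied, as in the status line above.
    print(f"{reduce_progress_in_copy_phase(5, 15):.1%}")  # prints 11.1%
```

Under that assumption, a reduce stuck at 11% means the task has fetched a third of the map outputs and is stalled in the copy phase, which matches the 0.00 MB/s transfer rate shown.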

