hadoop-common-user mailing list archives

From Yunhong Gu1 <y...@cs.uic.edu>
Subject Re: Reduce hangs
Date Fri, 18 Jan 2008 22:13:03 GMT

Hi, Miles,

Thanks for the information. I applied this setting, but the problem persists.
By the way, when the hang occurs, the CPUs are idle.

Yunhong

On Fri, 18 Jan 2008, Miles Osborne wrote:

> I had the same problem.  If I recall, the fix is to add the following to
> your hadoop-site.xml file:
>
> <property>
> <name>mapred.reduce.copy.backoff</name>
> <value>5</value>
> </property>
>
> See HADOOP-1984
>
> Miles
>
>
> On 18/01/2008, Yunhong Gu1 <ygu1@cs.uic.edu> wrote:
>>
>>
>> Hi,
>>
>> If someone knows how to fix the problem described below, please help me
>> out. Thanks!
>>
>> I am testing Hadoop on a 2-node cluster and the "reduce" always hangs at
>> some stage, even if I use different clusters. My OS is Debian Linux kernel
>> 2.6 (AMD Opteron w/ 4GB Mem). Hadoop version is 0.15.2. Java version is
>> 1.5.0_01-b08.
>>
>> I simply ran "./bin/hadoop jar hadoop-0.15.2-test.jar mrbench" and when
>> the map stage finishes, the reduce stage hangs somewhere in the
>> middle, sometimes at 0%. I also tried every other MapReduce program I could
>> find in the example jar package, but they all hang.
>>
>> The log file simply prints
>> 2008-01-18 15:15:50,831 INFO org.apache.hadoop.mapred.TaskTracker:
>> task_200801181424_0004_r_000000_0 0.0% reduce > copy >
>> 2008-01-18 15:15:56,841 INFO org.apache.hadoop.mapred.TaskTracker:
>> task_200801181424_0004_r_000000_0 0.0% reduce > copy >
>> 2008-01-18 15:16:02,850 INFO org.apache.hadoop.mapred.TaskTracker:
>> task_200801181424_0004_r_000000_0 0.0% reduce > copy >
>>
>> forever.
>>
>> The program does work if I start Hadoop on only a single node.
>>
>> Below is my hadoop-site.xml configuration:
>>
>> <configuration>
>>
>> <property>
>>     <name>fs.default.name</name>
>>     <value>10.0.0.1:60000</value>
>> </property>
>>
>> <property>
>>     <name>mapred.job.tracker</name>
>>     <value>10.0.0.1:60001</value>
>> </property>
>>
>> <property>
>>     <name>dfs.data.dir</name>
>>     <value>/raid/hadoop/data</value>
>> </property>
>>
>> <property>
>>     <name>mapred.local.dir</name>
>>     <value>/raid/hadoop/mapred</value>
>> </property>
>>
>> <property>
>>    <name>hadoop.tmp.dir</name>
>>    <value>/raid/hadoop/tmp</value>
>> </property>
>>
>> <property>
>>    <name>mapred.child.java.opts</name>
>>    <value>-Xmx1024m</value>
>> </property>
>>
>> <property>
>>    <name>mapred.tasktracker.tasks.maximum</name>
>>    <value>4</value>
>> </property>
>>
>> <!--
>> <property>
>>    <name>mapred.map.tasks</name>
>>    <value>7</value>
>> </property>
>>
>> <property>
>>    <name>mapred.reduce.tasks</name>
>>    <value>3</value>
>> </property>
>> -->
>>
>> <property>
>>    <name>fs.inmemory.size.mb</name>
>>    <value>200</value>
>> </property>
>>
>> <property>
>>    <name>dfs.block.size</name>
>>    <value>134217728</value>
>> </property>
>>
>> <property>
>>    <name>io.sort.factor</name>
>>    <value>100</value>
>> </property>
>>
>> <property>
>>    <name>io.sort.mb</name>
>>    <value>200</value>
>> </property>
>>
>> <property>
>>    <name>io.file.buffer.size</name>
>>    <value>131072</value>
>> </property>
>>
>> </configuration>
>>
>>
>
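
For reference, a minimal sketch of how the property Miles suggested would sit inside the hadoop-site.xml quoted above. It assumes the property is simply added alongside the existing entries. Only one existing property is repeated for context, and the value 5 is taken directly from Miles's message, not from any tuning done here.

```xml
<configuration>

<!-- Existing properties from the quoted configuration stay unchanged;
     only one is repeated here for context. -->
<property>
    <name>mapred.job.tracker</name>
    <value>10.0.0.1:60001</value>
</property>

<!-- Setting suggested by Miles (see HADOOP-1984); value copied from his message. -->
<property>
    <name>mapred.reduce.copy.backoff</name>
    <value>5</value>
</property>

</configuration>
```

As reported at the top of this message, this setting alone did not resolve the hang on this cluster.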
