hadoop-common-dev mailing list archives

From "Michael Fuchs (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-5084) Reduce output data is not written to disk
Date Tue, 20 Jan 2009 12:38:59 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-5084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Fuchs updated HADOOP-5084:
----------------------------------

    Description: 
I have run into a critical issue with Hadoop 0.18.2 on my Linux boxes:

The jobs execute without any complaints and are listed in the
succeeded list, but there is no output data besides the "_logs" directory.
The same code works with 0.17.2.1.
 

Here are some sections of the logs:

[logfile]
hadoop@bock:~/logs$ tail hadoop-hadoop-jobtracker-bock.log

2008-12-23 13:30:56,707 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing a data-local task task_200812231229_0031_m_000001 for
speculation

2008-12-23 13:30:56,707 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200812231229_0031_m_000001_1' to tip
task_200812231229_0031_m_000001, for tracker
'tracker_bock:localhost/127.0.0.1:15260'

2008-12-23 13:31:01,065 INFO org.apache.hadoop.mapred.JobInProgress:
Task 'attempt_200812231229_0031_m_000001_1' has completed
task_200812231229_0031_m_000001 successfully.

2008-12-23 13:31:03,177 INFO org.apache.hadoop.mapred.TaskRunner: Saved
output of task 'attempt_200812231229_0031_r_000000_0' to
hdfs://BOCK:9000/ana/oiprocessed/2008/12/23/Sen1/92a74190-2038-4c79-82c4-2de6fdc615db

[/logfile]

But the output folder contains only a "_logs" folder, which holds a
history file with the following content:

[logfile]

Job JOBID="job_200812231415_0001" FINISH_TIME="1230038377844"
JOB_STATUS="SUCCESS" FINISHED_MAPS="2" FINISHED_REDUCES="1"
FAILED_MAPS="0" FAILED_REDUCES="0" COUNTERS="Job Counters .Data-local
map tasks:2,Job Counters .Launched reduce tasks:1,Job Counters .Launched
map tasks:3,Map-Reduce Framework.Reduce input records:61,Map-Reduce
Framework.Map output records:61,Map-Reduce Framework.Map output
bytes:7194,Map-Reduce Framework.Combine output records:0,Map-Reduce
Framework.Map input records:61,Map-Reduce Framework.Reduce input
groups:12,Map-Reduce Framework.Combine input records:0,Map-Reduce
Framework.Map input bytes:36396,Map-Reduce Framework.Reduce output
records:12,File Systems.HDFS bytes written:1533,File Systems.Local bytes
written:14858,File Systems.HDFS bytes read:38679,File Systems.Local bytes
read:7388,com..ana.scheduling.HadoopTask$Counter.MAPPEED:61
"
[/logfile]

So what I see is that the job runs successfully and even reports that it
wrote data! ("Map-Reduce Framework.Reduce output records:12,File Systems.HDFS bytes written:1533")
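
For anyone who wants to check these two counters without reading the
history line by eye, here is a minimal sketch in Python. It assumes the
COUNTERS value is a comma-separated list of "Group.Name:value" items, as it
appears in the history file above, and that no counter name contains a
comma. The function name parse_counters is hypothetical and not part of
Hadoop.

```python
# COUNTERS value copied from the history file above, with the mail line
# wraps rejoined. The "com..ana" prefix is left exactly as it was posted.
COUNTERS = (
    "Job Counters .Data-local map tasks:2,"
    "Job Counters .Launched reduce tasks:1,"
    "Job Counters .Launched map tasks:3,"
    "Map-Reduce Framework.Reduce input records:61,"
    "Map-Reduce Framework.Map output records:61,"
    "Map-Reduce Framework.Map output bytes:7194,"
    "Map-Reduce Framework.Combine output records:0,"
    "Map-Reduce Framework.Map input records:61,"
    "Map-Reduce Framework.Reduce input groups:12,"
    "Map-Reduce Framework.Combine input records:0,"
    "Map-Reduce Framework.Map input bytes:36396,"
    "Map-Reduce Framework.Reduce output records:12,"
    "File Systems.HDFS bytes written:1533,"
    "File Systems.Local bytes written:14858,"
    "File Systems.HDFS bytes read:38679,"
    "File Systems.Local bytes read:7388,"
    "com..ana.scheduling.HadoopTask$Counter.MAPPEED:61"
)


def parse_counters(text):
    """Turn 'Group.Name:value,...' into a dict of {'Group.Name': int}.

    Assumption: items are separated by commas and the value follows the
    last colon of each item.
    """
    counters = {}
    for item in text.split(","):
        item = " ".join(item.split())  # collapse whitespace left by line wraps
        if not item:
            continue
        key, _, value = item.rpartition(":")
        counters[key] = int(value)
    return counters


counters = parse_counters(COUNTERS)
print(counters["Map-Reduce Framework.Reduce output records"])  # 12
print(counters["File Systems.HDFS bytes written"])             # 1533
```

Both values are non-zero, which is what makes the missing part file
surprising.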

If I run the same code with 0.17.2.1, or in local mode with 0.18.2, it works
and I get a part-0000 file with the expected data.
 

Please tell me if you need additional information.



  was:
I run into an critical issues with Hadoop 18.2 on my Linux boxes:

The jobs executes without any complains and they are listed in the
succeeded list but there is no output data beside the "_logs" directory.
The same code works with .17.2.1
 

Here are some sections of the logs:

[logfile]
hadoop@bock:~/logs$ tail hadoop-hadoop-jobtracker-bock.log

2008-12-23 13:30:56,707 INFO org.apache.hadoop.mapred.JobInProgress:
Choosing a data-local task task_200812231229_0031_m_000001 for
speculation

2008-12-23 13:30:56,707 INFO org.apache.hadoop.mapred.JobTracker: Adding
task 'attempt_200812231229_0031_m_000001_1' to tip
task_200812231229_0031_m_000001, for tracker
'tracker_bock:localhost/127.0.0.1:15260'

2008-12-23 13:31:01,065 INFO org.apache.hadoop.mapred.JobInProgress:
Task 'attempt_200812231229_0031_m_000001_1' has completed
task_200812231229_0031_m_000001 successfully.

2008-12-23 13:31:03,177 INFO org.apache.hadoop.mapred.TaskRunner: Saved
output of task 'attempt_200812231229_0031_r_000000_0' to
hdfs://BOCK:9000/ana/oiprocessed/2008/12/23/Sen1/92a74190-2038-4c79-82c4-2de6fdc615db

[/logfile]

But the folder contains only a "_logs" folder which has a history file
which contains:

[logfile]

Job JOBID="job_200812231415_0001" FINISH_TIME="1230038377844"
JOB_STATUS="SUCCESS" FINISHED_MAPS="2" FINISHED_REDUCES="1"
FAILED_MAPS="0" FAILED_REDUCES="0" COUNTERS="Job Counters .Data-local
map tasks:2,Job Counters .Launched reduce tasks:1,Job Counters .Launched
map tasks:3,Map-Reduce Framework.Reduce input records:61,Map-Reduce
Framework.Map output records:61,Map-Reduce Framework.Map output
bytes:7194,Map-Reduce Framework.Combine output records:0,Map-Reduce
Framework.Map input records:61,Map-Reduce Framework.Reduce input
groups:12,Map-Reduce Framework.Combine input records:0,Map-Reduce
Framework.Map input bytes:36396,Map-Reduce Framework.Reduce output
records:12,File Systems.HDFS bytes written:1533,File Systems.Local bytes
written:14858,File Systems.HDFS bytes read:38679,File Systems.Local
bytes
read:7388,com..ana.scheduling.HadoopTask$Counter.MAPPEED:61
"
[/logfile]

So what I see is that the system runs successful and it even says it
writes data! ("Map-Reduce Framework.Reduce output records:12,File Systems.HDFS bytes written:1533")

If I run the same code with .17.2.1 or in local mode with .18.2 it works
and I get a part-0000 file with the expected data.
 

Please tell me if you need additional information.




> Reduce output data is not written to disk
> -----------------------------------------
>
>                 Key: HADOOP-5084
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5084
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.18.2
>         Environment: Linux version 2.6.22-12-generic (buildd@vernadsky) (gcc version 4.1.3 20070831 (prerelease) (Ubuntu 4.1.2-16ubuntu1)) #1 SMP Sun Sep 23 18:11:30 GMT 2007
>                      running Hadoop 0.18.2 on two nodes
>            Reporter: Michael Fuchs
>            Priority: Critical

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

