hadoop-common-user mailing list archives

From Mahesh Balija <balijamahesh....@gmail.com>
Subject Re: I am running MapReduce on a 30G data on 1master/2 slave, but failed.
Date Fri, 11 Jan 2013 05:11:43 GMT
Hi,

          Two reducers completed successfully and 1498 were killed, so I
suspect a problem with the data: either it is too large, or some of the
records you are trying to process are causing trouble.
          One possibility is that a very large number of values are associated
with a single key. Depending on what your reducer does with those values,
that can cause this kind of failure.
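
One rough way to check for that kind of skew is to count values per key over a sample of the mapper's output. The sketch below assumes tab-separated key/value lines, which the thread does not confirm, and the function names are made up for illustration; they are not part of Hadoop.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def values_per_key(lines: Iterable[str], sep: str = "\t") -> Counter:
    """Count how many values each key has in key<sep>value lines."""
    counts: Counter = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines in the sample
        key = line.split(sep, 1)[0]
        counts[key] += 1
    return counts


def heaviest_keys(counts: Counter, top: int = 5) -> List[Tuple[str, int, float]]:
    """Return (key, value count, share of all records) for the busiest keys."""
    total = sum(counts.values())
    if total == 0:
        return []
    return [(key, n, n / total) for key, n in counts.most_common(top)]
```

If one key turns out to hold a large share of the records, the reducer that receives it does most of the work while the others sit idle, which would fit a job where only a few reduce tasks ever finish.
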
          Can you add some logging to your reducer and trace what is
happening?
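
As a rough sketch of that kind of tracing, the reducer below logs how many values each key has and how long the key took. It assumes the job were written as a Hadoop Streaming script in Python, which the thread does not say; in a Java reducer the corresponding calls would be `setStatus()` and `progress()` on the `Reporter` or the task context. Hadoop Streaming treats stderr lines of the form `reporter:status:<message>` as status updates, so the same loop also keeps the task reporting while it works through a long run of values. The function name and the per-key counting are placeholders for illustration only.

```python
import sys
import time
from itertools import groupby
from typing import Iterable, Iterator, Tuple


def parse(lines: Iterable[str], sep: str = "\t") -> Iterator[Tuple[str, str]]:
    """Split each key<sep>value line into a (key, value) pair."""
    for line in lines:
        key, _, value = line.rstrip("\n").partition(sep)
        yield key, value


def reduce_with_heartbeat(lines, out=sys.stdout, err=sys.stderr,
                          every=100000, clock=time.time):
    """Placeholder reducer: counts values per key, logs per-key statistics,
    and writes a status line to stderr every `every` records.

    Input must already be sorted by key, as reducer input is in Hadoop.
    """
    seen = 0
    for key, group in groupby(parse(lines), key=lambda kv: kv[0]):
        started = clock()
        n = 0
        for _key, _value in group:
            # The real per-value work of the reducer would go here.
            n += 1
            seen += 1
            if seen % every == 0:
                err.write(f"reporter:status:processed {seen} records, "
                          f"current key={key}\n")
                err.flush()
        out.write(f"{key}\t{n}\n")
        # Plain stderr lines end up in the task's stderr log.
        err.write(f"key={key} values={n} seconds={clock() - started:.1f}\n")
```

The "failed to report status for 600 seconds" messages in the output quoted below match the default task timeout (`mapred.task.timeout`, 600000 ms). A reducer that spends longer than that on one key without reporting anything is killed, so the per-key log lines should show which key, if any, is responsible.
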

Best,
Mahesh Balija,
Calsoft Labs.

On Fri, Jan 11, 2013 at 8:53 AM, yaotian <yaotian@gmail.com> wrote:

> I have 1 Hadoop master, where the NameNode runs, and 2 slaves, where the
> DataNodes run.
>
> If I choose a small data set, such as 200M, the job completes.
>
> But if I run it on 30G of data, the map phase finishes and the reduce phase
> reports errors. Any suggestions?
>
>
> This is the information.
>
> *Black-listed TaskTrackers:* 1<http://23.20.27.135:9003/jobblacklistedtrackers.jsp?jobid=job_201301090834_0041>
> ------------------------------
> Kind    % Complete  Num Tasks  Pending  Running  Complete  Killed  Failed/Killed Task Attempts
> map     100.00%     450        0        0        450       0       0 / 1
> reduce  100.00%     1500       0        0        2         1498    12 / 3
>
>
> Task: task_201301090834_0041_r_000001
> Complete: 0.00%
> Start Time: 10-Jan-2013 04:18:54
> Finish Time: 10-Jan-2013 06:46:38 (2hrs, 27mins, 44sec)
> Errors:
> Task attempt_201301090834_0041_r_000001_0 failed to report status for 600 seconds. Killing!
> Task attempt_201301090834_0041_r_000001_1 failed to report status for 602 seconds. Killing!
> Task attempt_201301090834_0041_r_000001_2 failed to report status for 602 seconds. Killing!
> Task attempt_201301090834_0041_r_000001_3 failed to report status for 602 seconds. Killing!
> Counters: 0
>
> Task: task_201301090834_0041_r_000002
> Complete: 0.00%
> Start Time: 10-Jan-2013 04:18:54
> Finish Time: 10-Jan-2013 06:46:38 (2hrs, 27mins, 43sec)
> Errors:
> Task attempt_201301090834_0041_r_000002_0 failed to report status for 601 seconds. Killing!
> Task attempt_201301090834_0041_r_000002_1 failed to report status for 600 seconds. Killing!
> Counters: 0
>
> Task: task_201301090834_0041_r_000003
> Complete: 0.00%
> Start Time: 10-Jan-2013 04:18:57
> Finish Time: 10-Jan-2013 06:46:38 (2hrs, 27mins, 41sec)
> Errors:
> Task attempt_201301090834_0041_r_000003_0 failed to report status for 602 seconds. Killing!
> Task attempt_201301090834_0041_r_000003_1 failed to report status for 602 seconds. Killing!
> Task attempt_201301090834_0041_r_000003_2 failed to report status for 602 seconds. Killing!
> Counters: 0
>
> Task: task_201301090834_0041_r_000005
> Complete: 0.00%
> Start Time: 10-Jan-2013 06:11:07
> Finish Time: 10-Jan-2013 06:46:38 (35mins, 31sec)
> Errors:
> Task attempt_201301090834_0041_r_000005_0 failed to report status for 600 seconds. Killing!
> Counters: 0
>
