hadoop-mapreduce-user mailing list archives

From Koji Noguchi <knogu...@yahoo-inc.com>
Subject Re: hadoop-1.0.0 and errors with log.index
Date Tue, 31 Jan 2012 18:59:35 GMT
On our cluster, it usually happens when the JVM crashes due to invalid JVM
params, or when JNI crashes at the init phase.

stderr/stdout files are created but log.index does not exist when this
happens.
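A quick way to spot affected attempts is to scan the userlogs tree for attempt directories that have stdout/stderr but no log.index. This is a sketch, not part of Hadoop itself; the `LOGS` path is an assumption and should point at your installation's userlogs directory:

```shell
#!/bin/sh
# Hypothetical check: list attempt dirs where the task JVM wrote
# stdout but never got far enough to write log.index.
# LOGS is an assumption -- adjust for your installation.
LOGS=${LOGS:-/opt/hadoop/hadoop-0.20.205.0/logs/userlogs}
for d in "$LOGS"/job_*/attempt_*; do
  [ -d "$d" ] || continue
  if [ -e "$d/stdout" ] && [ ! -e "$d/log.index" ]; then
    echo "missing log.index: $d"
  fi
done
```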

We should fix this.

Koji



On 1/31/12 10:49 AM, "Markus Jelsma" <markus.jelsma@openindex.io> wrote:

> Yes, the stacktrace in my previous message is from the task tracker. It seems
> to happen when there is no data locality for the mapper and it needs to get it
> from some other datanode. The number of failures is the same as the number of
> rack-local mappers.
> 
>> Anything in TaskTracker logs ?
>> 
>> On Jan 31, 2012, at 10:18 AM, Markus Jelsma wrote:
>>> In our case, which seems to be the same problem, the web UI does not show
>>> anything useful except the first line of the stack trace:
>>> 
>>> 2012-01-03 21:16:27,256 WARN org.apache.hadoop.mapred.TaskLog: Failed to
>>> retrieve stdout log for task: attempt_201201031651_0008_m_000233_0
>>> 
>>> Only the task tracker log shows a full stack trace. This happened on
>>> 1.0.0 and 0.20.205.0 but not 0.20.203.0.
>>> 
>>> 2012-01-03 21:16:27,256 WARN org.apache.hadoop.mapred.TaskLog: Failed to
>>> retrieve stdout log for task: attempt_201201031651_0008_m_000233_0
>>> java.io.FileNotFoundException: /opt/hadoop/hadoop-0.20.205.0/libexec/../logs/userlogs/job_201201031651_0008/attempt_201201031651_0008_m_000233_0/log.index (No such file or directory)
>>> at java.io.FileInputStream.open(Native Method)
>>> at java.io.FileInputStream.<init>(SecureIOUtils.java:102)
>>> at org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:187)
>>> at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLogServlet.java:81)
>>> at org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
>>> at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
>>> at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>>> at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
>>> at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
>>> at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
>>> at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>>> at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>>> at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>>> at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>>> at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>>> at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
>>> at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>>> at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>>> at org.mortbay.jetty.Server.handle(Server.java:326)
>>> at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
>>> at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
>>> 
>>>> Actually, all that is telling you is that the task failed and the
>>>> job-client couldn't display the logs.
>>>> 
>>>> Can you check the JT web-ui and see why the task failed ?
>>>> 
>>>> If you don't see anything there, you can try to see the TaskTracker logs
>>>> on the node on which the task ran.
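On the node that ran the failed attempt, something like the following pulls the relevant TaskTracker log lines. This is a sketch; the log file path pattern is an assumption and will differ per installation:

```shell
# Hypothetical: grep the TaskTracker log for the failing attempt id
# (taken from the job output). Log path pattern is an assumption --
# adjust for your installation and hostname.
ATTEMPT=attempt_201201311241_0003_r_000000_1
grep -A 20 "$ATTEMPT" /usr/lib/hadoop-1.0.0/logs/hadoop-*-tasktracker-*.log
```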
>>>> 
>>>> Arun
>>>> 
>>>> On Jan 31, 2012, at 3:21 AM, Marcin Cylke wrote:
>>>>> Hi
>>>>> 
>>>>> I've upgraded my hadoop cluster to version 1.0.0. The upgrade process
>>>>> went relatively smoothly but it rendered the cluster inoperable due to
>>>>> errors in the jobtrackers' operation:
>>>>> 
>>>>> # in job output
>>>>> Error reading task output
>>>>> http://hadoop4:50060/tasklog?plaintext=true&attemptid=attempt_201201311241_0003_m_000004_2&filter=stdout
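The tasklog URL in that error can be fetched directly to see what the servlet returns. A sketch; host, port, and attempt id are taken from the error message above:

```shell
# Hypothetical: rebuild the tasklog URL from its parts so the attempt
# id and filter are easy to swap, then fetch it directly. Values come
# from the error message in the job output.
TT=hadoop4:50060
ATTEMPT=attempt_201201311241_0003_m_000004_2
URL="http://$TT/tasklog?plaintext=true&attemptid=$ATTEMPT&filter=stdout"
curl -s "$URL"
```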
>>>>> 
>>>>> # in each of the jobtrackers' logs
>>>>> WARN org.apache.hadoop.mapred.TaskLog: Failed to retrieve stderr log
>>>>> for task: attempt_201201311241_0003_r_000000_1
>>>>> java.io.FileNotFoundException: /usr/lib/hadoop-1.0.0/libexec/../logs/userlogs/job_201201311241_0003/attempt_201201311241_0003_r_000000_1/log.index (No such file or directory)
>>>>> 
>>>>>        at java.io.FileInputStream.open(Native Method)
>>>>> 
>>>>> These errors seem related to these two problems:
>>>>> 
>>>>> http://grokbase.com/t/hadoop.apache.org/mapreduce-user/2012/01/error-reading-task-output-and-log-filenotfoundexceptions/03mjwctewcnxlgp2jkcrhvsgep4e
>>>>> 
>>>>> https://issues.apache.org/jira/browse/MAPREDUCE-2846
>>>>> 
>>>>> But I've looked into the source code and the fix from MAPREDUCE-2846 is
>>>>> there. Perhaps there is some other reason?
>>>>> 
>>>>> Regards
>>>>> Marcin
>>>> 
>>>> --
>>>> Arun C. Murthy
>>>> Hortonworks Inc.
>>>> http://hortonworks.com/
>> 
>> --
>> Arun C. Murthy
>> Hortonworks Inc.
>> http://hortonworks.com/

