hadoop-common-dev mailing list archives

From "Brian Bockelman (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4351) ArrayIndexOutOfBoundsException during fsck
Date Tue, 07 Oct 2008 23:16:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637696#action_12637696 ]

Brian Bockelman commented on HADOOP-4351:
-----------------------------------------

Hey Hairong,

Unfortunately, the admins moved the namenode to a more permanent home and clobbered the existing
logfiles in the process.  (And I fixed the corrupt blocks: I didn't want to live with a problematic
file system for too long!)

The code printed out the replicas in the blocksMap (nodes 10, 145, 117) and the corrupt entries
in the corruptReplicas variable (node 16).  The existing code calculates that 3 - 1 = 2 replicas
must be good (this is a mistake, as corruptReplicas is not a subset of blocksMap); however,
when it starts to populate the machineSet, it gets only as far as nodes 10 and 145, then throws
the exception on 117.
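
To make the arithmetic above concrete, here is a minimal sketch of the failure mode. It is not
the attached patch and not the real FSNamesystem code: the class, the method names, and the use
of plain integers for datanodes are all hypothetical, chosen only to mirror the numbers quoted
above (nodes 10, 145, 117 in blocksMap, node 16 in corruptReplicas).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the replica bookkeeping; not the real FSNamesystem code.
class MachineSetSketch {

    // Sizes the result as (replicas in blocksMap) - (corrupt replicas), which is
    // only valid if every corrupt replica is also listed in blocksMap.
    static int[] buggyMachineSet(List<Integer> blocksMapNodes, List<Integer> corruptNodes) {
        int[] machineSet = new int[blocksMapNodes.size() - corruptNodes.size()];
        int next = 0;
        for (int node : blocksMapNodes) {
            if (!corruptNodes.contains(node)) {
                machineSet[next++] = node; // overflows once the assumption fails
            }
        }
        return machineSet;
    }

    // Collects the non-corrupt nodes first, so the array size never depends on
    // the subset assumption.
    static int[] safeMachineSet(List<Integer> blocksMapNodes, List<Integer> corruptNodes) {
        List<Integer> good = new ArrayList<>();
        for (int node : blocksMapNodes) {
            if (!corruptNodes.contains(node)) {
                good.add(node);
            }
        }
        int[] machineSet = new int[good.size()];
        for (int i = 0; i < machineSet.length; i++) {
            machineSet[i] = good.get(i);
        }
        return machineSet;
    }

    public static void main(String[] args) {
        List<Integer> blocksMapNodes = Arrays.asList(10, 145, 117);
        List<Integer> corruptNodes = Arrays.asList(16);
        System.out.println(Arrays.toString(safeMachineSet(blocksMapNodes, corruptNodes)));
        try {
            buggyMachineSet(blocksMapNodes, corruptNodes);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("buggy sizing failed on the third replica");
        }
    }
}
```

With these inputs the buggy variant allocates two slots, fills them with 10 and 145, and fails
when it tries to store 117, which matches the behaviour described above. Counting only the
replicas actually present in both structures is one possible way to avoid the crash; whether
that is the right fix depends on the underlying inconsistency.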

You're right - there's possibly an underlying problem here, and this patch only addresses a
symptom of it.  However, it is still a useful thing to fix: it is rather painful to go through
filesystem cleanup when fsck dies instantly.

I believe the log message was something like this one:

2008-10-07 05:04:24,021 INFO org.apache.hadoop.dfs.StateChange: BLOCK NameSystem.markBlockAsCorrupt:
block blk_-5420894356244363410_2169 could not be marked as corrupt as it does not exists in
blocksMap

I think if a data node reports it has a corrupt block, then it gets added to the corrupt map
and not the blocksMap.  I took a peek at the underlying issue, and wasn't able to make much
progress - an expert on FSNamesystem will be needed to find the underlying problem.

Brian

PS: I looked at the patch again - silly me, I leaked several unrelated changes of mine into this
patch; please disregard any changes to files that are NOT FSNamesystem.java.

> ArrayIndexOutOfBoundsException during fsck
> ------------------------------------------
>
>                 Key: HADOOP-4351
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4351
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.18.1
>            Reporter: Brian Bockelman
>         Attachments: fsck_hadoop_4351.patch
>
>
> After observing a lot of corrupted blocks, I suddenly started to get a lot of ArrayIndexOutOfBoundsException.
> It appears to be an issue very similar to HADOOP-3649, which is supposed to be fixed in 0.18.1.
> 2008-10-06 08:48:43,241 WARN /: /fsck?path=%2F:
> java.lang.ArrayIndexOutOfBoundsException: 2
>    at org.apache.hadoop.dfs.FSNamesystem.getBlockLocationsInternal(FSNamesystem.java:789)
>    at org.apache.hadoop.dfs.FSNamesystem.getBlockLocations(FSNamesystem.java:727)
>    at org.apache.hadoop.dfs.NamenodeFsck.check(NamenodeFsck.java:167)
>    at org.apache.hadoop.dfs.NamenodeFsck.check(NamenodeFsck.java:162)
>    at org.apache.hadoop.dfs.NamenodeFsck.check(NamenodeFsck.java:162)
>    at org.apache.hadoop.dfs.NamenodeFsck.check(NamenodeFsck.java:162)
>    at org.apache.hadoop.dfs.NamenodeFsck.check(NamenodeFsck.java:162)
>    at org.apache.hadoop.dfs.NamenodeFsck.fsck(NamenodeFsck.java:128)
>    at org.apache.hadoop.dfs.FsckServlet.doGet(FsckServlet.java:48)
>    at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
>    at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
>    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
>    at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
>    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
>    at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
>    at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
>    at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
>    at org.mortbay.http.HttpServer.service(HttpServer.java:954)
>    at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
>    at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
>    at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
>    at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
>    at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
>    at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

