hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1765) IOException on close: Unknown file
Date Thu, 23 Aug 2007 19:06:30 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522227
] 

stack commented on HADOOP-1765:
-------------------------------

Thanks Raghu.  I upped priority from minor to major since this issue results in data loss.
 What strikes me as odd is that the FSDirectory.addFile log happens at the end of the addBlocks
method.  At the head of this method it does the lookup of the associated node and succeeds.
 When the same addBlocks method is called out of the completeFile as part of the close, it
fails to find the node.

Here is another example of the failure done with an update to TRUNK -- turns out the above
log maybe a week or more old -- and a newly formatted filesystem:
{code}
2007-08-23 16:16:22,647 DEBUG org.apache.hadoop.dfs.StateChange: *DIR* NameNode.create: file
/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,UwSTZDqGkjRNQIYAPaNXRV==,5123329232858704969/repo/mapfiles/-1/data
for DFSClient_1851346025 at 208.76.44.142
2007-08-23 16:16:22,648 DEBUG org.apache.hadoop.dfs.StateChange: DIR* NameSystem.startFile:
file /bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,UwSTZDqGkjRNQIYAPaNXRV==,5123329232858704969/repo/mapfiles/-1/data
for DFSClient_1851346025 at 208.76.44.142
2007-08-23 16:16:22,648 DEBUG org.apache.hadoop.dfs.StateChange: DIR* NameSystem.startFile:
add /bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,UwSTZDqGkjRNQIYAPaNXRV==,5123329232858704969/repo/mapfiles/-1/data
to pendingCreates for DFSClient_1851346025
2007-08-23 16:16:22,653 DEBUG org.apache.hadoop.dfs.StateChange: DIR* FSDirectory.addFile:
/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,UwSTZDqGkjRNQIYAPaNXRV==,5123329232858704969/repo/mapfiles/-1/data
with 0 blocks is added to the file system
2007-08-23 16:16:24,254 DEBUG org.apache.hadoop.dfs.StateChange: *BLOCK* NameNode.addBlock:
file /bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,UwSTZDqGkjRNQIYAPaNXRV==,5123329232858704969/repo/mapfiles/-1/data
for DFSClient_1851346025
2007-08-23 16:16:24,255 DEBUG org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.getAdditionalBlock:
file /bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,UwSTZDqGkjRNQIYAPaNXRV==,5123329232858704969/repo/mapfiles/-1/data
for DFSClient_1851346025
2007-08-23 16:16:24,255 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.allocateBlock:
/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,UwSTZDqGkjRNQIYAPaNXRV==,5123329232858704969/repo/mapfiles/-1/data.
blk_-7923302926655854738 is created and added to pendingCreates and pendingCreateBlocks
2007-08-23 16:16:29,002 DEBUG org.apache.hadoop.dfs.StateChange: *DIR* NameNode.complete:
/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,UwSTZDqGkjRNQIYAPaNXRV==,5123329232858704969/repo/mapfiles/-1/data
for DFSClient_1851346025
2007-08-23 16:16:29,984 DEBUG org.apache.hadoop.dfs.StateChange: DIR* NameSystem.completeFile:
/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,UwSTZDqGkjRNQIYAPaNXRV==,5123329232858704969/repo/mapfiles/-1/data
for DFSClient_1851346025
2007-08-23 16:16:29,986 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call
complete(/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,UwSTZDqGkjRNQIYAPaNXRV==,5123329232858704969/repo/mapfiles/-1/data,
DFSClient_1851346025) from 208.76.44.142:36252: error: java.io.IOException: Unknown file:
/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,UwSTZDqGkjRNQIYAPaNXRV==,5123329232858704969/repo/mapfiles/-1/data
java.io.IOException: Unknown file: /bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,UwSTZDqGkjRNQIYAPaNXRV==,5123329232858704969/repo/mapfiles/-1/data2007-08-23
16:16:37,428 DEBUG org.apache.hadoop.dfs.StateChange: *DIR* NameNode.create: file /bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,UwSTZDqGkjRNQIYAPaNXRV==,5123329232858704969/repo/mapfiles/-1/data
for DFSClient_1851346025 at 208.76.44.142
2007-08-23 16:16:37,428 DEBUG org.apache.hadoop.dfs.StateChange: DIR* NameSystem.startFile:
file /bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,UwSTZDqGkjRNQIYAPaNXRV==,5123329232858704969/repo/mapfiles/-1/data
for DFSClient_1851346025 at 208.76.44.1422007-08-23 16:16:37,428 WARN org.apache.hadoop.dfs.StateChange:
DIR* NameSystem.startFile: failed to create file /bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,UwSTZDqGkjRNQIYAPaNXRV==,5123329232858704969/repo/mapfiles/-1/data
for DFSClient_1851346025 on client 208.76.44.142 because current leaseholder is trying to
recreate file.
{code}

Only lines related to the failing file are listed.

> IOException on close: Unknown file
> ----------------------------------
>
>                 Key: HADOOP-1765
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1765
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: stack
>         Attachments: namenode-excerpt.log.gz
>
>
> On occasion I get an IOE trying to close a MapFile I've just opened and filled.  Below
is the content of the namenode log.  The first thing in it after startup messages is the IOE
followed by other messages about '...current leaseholder is trying to recreate file'.  It
repeats for ever.  This is TRUNK r567876.
> I filed this issue at Raghu's suggestion.  Here's what he said on the list:
> .bq Yes, the earler message was also from Namenode log. Before that line there should
have been a message that adds a block to this file in the same log. I think you should file
a Jira for this. It probably related to HADOOP-999. Raghu. 
> {code}
> 2007-08-22 21:38:49,071 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node:
/default-rack/208.76.44.139:50010
> 2007-08-22 21:43:50,216 INFO org.apache.hadoop.fs.FSNamesystem: Roll Edit Log from 208.76.44.139
> 2007-08-22 21:45:21,459 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000,
call complete(/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,,8918388410463499185/repo/mapfiles/-1/data,
DFSClient_1857293290) from 208.76.44.139:52301: error: java.io.IOException: Unknown file:
/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,,8918388410463499185/repo/mapfiles/-1/data
> java.io.IOException: Unknown file: /bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,,8918388410463499185/repo/mapfiles/-1/data
>     at org.apache.hadoop.dfs.FSDirectory.addBlocks(FSDirectory.java:561)
>     at org.apache.hadoop.dfs.FSNamesystem.completeFileInternal(FSNamesystem.java:1002)
>     at org.apache.hadoop.dfs.FSNamesystem.completeFile(FSNamesystem.java:952)
>     at org.apache.hadoop.dfs.NameNode.complete(NameNode.java:348)
>     at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:340)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:566)
> 2007-08-22 21:45:24,394 WARN org.apache.hadoop.dfs.StateChange: DIR* NameSystem.startFile:
failed to create file /bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,,8918388410463499185/repo/mapfiles/-1/data
for DFSClient_1857293290 on client 208.76.44.139 because current leaseholder is trying to
recreate file.
> 2007-08-22 21:45:24,394 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000,
call create(/bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,,8918388410463499185/repo/mapfiles/-1/data,
DFSClient_1857293290, true, 3, 67108864) from 208.76.44.139:52312: error: org.apache.hadoop.dfs.AlreadyBeingCreatedException:
failed to create file /bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,,8918388410463499185/repo/mapfiles/-1/data
for DFSClient_1857293290 on client 208.76.44.139 because current leaseholder is trying to
recreate file.
> org.apache.hadoop.dfs.AlreadyBeingCreatedException: failed to create file /bfd/hadoop-stack-data/tmp/hbase/compaction.tmp/hregion_hbaserepository,,8918388410463499185/repo/mapfiles/-1/data
for DFSClient_1857293290 on client 208.76.44.139 because current leaseholder is trying to
recreate file.
>     at org.apache.hadoop.dfs.FSNamesystem.startFile(FSNamesystem.java:740)
>     at org.apache.hadoop.dfs.NameNode.create(NameNode.java:307)
>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:340)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:566)
> {code}
> Let me try and replicate with DEBUG level enabled.  Also, I think I know how to easily
replicate.   Need to do some tests.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message