hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Runping Qi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2050) distcp failed due to problem in creating files
Date Mon, 15 Oct 2007 00:47:50 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534716
] 

Runping Qi commented on HADOOP-2050:
------------------------------------

It turned out to be a problem in CopyFile class.
After a mapper got killed  due to failing to report  progress,
a new attempt may be scheduled shortly, before the dfs lease hold
on the destination file by the failed mapper got expired. 
When the new attempt tries to create 
the destination file, an exception is thrown.

CopyFile should handle that exception and retry after sleeping for a short while.



> distcp failed due to problem in creating files
> ----------------------------------------------
>
>                 Key: HADOOP-2050
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2050
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.15.0
>            Reporter: Runping Qi
>
> When I run a distcp program to copy files from one dfs to another, my job failed with
> the mappers throwing the following exception:
> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.dfs.AlreadyBeingCreatedException:
failed to create file /xxxxx/part-00007 for DFSClient_task_200710122302_0002_m_000456_2 on
client 72.30.43.23 because current leaseholder is trying to recreate file.
> 	at org.apache.hadoop.dfs.FSNamesystem.startFileInternal(FSNamesystem.java:850)
> 	at org.apache.hadoop.dfs.FSNamesystem.startFile(FSNamesystem.java:806)
> 	at org.apache.hadoop.dfs.NameNode.create(NameNode.java:333)
> 	at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:482)
> 	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
> 	at org.apache.hadoop.dfs.$Proxy1.create(Unknown Source)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
> 	at org.apache.hadoop.dfs.$Proxy1.create(Unknown Source)
> 	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.(DFSClient.java:1432)
> 	at org.apache.hadoop.dfs.DFSClient.create(DFSClient.java:376)
> 	at org.apache.hadoop.dfs.DistributedFileSystem.create(DistributedFileSystem.java:121)
> 	at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.copy(CopyFiles.java:284)
> 	at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:352)
> 	at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:217)
> 	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:195)
> 	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1750)
> It seems that this problem happened  in the 2nd, 3rd, 4th attempts,
> after the first attemp failed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message