hadoop-common-dev mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2050) distcp failed due to problem in creating files
Date Tue, 16 Oct 2007 08:26:50 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535191 ]

Arun C Murthy commented on HADOOP-2050:

bq. After a mapper is killed for failing to report progress, a new attempt may be scheduled
shortly afterwards, before the dfs lease held on the destination file by the failed mapper
has expired. When the new attempt tries to create the destination file, an exception is thrown.
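The race quoted above can be illustrated with a minimal local-filesystem sketch (a stand-in for DFS; the file and directory names are illustrative, not taken from the job): the first attempt creates the destination and is killed, and the retry's create of the same path is rejected, much as the namenode rejects the second DFSClient while the first attempt's lease is still unexpired.

```java
import java.io.IOException;
import java.nio.file.*;

// Local-filesystem analogy of the distcp failure: two task attempts trying
// to create the same destination file. The second create fails, analogous
// to AlreadyBeingCreatedException from the namenode.
public class CreateCollision {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("distcp-dest");
        Path dest = dir.resolve("part-00007");

        Files.createFile(dest);       // attempt 1 creates the destination, then is killed
        try {
            Files.createFile(dest);   // attempt 2 retries the same path...
        } catch (FileAlreadyExistsException e) {
            // ...and is rejected because the file already exists.
            System.out.println("second create rejected: "
                    + e.getFile().endsWith("part-00007"));
        }
    }
}
```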

Essentially it's the same issue we solved for speculative tasks with HADOOP-1127. (http://wiki.apache.org/lucene-hadoop/FAQ#9)

Basically, things should work if ${mapred.output.dir} is set to "/" and the map task writes
the file out to ${mapred.output.dir}, which is magically set to /_{taskid}; the files are
promoted later.
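The side-file scheme described above can be sketched on the local filesystem (a stand-in for DFS; attempt ids and paths are illustrative): each attempt writes under its own _{taskid} directory so a retry never collides with a failed attempt's half-written file, and only the successful attempt's output is promoted into the real output directory.

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.Comparator;

// Sketch of per-attempt side files and output promotion (the HADOOP-1127
// scheme), using local directories in place of ${mapred.output.dir}.
public class SideFilePromotion {
    public static void main(String[] args) throws IOException {
        Path outputDir = Files.createTempDirectory("mapred-output");

        // Each attempt writes under its own temporary _{taskid} directory.
        Path attempt1 = Files.createDirectories(outputDir.resolve("_task_m_000456_1"));
        Path attempt2 = Files.createDirectories(outputDir.resolve("_task_m_000456_2"));
        Files.writeString(attempt1.resolve("part-00007"), "stale data from killed attempt");
        Files.writeString(attempt2.resolve("part-00007"), "data from successful attempt");

        // On commit, only the successful attempt's file is promoted (renamed)
        // into the real output directory; the failed attempt's dir is discarded.
        Files.move(attempt2.resolve("part-00007"), outputDir.resolve("part-00007"));
        deleteRecursively(attempt1);
        deleteRecursively(attempt2);

        System.out.println(Files.readString(outputDir.resolve("part-00007")));
    }

    // Delete a directory tree, children before parents.
    static void deleteRecursively(Path dir) throws IOException {
        try (var paths = Files.walk(dir)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
        }
    }
}
```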

Clearly this will no longer work once we have dfs permissions, and then the way forward may be
to run distcp as root. Thoughts?

> distcp failed due to problem in creating files
> ----------------------------------------------
>                 Key: HADOOP-2050
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2050
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.15.0
>            Reporter: Runping Qi
> When I run a distcp program to copy files from one dfs to another, my job failed with
> the mappers throwing the following exception:
> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.dfs.AlreadyBeingCreatedException:
> failed to create file /xxxxx/part-00007 for DFSClient_task_200710122302_0002_m_000456_2 on
> client because current leaseholder is trying to recreate file.
> 	at org.apache.hadoop.dfs.FSNamesystem.startFileInternal(FSNamesystem.java:850)
> 	at org.apache.hadoop.dfs.FSNamesystem.startFile(FSNamesystem.java:806)
> 	at org.apache.hadoop.dfs.NameNode.create(NameNode.java:333)
> 	at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:482)
> 	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
> 	at org.apache.hadoop.dfs.$Proxy1.create(Unknown Source)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
> 	at org.apache.hadoop.dfs.$Proxy1.create(Unknown Source)
> 	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:1432)
> 	at org.apache.hadoop.dfs.DFSClient.create(DFSClient.java:376)
> 	at org.apache.hadoop.dfs.DistributedFileSystem.create(DistributedFileSystem.java:121)
> 	at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.copy(CopyFiles.java:284)
> 	at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:352)
> 	at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:217)
> 	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:195)
> 	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1750)
> It seems that this problem happened in the 2nd, 3rd, and 4th attempts,
> after the first attempt failed.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
