hadoop-hdfs-issues mailing list archives

From "Robert Chansler (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HDFS-167) DFSClient continues to retry indefinitely
Date Fri, 09 Oct 2009 18:23:31 GMT

     [ https://issues.apache.org/jira/browse/HDFS-167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Chansler updated HDFS-167:
---------------------------------

     Description: 
I encountered a bug when trying to upload data using the Hadoop DFS Client.  
After receiving a NotReplicatedYetException, the DFSClient will normally retry its upload
up to some limited number of times.  In this case, I found that this retry loop continued
indefinitely, to the point that the number of tries remaining was negative:
2009-03-25 16:20:02 [INFO] 
2009-03-25 16:20:02 [INFO] 09/03/25 16:20:02 INFO hdfs.DFSClient: Waiting for replication for 21 seconds
2009-03-25 16:20:03 [INFO] 09/03/25 16:20:02 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /apollo/env/SummaryMySQL/var/logstore/fiorello_logs_20090325_us/logs_20090325_us_13 retries left -1


The stack trace for the failure that's retrying is:
2009-03-25 16:20:02 [INFO] 09/03/25 16:20:02 INFO hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.NotReplicatedYetException: Not replicated yet:<filename>
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1266)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
2009-03-25 16:20:02 [INFO]      at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
2009-03-25 16:20:02 [INFO]      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
2009-03-25 16:20:02 [INFO]      at java.lang.reflect.Method.invoke(Method.java:597)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)
2009-03-25 16:20:02 [INFO] 
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.ipc.Client.call(Client.java:697)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
2009-03-25 16:20:02 [INFO]      at $Proxy0.addBlock(Unknown Source)
2009-03-25 16:20:02 [INFO]      at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
2009-03-25 16:20:02 [INFO]      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
2009-03-25 16:20:02 [INFO]      at java.lang.reflect.Method.invoke(Method.java:597)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
2009-03-25 16:20:02 [INFO]      at $Proxy0.addBlock(Unknown Source)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2814)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2696)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1996)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)

Fixes logical error in DFSClient::DFSOutputStream::DataStreamer::locateFollowingBlock that
caused infinite retries on write. Modified DFSClient constructor to allow unit testing of
locateFollowingBlock and added unit tests. 
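
For readers following the issue, the intended behaviour (retry a bounded number of times, then surface the failure) can be sketched as below. This is only an illustration and not the code from the attached patches: the class BoundedRetry, the method callWithRetries and the exception NotReadyException are hypothetical names standing in for the DFSClient internals and for NotReplicatedYetException. The sketch assumes the retry budget is checked with "<= 0", so the counter can never pass zero and keep going, which is the symptom visible in the "retries left -1" log line above.

```java
import java.util.concurrent.Callable;

class BoundedRetry {
    /** Hypothetical stand-in for NotReplicatedYetException. */
    static class NotReadyException extends Exception {
        NotReadyException(String msg) { super(msg); }
    }

    /**
     * Calls op until it succeeds or the retry budget is used up.
     * At most (1 + maxRetries) attempts are made; the sleep doubles
     * after every failed attempt.
     */
    static <T> T callWithRetries(Callable<T> op, int maxRetries, long initialSleepMs)
            throws Exception {
        int retriesLeft = maxRetries;
        long sleepMs = initialSleepMs;
        while (true) {
            try {
                return op.call();
            } catch (NotReadyException e) {
                // "<= 0" rather than "== 0": even if the counter were ever
                // decremented twice, the loop would still terminate.
                if (retriesLeft <= 0) {
                    throw e;
                }
                retriesLeft--;
                Thread.sleep(sleepMs);
                sleepMs *= 2;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] attempts = {0};
        try {
            // An operation that never becomes ready.
            callWithRetries(() -> {
                attempts[0]++;
                throw new NotReadyException("not replicated yet");
            }, 5, 1);
        } catch (NotReadyException e) {
            System.out.println("gave up after " + attempts[0] + " attempts");
        }
    }
}
```

With a budget of 5 retries the operation is attempted 6 times in total and the last exception is then rethrown to the caller instead of being retried again.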


  was:
I encountered a bug when trying to upload data using the Hadoop DFS Client.  
After receiving a NotReplicatedYetException, the DFSClient will normally retry its upload
up to some limited number of times.  In this case, I found that this retry loop continued
indefinitely, to the point that the number of tries remaining was negative:
2009-03-25 16:20:02 [INFO] 
2009-03-25 16:20:02 [INFO] 09/03/25 16:20:02 INFO hdfs.DFSClient: Waiting for replication for 21 seconds
2009-03-25 16:20:03 [INFO] 09/03/25 16:20:02 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /apollo/env/SummaryMySQL/var/logstore/fiorello_logs_20090325_us/logs_20090325_us_13 retries left -1


The stack trace for the failure that's retrying is:
2009-03-25 16:20:02 [INFO] 09/03/25 16:20:02 INFO hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.NotReplicatedYetException: Not replicated yet:<filename>
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1266)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
2009-03-25 16:20:02 [INFO]      at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
2009-03-25 16:20:02 [INFO]      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
2009-03-25 16:20:02 [INFO]      at java.lang.reflect.Method.invoke(Method.java:597)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)
2009-03-25 16:20:02 [INFO] 
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.ipc.Client.call(Client.java:697)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
2009-03-25 16:20:02 [INFO]      at $Proxy0.addBlock(Unknown Source)
2009-03-25 16:20:02 [INFO]      at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
2009-03-25 16:20:02 [INFO]      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
2009-03-25 16:20:02 [INFO]      at java.lang.reflect.Method.invoke(Method.java:597)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
2009-03-25 16:20:02 [INFO]      at $Proxy0.addBlock(Unknown Source)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2814)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2696)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1996)
2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)


    Release Note:   (was: Fixes logical error in DFSClient::DFSOutputStream::DataStreamer::locateFollowingBlock
that caused infinite retries on write. Modified DFSClient constructor to allow unit testing
of locateFollowingBlock and added unit tests. )

Editorial pass over all release notes prior to publication of 0.21.

> DFSClient continues to retry indefinitely
> -----------------------------------------
>
>                 Key: HDFS-167
>                 URL: https://issues.apache.org/jira/browse/HDFS-167
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>            Reporter: Derek Wollenstein
>            Assignee: Bill Zeller
>            Priority: Minor
>             Fix For: 0.20.1, 0.21.0
>
>         Attachments: hdfs-167-4.patch, hdfs-167-5.patch, hdfs-167-6.patch, hdfs-167-for-20-1.patch
>
>
> I encountered a bug when trying to upload data using the Hadoop DFS Client.  
> After receiving a NotReplicatedYetException, the DFSClient will normally retry its upload
> up to some limited number of times.  In this case, I found that this retry loop continued
> indefinitely, to the point that the number of tries remaining was negative:
> 2009-03-25 16:20:02 [INFO] 
> 2009-03-25 16:20:02 [INFO] 09/03/25 16:20:02 INFO hdfs.DFSClient: Waiting for replication for 21 seconds
> 2009-03-25 16:20:03 [INFO] 09/03/25 16:20:02 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /apollo/env/SummaryMySQL/var/logstore/fiorello_logs_20090325_us/logs_20090325_us_13 retries left -1
> The stack trace for the failure that's retrying is:
> 2009-03-25 16:20:02 [INFO] 09/03/25 16:20:02 INFO hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.NotReplicatedYetException: Not replicated yet:<filename>
> 2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1266)
> 2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
> 2009-03-25 16:20:02 [INFO]      at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
> 2009-03-25 16:20:02 [INFO]      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 2009-03-25 16:20:02 [INFO]      at java.lang.reflect.Method.invoke(Method.java:597)
> 2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
> 2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)
> 2009-03-25 16:20:02 [INFO] 
> 2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.ipc.Client.call(Client.java:697)
> 2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
> 2009-03-25 16:20:02 [INFO]      at $Proxy0.addBlock(Unknown Source)
> 2009-03-25 16:20:02 [INFO]      at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
> 2009-03-25 16:20:02 [INFO]      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 2009-03-25 16:20:02 [INFO]      at java.lang.reflect.Method.invoke(Method.java:597)
> 2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
> 2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
> 2009-03-25 16:20:02 [INFO]      at $Proxy0.addBlock(Unknown Source)
> 2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2814)
> 2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2696)
> 2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1996)
> 2009-03-25 16:20:02 [INFO]      at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)
> Fixes logical error in DFSClient::DFSOutputStream::DataStreamer::locateFollowingBlock
> that caused infinite retries on write. Modified DFSClient constructor to allow unit testing
> of locateFollowingBlock and added unit tests.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

