hadoop-common-dev mailing list archives

From "Doug Judd (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4379) In HDFS, sync() not yet guarantees data available to the new readers
Date Wed, 28 Jan 2009 00:15:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667881#action_12667881 ]

Doug Judd commented on HADOOP-4379:
-----------------------------------

Now when I apply your latest patch and the one from HADOOP-5027 to the 0.19.0 source, the datanodes
seem to go into an infinite loop of NullPointerExceptions.  At the top of hadoop-zvents-datanode-motherlode007.admin.zvents.com.log
I'm seeing this:

[doug@motherlode007 logs]$ more hadoop-zvents-datanode-motherlode007.admin.zvents.com.log

2009-01-27 15:32:55,828 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:

/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = motherlode007.admin.zvents.com/10.0.30.114
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.19.1-dev
STARTUP_MSG:   build =  -r ; compiled by 'doug' on Tue Jan 27 15:04:06 PST 2009
************************************************************/
2009-01-27 15:32:57,041 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: motherlode000/10.0.30.100:9000. Already tried 0 time(s).
2009-01-27 15:33:00,505 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
2009-01-27 15:33:00,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at 50010
2009-01-27 15:33:00,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 1048576 bytes/s
2009-01-27 15:33:00,783 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4
2009-01-27 15:33:00,792 INFO org.mortbay.util.Credential: Checking Resource aliases
2009-01-27 15:33:01,839 INFO org.mortbay.util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@319c0bd6
2009-01-27 15:33:01,878 INFO org.mortbay.util.Container: Started WebApplicationContext[/static,/static]
2009-01-27 15:33:02,048 INFO org.mortbay.util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@5a943dc4
2009-01-27 15:33:02,049 INFO org.mortbay.util.Container: Started WebApplicationContext[/logs,/logs]
2009-01-27 15:33:02,754 INFO org.mortbay.util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@6d581e80
2009-01-27 15:33:02,760 INFO org.mortbay.util.Container: Started WebApplicationContext[/,/]
2009-01-27 15:33:02,763 INFO org.mortbay.http.SocketListener: Started SocketListener on 0.0.0.0:50075
2009-01-27 15:33:02,764 INFO org.mortbay.util.Container: Started org.mortbay.jetty.Server@5c435a3a
2009-01-27 15:33:02,769 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=DataNode, sessionId=null
2009-01-27 15:33:02,825 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=DataNode, port=50020
2009-01-27 15:33:02,831 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2009-01-27 15:33:02,834 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: starting
2009-01-27 15:33:02,834 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2009-01-27 15:33:02,836 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(motherlode007.admin.zvents.com:50010, storageID=DS-745224472-10.0.30.114-50010-1230665635246, infoPort=50075, ipcPort=50020)
2009-01-27 15:33:02,837 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: starting
2009-01-27 15:33:02,837 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020: starting
2009-01-27 15:33:02,839 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.0.30.114:50010, storageID=DS-745224472-10.0.30.114-50010-1230665635246, infoPort=50075, ipcPort=50020)In DataNode.run, data = FSDataset{dirpath='/data1/hadoop/dfs/data/current,/data2/hadoop/dfs/data/current,/data3/hadoop/dfs/data/current'}
2009-01-27 15:33:02,840 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: using BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 0msec
2009-01-27 15:33:02,932 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.NullPointerException
	at org.apache.hadoop.hdfs.server.namenode.DatanodeDescriptor.reportDiff(DatanodeDescriptor.java:396)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processReport(FSNamesystem.java:2803)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReport(NameNode.java:636)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:892)

	at org.apache.hadoop.ipc.Client.call(Client.java:696)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
	at $Proxy4.blockReport(Unknown Source)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:723)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1100)
	at java.lang.Thread.run(Thread.java:619)

2009-01-27 15:33:02,953 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.NullPointerException
	at org.apache.hadoop.hdfs.server.namenode.DatanodeDescriptor.reportDiff(DatanodeDescriptor.java:396)
[...]


And the log file is growing rapidly, with the following exception repeatedly tacked onto the end:

2009-01-27 16:10:55,973 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.NullPointerException

	at org.apache.hadoop.ipc.Client.call(Client.java:696)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
	at $Proxy4.blockReport(Unknown Source)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:723)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1100)
	at java.lang.Thread.run(Thread.java:619)


It appears that these exceptions started within about 5 seconds of startup, so it doesn't look
like this has anything to do with Hypertable.  Is it OK to apply these patches to the 0.19.0
source, or should I be applying them to trunk?
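
In case it helps with the diagnosis, the repeating WARN is consistent with the pattern sketched
below.  The names here are invented for illustration and this is not the actual
DatanodeDescriptor.reportDiff code; the point is just that a block-map lookup returning null for
a reported block, dereferenced without a check, would throw on every periodic block report and
look exactly like this loop:

    // Illustrative sketch only -- invented names, NOT the real
    // DatanodeDescriptor.reportDiff code.
    import java.util.HashMap;
    import java.util.Map;

    public class ReportDiffNpeSketch {

        static class BlockInfo {
            final long numBytes;
            BlockInfo(long numBytes) { this.numBytes = numBytes; }
        }

        // Stand-in for the namenode's map of blocks it knows about.
        static final Map<Long, BlockInfo> blocksMap =
            new HashMap<Long, BlockInfo>();

        // A diff step that assumes every reported block is already known.
        static long diff(long reportedBlockId) {
            BlockInfo stored = blocksMap.get(reportedBlockId);
            // If a datanode reports a block the map has no entry for,
            // 'stored' is null and the dereference below throws
            // NullPointerException.  Block reports are periodic, so the
            // same exception recurs on every report -- an apparent loop.
            return stored.numBytes;
        }

        public static void main(String[] args) {
            blocksMap.put(1L, new BlockInfo(64L * 1024 * 1024));
            System.out.println(diff(1L));  // fine: block 1 is known
            System.out.println(diff(42L)); // NPE: block 42 was never registered
        }
    }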

- Doug


> In HDFS, sync() not yet guarantees data available to the new readers
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4379
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4379
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: dhruba borthakur
>             Fix For: 0.19.1
>
>         Attachments: 4379_20081010TC3.java, fsyncConcurrentReaders.txt, fsyncConcurrentReaders3.patch, fsyncConcurrentReaders4.patch, Reader.java, Reader.java, Writer.java, Writer.java
>
>
> In the append design doc (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it says
> * A reader is guaranteed to be able to read data that was 'flushed' before the reader opened the file
> However, this feature is not yet implemented.  Note that the operation 'flushed' is now called "sync".
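
For anyone following along, the guarantee quoted above corresponds to the client-side pattern
below.  This is a minimal sketch against the 0.19-era FileSystem API, not code from the patch;
the path and payload are arbitrary:

    // Minimal sketch of the sync()-then-new-reader pattern under discussion.
    // Assumes a running 0.19-era HDFS; path and payload are arbitrary.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SyncVisibilitySketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Path p = new Path("/tmp/sync-visibility-test"); // hypothetical path

            // Writer: write a few bytes and sync() without closing the file.
            FSDataOutputStream out = fs.create(p);
            out.write("hello".getBytes());
            out.sync(); // the design doc's 'flush': data written before this
                        // point should be visible to readers opened afterwards

            // "New reader": opened after the sync(), so per the design doc
            // it should see all five bytes even though the file is open.
            FSDataInputStream in = fs.open(p);
            byte[] buf = new byte[5];
            in.readFully(0, buf); // on unpatched 0.19 this can fail or come up
                                  // short: a new reader may not see the length
                                  // of the last, still-open block
            in.close();

            out.close();
            fs.close();
        }
    }

The last read is exactly what this issue is about: the reader was opened after the sync(), yet
without the fix it may not observe the synced bytes.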

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

