hadoop-hdfs-dev mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HDFS-1970) NullPointerException when NameNode lease recovery runs after the client has not responded to the NN for longer than the hard limit and the current block is larger than the previous block size recorded in the NN
Date Mon, 05 Sep 2011 17:40:09 GMT

     [ https://issues.apache.org/jira/browse/HDFS-1970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Suresh Srinivas resolved HDFS-1970.
-----------------------------------

    Resolution: Duplicate

Duplicate of HDFS-1951, as indicated in the previous comment.

> NullPointerException when NameNode lease recovery runs after the client has not responded
to the NN for longer than the hard limit and the current block is larger than the previous
block size recorded in the NN
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1970
>                 URL: https://issues.apache.org/jira/browse/HDFS-1970
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.20-append
>            Reporter: ramkrishna.s.vasudevan
>             Fix For: 0.20-append
>
>
> NullPointerException when NameNode lease recovery runs after the client has not responded
to the NN for longer than the hard limit and the current block is larger than the previous
block size recorded in the NN
> 1. Write to a file from a client through a pipeline of 2 datanodes.
> 2. Kill one datanode and allow pipeline recovery.
> 3. Write some more data to the same block.
> 4. In parallel, allow the NameNode lease recovery to happen.
> A NullPointerException is then thrown in the addStoredBlock API.
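> A minimal client-side sketch of steps 1-3, assuming the 0.20-append FileSystem and
FSDataOutputStream API (the file name, block size, and write sizes below are illustrative;
killing the datanode and stalling the client past the lease hard limit are done externally):
>
>     // Sketch only: drives the client side of the scenario described above.
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.fs.FSDataOutputStream;
>     import org.apache.hadoop.fs.FileSystem;
>     import org.apache.hadoop.fs.Path;
>
>     public class SyncTestRepro {
>       public static void main(String[] args) throws Exception {
>         Configuration conf = new Configuration();
>         conf.set("fs.default.name", "hdfs://linux76:9000"); // NameNode from the log
>         FileSystem fs = FileSystem.get(conf);
>
>         // Step 1: create the file with a pipeline of 2 datanodes and a small block size.
>         FSDataOutputStream out =
>             fs.create(new Path("/synctest0"), true, 4096, (short) 2, 1024L);
>         out.write(new byte[1024]);
>         out.sync(); // persist the current block length on the NameNode
>
>         // Step 2: kill one datanode externally here and let pipeline recovery
>         // continue on the remaining datanode.
>         Thread.sleep(60000L);
>
>         // Step 3: write some more data to the same (last) block, then stop
>         // responding to the NameNode (e.g. suspend this JVM) so the lease hard
>         // limit expires and NameNode lease recovery runs in parallel (step 4),
>         // while the block is now longer than the length the NameNode last recorded.
>         out.write(new byte[512]);
>         out.sync();
>       }
>     }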
> Please find the NameNode logs below.
> Debugging in MachineName.. 
> Listening for transport dt_socket at address: 8007
> 11/05/20 21:38:33 INFO namenode.NameNode: STARTUP_MSG: 
> /************************************************************
> STARTUP_MSG: Starting NameNode
> STARTUP_MSG:   host = linux76/10.18.52.76
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.20.3-SNAPSHOT
> STARTUP_MSG:   build =  -r ; compiled by 'G00900856' on Tue Feb  1 11:40:14 IST 2011
> ************************************************************/
> 11/05/20 21:38:33 INFO metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode,
port=9000
> 11/05/20 21:38:33 INFO namenode.NameNode: Namenode up at: linux76/10.18.52.76:9000
> 11/05/20 21:38:33 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode,
sessionId=null
> 11/05/20 21:38:33 INFO metrics.NameNodeMetrics: Initializing NameNodeMeterics using context
object:org.apache.hadoop.metrics.spi.NullContext
> 11/05/20 21:38:33 INFO namenode.FSNamesystem: fsOwner=root,root
> 11/05/20 21:38:33 INFO namenode.FSNamesystem: supergroup=supergroup
> 11/05/20 21:38:33 INFO namenode.FSNamesystem: isPermissionEnabled=false
> 11/05/20 21:38:33 INFO metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics
using context object:org.apache.hadoop.metrics.spi.NullContext
> 11/05/20 21:38:33 INFO namenode.FSNamesystem: Registered FSNamesystemStatusMBean
> 11/05/20 21:38:33 INFO common.Storage: Number of files = 1
> 11/05/20 21:38:33 INFO common.Storage: Number of files under construction = 0
> 11/05/20 21:38:33 INFO common.Storage: Image file of size 94 loaded in 0 seconds.
> 11/05/20 21:38:33 INFO common.Storage: Edits file /home/ramkrishna/opensrchadoop/appendbranch/hadoop-0.20.3-SNAPSHOT/bin/../hadoop-root/dfs/name/current/edits
of size 4 edits # 0 loaded in 0 seconds.
> 11/05/20 21:38:33 INFO common.Storage: Image file of size 94 saved in 0 seconds.
> 11/05/20 21:38:34 INFO common.Storage: Image file of size 94 saved in 0 seconds.
> 11/05/20 21:38:34 INFO namenode.FSNamesystem: Finished loading FSImage in 482 msecs
> 11/05/20 21:38:34 INFO namenode.FSNamesystem: Total number of blocks = 0
> 11/05/20 21:38:34 INFO namenode.FSNamesystem: Number of invalid blocks = 0
> 11/05/20 21:38:34 INFO namenode.FSNamesystem: Number of under-replicated blocks = 0
> 11/05/20 21:38:34 INFO namenode.FSNamesystem: Number of  over-replicated blocks = 0
> 11/05/20 21:38:34 INFO hdfs.StateChange: STATE* Leaving safe mode after 0 secs.
> 11/05/20 21:38:34 INFO hdfs.StateChange: STATE* Network topology has 0 racks and 0 datanodes
> 11/05/20 21:38:34 INFO hdfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks
> 11/05/20 21:38:34 INFO mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log)
via org.mortbay.log.Slf4jLog
> 11/05/20 21:38:35 INFO http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort()
before open() is -1. Opening the listener on 50070
> 11/05/20 21:38:35 INFO http.HttpServer: listener.getLocalPort() returned 50070 webServer.getConnectors()[0].getLocalPort()
returned 50070
> 11/05/20 21:38:35 INFO http.HttpServer: Jetty bound to port 50070
> 11/05/20 21:38:35 INFO mortbay.log: jetty-6.1.14
> 11/05/20 21:38:37 INFO mortbay.log: Started SelectChannelConnector@linux76:50070
> 11/05/20 21:38:37 INFO namenode.NameNode: Web-server up at: linux76:50070
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server Responder: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server listener on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 0 on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 1 on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 2 on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 3 on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 4 on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 5 on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 6 on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 7 on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 8 on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 9 on 9000: starting
> 11/05/20 21:38:38 INFO hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration
from 10.18.52.76:50010 storage DS-1868335495-10.18.52.76-50010-1305907718956
> 11/05/20 21:38:38 INFO net.NetworkTopology: Adding a new node: /default-rack/10.18.52.76:50010
> 11/05/20 21:38:41 INFO hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration
from 10.18.52.5:50010 storage DS-721814011-10.18.52.5-50010-1305928120116
> 11/05/20 21:38:41 INFO net.NetworkTopology: Adding a new node: /default-rack/10.18.52.5:50010
> 11/05/20 21:38:54 INFO FSNamesystem.audit: ugi=R00902313,Tardis	ip=/10.18.47.133	cmd=create
src=/synctest0	dst=null	perm=R00902313:supergroup:rw-r--r--
> 11/05/20 21:38:54 INFO hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /synctest0.
blk_-8455334393673385140_1001
> 11/05/20 21:38:55 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated:
10.18.52.5:50010 is added to blk_-8455334393673385140_1001 size 1024
> 11/05/20 21:38:55 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated:
10.18.52.76:50010 is added to blk_-8455334393673385140_1001 size 1024
> 11/05/20 21:38:55 INFO hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /synctest0.
blk_8955752156241965528_1001
> 11/05/20 21:38:55 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated:
10.18.52.76:50010 is added to blk_8955752156241965528_1001 size 1024
> 11/05/20 21:38:55 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated:
10.18.52.5:50010 is added to blk_8955752156241965528_1001 size 1024
> 11/05/20 21:39:11 INFO hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /synctest0.
blk_-5249191180607448701_1001
> 11/05/20 21:40:18 INFO namenode.FSNamesystem: commitBlockSynchronization(lastblock=blk_-5249191180607448701_1001,
newgenerationstamp=1002, newlength=833, newtargets=[10.18.52.76:50010], closeFile=false, deleteBlock=false)
> 11/05/20 21:40:18 INFO namenode.FSNamesystem: Number of transactions: 4 Total time for
transactions(ms): 2Number of transactions batched in Syncs: 0 Number of syncs: 1 SyncTimes(ms):
28 
> 11/05/20 21:40:18 INFO namenode.FSNamesystem: commitBlockSynchronization(blk_-5249191180607448701_1002)
successful
> 11/05/20 21:41:41 INFO namenode.LeaseManager: Lease [Lease.  Holder: DFSClient_-676883980,
pendingcreates: 1] has expired hard limit
> 11/05/20 21:41:56 INFO namenode.FSNamesystem: Recovering lease=[Lease.  Holder: DFSClient_-676883980,
pendingcreates: 1], src=/synctest0
> 11/05/20 21:41:56 INFO hdfs.StateChange: BLOCK* blk_-5249191180607448701_1002 recovery
started, primary=10.18.52.76:50010
> 11/05/20 21:42:09 INFO namenode.FSNamesystem: commitBlockSynchronization(lastblock=blk_-5249191180607448701_1002,
newgenerationstamp=1003, newlength=867, newtargets=[10.18.52.76:50010], closeFile=true, deleteBlock=false)
> 11/05/20 21:42:26 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: addStoredBlock
request received for blk_-5249191180607448701_1003 on 10.18.52.76:50010 size 867 and it belongs
to a file under construction. 
> 11/05/20 21:42:52 WARN namenode.FSNamesystem: Inconsistent size for block blk_-5249191180607448701_1003
reported from 10.18.52.76:50010 current size is 833 reported size is 867
> 11/05/20 21:43:11 WARN namenode.FSNamesystem: Block blk_-5249191180607448701_1003 reported
from 10.18.52.76:50010 does not exist in blockMap. Surprise! Surprise!
> 11/05/20 21:43:13 INFO hdfs.StateChange: Removing lease on  file /synctest0 from client
NN_Recovery
> 11/05/20 21:43:13 INFO hdfs.StateChange: BLOCK* ask 10.18.52.76:50010 to replicate blk_-5249191180607448701_1003
to datanode(s) 10.18.52.5:50010
> 11/05/20 21:43:13 INFO namenode.FSNamesystem: Number of transactions: 6 Total time for
transactions(ms): 2Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms):
30 
> 11/05/20 21:43:13 INFO namenode.FSNamesystem: commitBlockSynchronization(newblock=blk_-5249191180607448701_1003,
file=/synctest0, newgenerationstamp=1003, newlength=867, newtargets=[10.18.52.76:50010]) successful
> 11/05/20 21:43:15 INFO ipc.Server: IPC Server handler 6 on 9000, call blockReceived(DatanodeRegistration(10.18.52.76:50010,
storageID=DS-1868335495-10.18.52.76-50010-1305907718956, infoPort=50075, ipcPort=50020), [Lorg.apache.hadoop.hdfs.protocol.Block;@9b774e,
[Ljava.lang.String;@b5d05b) from 10.18.52.76:40343: error: java.io.IOException: java.lang.NullPointerException
> java.io.IOException: java.lang.NullPointerException
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addStoredBlock(FSNamesystem.java:3173)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.blockReceived(FSNamesystem.java:3592)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReceived(NameNode.java:756)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:961)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:957)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:955)
> 11/05/20 21:43:19 WARN hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: Redundant
addStoredBlock request received for blk_-5249191180607448701_1003 on 10.18.52.76:50010 size
867
> 11/05/20 21:43:19 INFO hdfs.StateChange: BLOCK* ask 10.18.52.76:50010 to replicate blk_-5249191180607448701_1003
to datanode(s) 10.18.52.5:50010
> Listening for transport dt_socket at address: 8007
> 11/05/20 21:43:22 INFO hdfs.StateChange: BLOCK* ask 10.18.52.76:50010 to replicate blk_-5249191180607448701_1003
to datanode(s) 10.18.52.5:50010
>  

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
