hadoop-common-user mailing list archives

From: Jason Venner <ja...@attributor.com>
Subject: 0.18.1 datanode pseudo deadlock problem
Date: Fri, 09 Jan 2009 00:03:50 GMT
The problem we are having is that datanodes periodically stall for 10-15 
minutes and drop off the active list and then come back.

What is going on is that a long-running operation is holding the lock on 
FSDataset.volumes (the FSDataset$FSVolumeSet object), and all of the other 
block service requests stall behind that lock.

"DataNode: [/data/dfs-video-18/dfs/data]" daemon prio=10 tid=0x4d7ad400 
nid=0x7c40 runnable [0x4c698000..0x4c6990d0]
   java.lang.Thread.State: RUNNABLE
    at java.lang.String.lastIndexOf(String.java:1628)
    at java.io.File.getName(File.java:399)
    at 
org.apache.hadoop.dfs.FSDataset$FSDir.getGenerationStampFromFile(FSDataset.java:148)
    at 
org.apache.hadoop.dfs.FSDataset$FSDir.getBlockInfo(FSDataset.java:181)
    at 
org.apache.hadoop.dfs.FSDataset$FSVolume.getBlockInfo(FSDataset.java:412)
    at 
org.apache.hadoop.dfs.FSDataset$FSVolumeSet.getBlockInfo(FSDataset.java:511)
    - locked <0x551e8d48> (a org.apache.hadoop.dfs.FSDataset$FSVolumeSet)
    at org.apache.hadoop.dfs.FSDataset.getBlockReport(FSDataset.java:1053)
    at org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:708)
    at org.apache.hadoop.dfs.DataNode.run(DataNode.java:2890)
    at java.lang.Thread.run(Thread.java:619)

This is basically taking a stat on every HDFS block on the datanode, 
which in our case is ~2 million blocks, and can take 10+ minutes. (We may 
be experiencing problems with our raid controller, but we have no 
visibility into it; at the OS level the file system seems fine and 
operations eventually finish.)
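
For reference, the shape of the code while it holds that monitor is 
roughly the following (a minimal sketch with made-up names, not the 
0.18.1 source): the entire recursive scan, one File.getName() parse per 
block file, runs inside a single synchronized block.

    import java.io.File;
    import java.util.ArrayList;
    import java.util.List;

    // Minimal sketch (NOT the real FSDataset code) of what the block report
    // is doing while it holds the volume-set monitor.
    class BlockReportSketch {
        private final Object volumeSetLock = new Object(); // stands in for <0x551e8d48>

        List<String> getBlockReport(File[] dataDirs) {
            List<String> blocks = new ArrayList<String>();
            synchronized (volumeSetLock) {       // every other caller queues behind this
                for (File dir : dataDirs) {
                    scan(dir, blocks);           // ~2 million block files in our case
                }
            }
            return blocks;
        }

        private void scan(File dir, List<String> blocks) {
            File[] children = dir.listFiles();   // a readdir/stat per directory
            if (children == null) {
                return;
            }
            for (File f : children) {
                if (f.isDirectory()) {
                    scan(f, blocks);
                } else if (f.getName().startsWith("blk_")) {
                    blocks.add(f.getName());     // the String.lastIndexOf frame above is
                }                                // a getName()-based parse like this one
            }
        }
    }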

It appears that a couple of different data structures are being guarded 
by the single FSDataset$FSVolumeSet object.

Then this happens:
"org.apache.hadoop.dfs.DataNode$DataXceiver@1bcee17" daemon prio=10 
tid=0x4da8d000 nid=0x7ae4 waiting for monitor entry [0x459fe000..0x459ff0d0]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at 
org.apache.hadoop.dfs.FSDataset$FSVolumeSet.getNextVolume(FSDataset.java:473)
    - waiting to lock <0x551e8d48> (a 
org.apache.hadoop.dfs.FSDataset$FSVolumeSet)
    at org.apache.hadoop.dfs.FSDataset.writeToBlock(FSDataset.java:934)
    - locked <0x54e550e0> (a org.apache.hadoop.dfs.FSDataset)
    at 
org.apache.hadoop.dfs.DataNode$BlockReceiver.<init>(DataNode.java:2322)
    at 
org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1187)
    at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1045)
    at java.lang.Thread.run(Thread.java:619)

This thread locks the FSDataset object while waiting on the volume-set 
monitor, and now all of the other DataNode operations stall waiting on 
the FSDataset object.
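
Boiled down, the lock interaction looks like this (stand-in classes just 
to illustrate, not Hadoop code): not a true deadlock, but a stall that 
lasts as long as the block scan does.

    // Thread A (offerService -> getBlockReport) holds the volume-set monitor
    // for minutes; thread B (DataXceiver.writeBlock) takes the dataset monitor
    // and then blocks on the volume-set monitor; everything else that needs
    // the dataset monitor piles up behind B.
    class LockChainSketch {
        private final Object datasetLock   = new Object(); // FSDataset monitor stand-in
        private final Object volumeSetLock = new Object(); // FSVolumeSet monitor stand-in

        void blockReport() {                    // thread A
            synchronized (volumeSetLock) {
                slowScanOfTwoMillionBlocks();
            }
        }

        void writeToBlock() {                   // thread B
            synchronized (datasetLock) {
                synchronized (volumeSetLock) {  // waits here until A finishes scanning
                    pickNextVolume();
                }
            }
        }

        private void slowScanOfTwoMillionBlocks() { /* stand-in for the block scan */ }
        private void pickNextVolume() { /* stand-in for volume selection */ }
    }
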
----------

Our particular installation doesn't use multiple directories for HDFS, so 
a first simple hack for a local fix would be to modify getNextVolume to 
just return the single volume without synchronizing.
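
Something like the following is what we have in mind (a rough sketch with 
stand-in types; the method shape is approximated from the stack trace and 
is not a drop-in 0.18.1 patch):

    // Rough sketch of the local hack: with exactly one dfs.data.dir
    // configured, skip the synchronized round-robin entirely.
    class SingleVolumeHackSketch {
        interface Volume { long getAvailable(); }   // stand-in for FSDataset$FSVolume

        private final Volume[] volumes;

        SingleVolumeHackSketch(Volume[] volumes) {
            this.volumes = volumes;
        }

        Volume getNextVolume(long blockSize) {
            if (volumes.length == 1) {
                // single volume: nothing to round-robin over, so no monitor needed
                if (volumes[0].getAvailable() > blockSize) {
                    return volumes[0];
                }
                throw new RuntimeException("Insufficient space for an additional block");
            }
            synchronized (this) {                   // keep the existing locked path otherwise
                return roundRobinPick(blockSize);   // placeholder for today's behaviour
            }
        }

        private Volume roundRobinPick(long blockSize) {
            return volumes[0];                      // placeholder for the existing logic
        }
    }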

A richer alternative would be to make the locking on 
FSDataset$FSVolumeSet more fine-grained.
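
For example, the volume-set monitor could be held only long enough to 
snapshot the volume list, with the expensive per-block scan running 
outside of it (again just a sketch with stand-in types; it also slightly 
changes what a report can see for blocks created or deleted mid-scan):

    import java.util.ArrayList;
    import java.util.List;

    // Sketch of one finer-grained approach: hold the monitor only to copy the
    // volume array, then scan without it, so getNextVolume callers are not
    // queued behind a 10+ minute walk of the block directories.
    class FineGrainedScanSketch {
        interface Volume { void appendBlockInfo(List<String> out); }  // stand-in

        private Volume[] volumes;                   // guarded by 'this'

        private synchronized Volume[] snapshotVolumes() {
            return volumes.clone();                 // cheap: just copies the array
        }

        List<String> getBlockReport() {
            List<String> blocks = new ArrayList<String>();
            for (Volume v : snapshotVolumes()) {    // the scan no longer holds the monitor
                v.appendBlockInfo(blocks);          // blocks added/removed during the scan
            }                                       // get reconciled by the next report
            return blocks;
        }
    }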

Of course, we are also trying to fix the underlying file system 
performance and DFS block loading so that the block report doesn't take 
so long in the first place.

Any suggestions or warnings?

Thanks.



