hadoop-common-user mailing list archives

From: Sagar Naik <sn...@attributor.com>
Subject: Re: 0.18.1 datanode psuedo deadlock problem
Date: Sat, 10 Jan 2009 17:59:05 GMT
Hi Raghu,


The periodic "du" and block reports thread thrash the disk. (Block 
Reports takes abt on an avg 21 mins )

and I think all the datanode threads are not able to do much and freeze

"org.apache.hadoop.dfs.DataNode$DataXceiver@f2127a" daemon prio=10 
tid=0x41f06000 nid=0x7c7c waiting for monitor entry [0x43918000..0x43918f50]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.hadoop.dfs.FSDataset.getFile(FSDataset.java:1158)
    - waiting to lock <0x54e550e0> (a org.apache.hadoop.dfs.FSDataset)
    at 
org.apache.hadoop.dfs.FSDataset.validateBlockFile(FSDataset.java:1074)
    at org.apache.hadoop.dfs.FSDataset.isValidBlock(FSDataset.java:1066)
    at org.apache.hadoop.dfs.FSDataset.writeToBlock(FSDataset.java:894)
    at 
org.apache.hadoop.dfs.DataNode$BlockReceiver.<init>(DataNode.java:2322)
    at 
org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1187)
    at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1045)
    at java.lang.Thread.run(Thread.java:619)

   Locked ownable synchronizers:
    - None

"org.apache.hadoop.dfs.DataNode$DataXceiver@1bcee17" daemon prio=10 
tid=0x4da8d000 nid=0x7ae4 waiting for monitor entry [0x459fe000..0x459ff0d0]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at 
org.apache.hadoop.dfs.FSDataset$FSVolumeSet.getNextVolume(FSDataset.java:473)
    - waiting to lock <0x551e8d48> (a 
org.apache.hadoop.dfs.FSDataset$FSVolumeSet)
    at org.apache.hadoop.dfs.FSDataset.writeToBlock(FSDataset.java:934)
    - locked <0x54e550e0> (a org.apache.hadoop.dfs.FSDataset)
    at 
org.apache.hadoop.dfs.DataNode$BlockReceiver.<init>(DataNode.java:2322)
    at 
org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1187)
    at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1045)
    at java.lang.Thread.run(Thread.java:619)

   Locked ownable synchronizers:
    - None

"DataNode: [/data/dfs-video-18/dfs/data]" daemon prio=10 tid=0x4d7ad400 
nid=0x7c40 runnable [0x4c698000..0x4c6990d0]
   java.lang.Thread.State: RUNNABLE
    at java.lang.String.lastIndexOf(String.java:1628)
    at java.io.File.getName(File.java:399)
    at 
org.apache.hadoop.dfs.FSDataset$FSDir.getGenerationStampFromFile(FSDataset.java:148)
    at 
org.apache.hadoop.dfs.FSDataset$FSDir.getBlockInfo(FSDataset.java:181)
    at 
org.apache.hadoop.dfs.FSDataset$FSVolume.getBlockInfo(FSDataset.java:412)
    at 
org.apache.hadoop.dfs.FSDataset$FSVolumeSet.getBlockInfo(FSDataset.java:511)
    - locked <0x551e8d48> (a org.apache.hadoop.dfs.FSDataset$FSVolumeSet)
    at org.apache.hadoop.dfs.FSDataset.getBlockReport(FSDataset.java:1053)
    at org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:708)
    at org.apache.hadoop.dfs.DataNode.run(DataNode.java:2890)
    at java.lang.Thread.run(Thread.java:619)


The lock <0x54e550e0> is held by another similar thread, and that thread is
in turn waiting on the FSVolumeSet monitor, which is held by getBlockReport().
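
Boiled down, the locking chain looks roughly like the following sketch (the
class and method names are just illustrative stand-ins for FSDataset and
FSVolumeSet, not the actual 0.18 code):

// Minimal sketch of the stall seen in the traces above. Names are
// illustrative stand-ins, not the actual Hadoop 0.18 sources.
class Dataset {                                // plays the role of FSDataset
    private final VolumeSet volumes = new VolumeSet();

    synchronized void writeToBlock() {         // takes the FSDataset monitor...
        volumes.getNextVolume();               // ...then blocks on the FSVolumeSet monitor
    }

    synchronized boolean isValidBlock() {      // every other DataXceiver also needs the
        return true;                           // FSDataset monitor, so it queues up
    }                                          // behind writeToBlock()

    static class VolumeSet {                   // plays the role of FSVolumeSet
        synchronized Object getNextVolume() {
            return null;
        }

        synchronized void getBlockReport() {
            // Holds this monitor for the entire multi-minute walk
            // over ~2 million block files.
        }
    }
}

Nothing is deadlocked in the strict sense; the whole chain simply stalls for
as long as getBlockReport() holds the volume-set monitor.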

In fact, during this time the datanode shows up as a dead node and clients
keep getting createBlockException with timeouts.

We don't see this problem on other DNs with fewer blocks, so I think the
~2 million block files are the issue here.

Please correct me if I missed something.

-Sagar

Raghu Angadi wrote:
>
> The scan required for each block report is a well-known issue, and it
> can be fixed. It was discussed multiple times (e.g.
> https://issues.apache.org/jira/browse/HADOOP-3232?focusedCommentId=12587795#action_12587795).
>
> Earlier, inline 'du' on datanodes used to cause the same problem, and it
> was moved to a separate thread (HADOOP-3232). Block reports can do the
> same...
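
A rough sketch of that direction, just to illustrate (the names below are
hypothetical, not the HADOOP-3232 patch or the real DataNode code): do the
expensive walk on its own thread and let offerService() pick up the most
recent completed snapshot instead of scanning while the FSDataset and
FSVolumeSet locks are held.

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hedged sketch only: run the block scan on its own thread and let the
// heartbeat/service loop read the latest completed snapshot.
class AsyncBlockReporter {
    private final AtomicReference<long[]> lastReport = new AtomicReference<>();
    private final ScheduledExecutorService scanner =
            Executors.newSingleThreadScheduledExecutor();

    void start(long intervalMinutes) {
        // The expensive directory walk happens here, off the request path.
        scanner.scheduleWithFixedDelay(
                () -> lastReport.set(scanVolumes()),
                0, intervalMinutes, TimeUnit.MINUTES);
    }

    // Called from the service loop; never blocks on the scan itself.
    long[] latestReport() {
        return lastReport.get();
    }

    private long[] scanVolumes() {
        // Placeholder for the ~21-minute walk over ~2M block files.
        return new long[0];
    }
}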
>
> Though 2M blocks on a DN is very large, there is no reason block reports
> should break things. Once we fix block reports, something else might
> break, but that is a different issue.
>
> Raghu.
>
> Jason Venner wrote:
>> The problem we are having is that datanodes periodically stall for 
>> 10-15 minutes and drop off the active list and then come back.
>>
>> What is going on is that a long-running operation is holding the lock on
>> FSDataset.volumes, and all of the other block service requests stall
>> behind this lock.
>>
>> "DataNode: [/data/dfs-video-18/dfs/data]" daemon prio=10 
>> tid=0x4d7ad400 nid=0x7c40 runnable [0x4c698000..0x4c6990d0]
>>   java.lang.Thread.State: RUNNABLE
>>    at java.lang.String.lastIndexOf(String.java:1628)
>>    at java.io.File.getName(File.java:399)
>>    at 
>> org.apache.hadoop.dfs.FSDataset$FSDir.getGenerationStampFromFile(FSDataset.java:148)

>>
>>    at 
>> org.apache.hadoop.dfs.FSDataset$FSDir.getBlockInfo(FSDataset.java:181)
>>    at 
>> org.apache.hadoop.dfs.FSDataset$FSVolume.getBlockInfo(FSDataset.java:412) 
>>
>>    at 
>> org.apache.hadoop.dfs.FSDataset$FSVolumeSet.getBlockInfo(FSDataset.java:511) 
>>
>>    - locked <0x551e8d48> (a org.apache.hadoop.dfs.FSDataset$FSVolumeSet)
>>    at 
>> org.apache.hadoop.dfs.FSDataset.getBlockReport(FSDataset.java:1053)
>>    at org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:708)
>>    at org.apache.hadoop.dfs.DataNode.run(DataNode.java:2890)
>>    at java.lang.Thread.run(Thread.java:619)
>>
>> This is basically taking a stat on every hdfs block on the datanode,
>> which in our case is ~2 million, and can take 10+ minutes (we may be
>> experiencing problems with our raid controller but have no visibility
>> into it). At the OS level the file system seems fine and operations
>> eventually finish.
>>
>> It appears that a couple of different data structures are being locked
>> under the single object FSDataset$FSVolumeSet.
>>
>> Then this happens:
>> "org.apache.hadoop.dfs.DataNode$DataXceiver@1bcee17" daemon prio=10 
>> tid=0x4da8d000 nid=0x7ae4 waiting for monitor entry 
>> [0x459fe000..0x459ff0d0]
>>   java.lang.Thread.State: BLOCKED (on object monitor)
>>    at 
>> org.apache.hadoop.dfs.FSDataset$FSVolumeSet.getNextVolume(FSDataset.java:473) 
>>
>>    - waiting to lock <0x551e8d48> (a 
>> org.apache.hadoop.dfs.FSDataset$FSVolumeSet)
>>    at org.apache.hadoop.dfs.FSDataset.writeToBlock(FSDataset.java:934)
>>    - locked <0x54e550e0> (a org.apache.hadoop.dfs.FSDataset)
>>    at 
>> org.apache.hadoop.dfs.DataNode$BlockReceiver.<init>(DataNode.java:2322)
>>    at 
>> org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1187) 
>>
>>    at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1045)
>>    at java.lang.Thread.run(Thread.java:619)
>>
>> which locks the FSDataset while waiting on the volume object
>>
>> and now all of the Datanode operations stall waiting on the FSDataset 
>> object.
>> ----------
>>
>> Our particular installation doesn't use multiple directories for HDFS,
>> so a first simple hack for a local fix would be to modify getNextVolume
>> to just return the single volume and not be synchronized.
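
For illustration, such a hack could look something like this (a single
configured data directory is assumed; the names are made up, not the real
FSDataset source):

// Sketch of the proposed local hack, assuming exactly one dfs.data.dir:
// skip round-robin selection and the FSVolumeSet monitor entirely.
class SingleVolumeSet {
    private final Object singleVolume;   // stands in for FSDataset.FSVolume

    SingleVolumeSet(Object onlyVolume) {
        this.singleVolume = onlyVolume;
    }

    // Deliberately not synchronized: with one immutable volume there is no
    // shared round-robin index to protect, so writers no longer queue up
    // behind a block report that holds the volume-set monitor.
    Object getNextVolume(long blockSize) {
        return singleVolume;
    }
}

The obvious trade-off is that it only holds for one data directory and
presumably drops whatever per-volume free-space check the real
getNextVolume() performs.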
>>
>> A richer alternative would be to make the locking on
>> FSDataset$FSVolumeSet finer grained.
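
One hedged sketch of what finer-grained locking could look like (again,
illustrative names, not a patch against FSDataset): keep the monitor only
around the cheap round-robin pick, and let the long block-report scan
iterate a snapshot of the volume list instead of the live, locked set.

import java.util.ArrayList;
import java.util.List;

// Hedged sketch only: the long scan runs on a copied list, so it never
// holds the lock that getNextVolume() needs.
class FinerGrainedVolumeSet<V> {
    private final List<V> volumes = new ArrayList<>();
    private int nextIndex = 0;

    synchronized void addVolume(V volume) {
        volumes.add(volume);
    }

    synchronized V getNextVolume() {        // still synchronized, but O(1)
        V v = volumes.get(nextIndex);
        nextIndex = (nextIndex + 1) % volumes.size();
        return v;
    }

    List<V> snapshotForBlockReport() {      // the multi-minute scan iterates
        synchronized (this) {               // this copy, outside the lock
            return new ArrayList<>(volumes);
        }
    }
}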
>>
>> Of course we are also trying to fix the file system performance and 
>> dfs block loading that results in the block report taking a long time.
>>
>> Any suggestions or warnings?
>>
>> Thanks.
>>
>>
>>
>

