Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Date: Fri, 11 Nov 2016 01:10:58 +0000 (UTC)
From: "Arpit Agarwal (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <JIRA.12932641.1453275643000.249986.1478826658636@Atlassian.JIRA>
In-Reply-To: <JIRA.12932641.1453275643000@Atlassian.JIRA>
References: <JIRA.12932641.1453275643000@Atlassian.JIRA> <JIRA.12932641.1453275643415@arcas>
Subject: [jira] [Commented] (HDFS-9668) Optimize the locking in
 FsDatasetImpl
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Fri, 11 Nov 2016 01:11:00 -0000


    [ https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15655727#comment-15655727 ] 

Arpit Agarwal commented on HDFS-9668:
-------------------------------------

Hi [~jingcheng.du@intel.com], I started taking a look at [the patch|https://issues.apache.org/jira/secure/attachment/12837702/HDFS-9668-23.patch]. A few early comments:
# Can we separate the block-lock addition from the read-write lock? In the first patch we should focus on converting the exclusive lock to a read-write lock only. I can help split up the patch if you'd like.
# DirectoryScanner.java:391 - We can just get the read lock here. This phase of the directory scanner makes no changes to the DataNode state.
# FsDatasetImpl.java:1117 - This should get the write lock, as it calls appendImpl, which modifies the volumeMap.
# FsDatasetImpl.java:1268 - This too calls appendImpl, so it should get the write lock.
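
For illustration, the read/write split discussed in the comments above could look roughly like the following minimal Java sketch. Everything here except the JDK's ReentrantReadWriteLock is hypothetical (the class name, the Held helper, the map contents) and is not taken from the patch: read-only paths take the shared read lock, while anything that mutates the replica map takes the exclusive write lock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a dataset guarded by one read-write lock. */
class DatasetLockSketch {

  /** Acquires a lock on construction so try-with-resources can release it. */
  static final class Held implements AutoCloseable {
    private final Lock lock;

    Held(Lock lock) {
      this.lock = lock;
      lock.lock();
    }

    @Override
    public void close() {
      lock.unlock();
    }
  }

  // Fair mode is an assumption of this sketch, not something the review asks for.
  private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock(true);

  // Hypothetical replica map: block id -> replica state.
  private final Map<Long, String> volumeMap = new HashMap<>();

  Held acquireReadLock() {
    return new Held(rwLock.readLock());
  }

  Held acquireWriteLock() {
    return new Held(rwLock.writeLock());
  }

  /** Read-only path: many threads may hold the read lock at the same time. */
  String getReplicaState(long blockId) {
    try (Held h = acquireReadLock()) {
      return volumeMap.get(blockId);
    }
  }

  /** Mutating path: takes the write lock because it changes the map. */
  void append(long blockId) {
    try (Held h = acquireWriteLock()) {
      volumeMap.put(blockId, "RBW");
    }
  }

  public static void main(String[] args) {
    DatasetLockSketch dataset = new DatasetLockSketch();
    dataset.append(1L);
    System.out.println(dataset.getReplicaState(1L));
  }
}
```

In this shape, a scan that only reads state would use acquireReadLock(), and any caller that ends up modifying the map would use acquireWriteLock().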

I'm still reviewing the rest of FsDatasetImpl. Also, you can probably hold off on the branch-2 patch until we finalize the trunk changes.

> Optimize the locking in FsDatasetImpl
> -------------------------------------
>
>                 Key: HDFS-9668
>                 URL: https://issues.apache.org/jira/browse/HDFS-9668
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Jingcheng Du
>            Assignee: Jingcheng Du
>         Attachments: HDFS-9668-1.patch, HDFS-9668-10.patch, HDFS-9668-11.patch, HDFS-9668-12.patch, HDFS-9668-13.patch, HDFS-9668-14.patch, HDFS-9668-14.patch, HDFS-9668-15.patch, HDFS-9668-16.patch, HDFS-9668-17.patch, HDFS-9668-18.patch, HDFS-9668-19.patch, HDFS-9668-19.patch, HDFS-9668-2.patch, HDFS-9668-20.patch, HDFS-9668-21.patch, HDFS-9668-22.patch, HDFS-9668-23.patch, HDFS-9668-23.patch, HDFS-9668-3.patch, HDFS-9668-4.patch, HDFS-9668-5.patch, HDFS-9668-6.patch, HDFS-9668-7.patch, HDFS-9668-8.patch, HDFS-9668-9.patch, execution_time.png
>
>
> During an HBase test on tiered HDFS storage (the WAL is stored on SSD/RAMDISK, and all other files are stored on HDD), we observed many threads BLOCKED for a long time on FsDatasetImpl in the DataNode. The following is part of the jstack output:
> {noformat}
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at /192.168.50.16:48521 [Receiving block BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread t@93336
>    java.lang.Thread.State: BLOCKED
> 	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:1111)
> 	- waiting to lock <18324c9> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) owned by "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at /192.168.50.16:48520 [Receiving block BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" t@93335
> 	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113)
> 	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
> 	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
> 	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
> 	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
> 	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
> 	at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
> 	- None
> 	
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at /192.168.50.16:48520 [Receiving block BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" - Thread t@93335
>    java.lang.Thread.State: RUNNABLE
> 	at java.io.UnixFileSystem.createFileExclusively(Native Method)
> 	at java.io.File.createNewFile(File.java:1012)
> 	at org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:66)
> 	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:271)
> 	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:286)
> 	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:1140)
> 	- locked <18324c9> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
> 	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113)
> 	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
> 	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
> 	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
> 	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
> 	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
> 	at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
> 	- None
> {noformat}
> We measured the execution time of some operations in FsDatasetImpl during the test. The results are shown below.
> !execution_time.png!
> Under heavy load, the finalizeBlock, addBlock and createRbw operations on HDD take a very long time.
> This means that one slow finalizeBlock, addBlock or createRbw call on slow storage can block all other such operations in the same DataNode, especially with HBase when many WALs, flushers and compactors are configured.
> We need a finer-grained locking mechanism in a new FsDatasetImpl implementation, and users can choose the implementation by configuring "dfs.datanode.fsdataset.factory" in the DataNode.
> We can implement the lock at either the storage level or the block level.
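
As a rough illustration of the storage-level option mentioned in the description, the sketch below keeps one lock per storage so that a slow operation on one storage does not block operations on another. The class name, the storage ids, and the withStorageLock helper are hypothetical; this is not the design used in any of the attached patches.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical per-storage lock table; one lock per storage id. */
class StorageLockSketch {

  private final ConcurrentHashMap<String, ReentrantLock> locks =
      new ConcurrentHashMap<>();

  /** Returns the lock for a storage, creating it on first use. */
  ReentrantLock lockFor(String storageId) {
    return locks.computeIfAbsent(storageId, id -> new ReentrantLock());
  }

  /** Runs an action while holding only the lock of the given storage. */
  <T> T withStorageLock(String storageId, Supplier<T> action) {
    ReentrantLock lock = lockFor(storageId);
    lock.lock();
    try {
      return action.get();
    } finally {
      lock.unlock();
    }
  }

  public static void main(String[] args) {
    StorageLockSketch sketch = new StorageLockSketch();
    // A slow file creation on "hdd-1" would hold only the "hdd-1" lock,
    // so a concurrent call for "ssd-1" would not wait on it.
    String result = sketch.withStorageLock("hdd-1", () -> "created rbw on hdd-1");
    System.out.println(result);
  }
}
```

A block-level variant would key the same table by block id instead of storage id, at the cost of managing many more lock entries.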

