hadoop-hdfs-issues mailing list archives

From "Haohui Mai (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5276) FileSystem.Statistics got performance issue on multi-thread read/write.
Date Mon, 30 Sep 2013 23:51:24 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782443#comment-13782443 ]

Haohui Mai commented on HDFS-5276:
----------------------------------

And I'm still trying to understand the rationale for addressing architectural problems in DFSClient.

Controlling low-level concerns such as cache alignment and JVM synchronization is also essential
to avoiding contention. Since that information is simply unavailable to Java programs, in my personal
opinion the problem might be better addressed in the JVM, or at an even lower level of abstraction.

> FileSystem.Statistics got performance issue on multi-thread read/write.
> -----------------------------------------------------------------------
>
>                 Key: HDFS-5276
>                 URL: https://issues.apache.org/jira/browse/HDFS-5276
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.4-alpha
>            Reporter: Chengxiang Li
>         Attachments: DisableFSReadWriteBytesStat.patch, HDFSStatisticTest.java, hdfs-test.PNG, jstack-trace.PNG
>
>
> FileSystem.Statistics is a singleton variable for each FS scheme, so each read/write on
> HDFS leads to an AtomicLong.getAndAdd() call. AtomicLong does not perform well under heavy
> multi-threaded contention (say, more than 30 threads), so it may cause a serious performance
> issue. While profiling our Spark test, with 32 threads reading data from HDFS, about 70% of
> CPU time was spent in FileSystem.Statistics.incrementBytesRead().
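
To illustrate the contention described in the issue, here is a minimal sketch, not the DFSClient code and not the patch attached to this issue. It has many threads increment one shared counter, once through AtomicLong.getAndAdd() and once through java.util.concurrent.atomic.LongAdder, which spreads updates over several internal cells. LongAdder requires Java 8 or later, so it is an assumption relative to the Hadoop version discussed here. The class name CounterContentionSketch, the ByteCounter interface, and the run() helper are hypothetical names made up for this example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

public class CounterContentionSketch {

    /** Hypothetical stand-in for a shared "bytes read" statistic. */
    interface ByteCounter {
        void add(long n);
        long total();
    }

    /** Every thread updates the same memory word, as in the reported hot spot. */
    static final class AtomicCounter implements ByteCounter {
        private final AtomicLong value = new AtomicLong();
        public void add(long n) { value.getAndAdd(n); }
        public long total() { return value.get(); }
    }

    /** Updates are spread over several cells and summed only when read. */
    static final class StripedCounter implements ByteCounter {
        private final LongAdder value = new LongAdder();
        public void add(long n) { value.add(n); }
        public long total() { return value.sum(); }
    }

    /** Runs `threads` workers that each add 1 to the counter `opsPerThread` times. */
    static long run(ByteCounter counter, int threads, int opsPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < opsPerThread; i++) {
                    counter.add(1);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // All workers have terminated, so the total is exact at this point.
        return counter.total();
    }

    public static void main(String[] args) {
        int threads = 32;        // thread count taken from the issue description
        int ops = 1_000_000;     // arbitrary workload size chosen for this sketch
        ByteCounter[] counters = { new AtomicCounter(), new StripedCounter() };
        for (ByteCounter c : counters) {
            long start = System.nanoTime();
            long total = run(c, threads, ops);
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println(c.getClass().getSimpleName()
                    + ": total=" + total + ", elapsed=" + ms + " ms");
        }
    }
}
```

Both counters end with the same total; only the elapsed time differs, and by how much depends on the hardware and JVM. This is a rough timing loop rather than a rigorous benchmark, so a harness such as JMH would be needed for trustworthy numbers.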



--
This message was sent by Atlassian JIRA
(v6.1#6144)
