hadoop-hdfs-issues mailing list archives

From "Jonathan Hung (Jira)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-14277) [SBN read] Observer benchmark results
Date Thu, 29 Aug 2019 18:50:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-14277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918873#comment-16918873 ]

Jonathan Hung commented on HDFS-14277:
--------------------------------------

Hi [~xkrogen]/[~vagarychen], since this is marked as a 2.10 blocker, what's our plan to address
these performance concerns?

Do we need to file separate tickets for the action items Erik mentioned? Or should we address
all of them as part of this jira?

> [SBN read] Observer benchmark results
> -------------------------------------
>
>                 Key: HDFS-14277
>                 URL: https://issues.apache.org/jira/browse/HDFS-14277
>             Project: Hadoop HDFS
>          Issue Type: Task
>          Components: ha, namenode
>    Affects Versions: 2.10.0, 3.3.0
>         Environment: Hardware: 4-node cluster; each node has 4 cores (Xeon, 2.5 GHz) and 25 GB of memory.
> Software: CentOS 7.4, CDH 6.0 + Consistent Reads from Standby, Kerberos, SSL, RPC encryption + Data Transfer Encryption, Cloudera Navigator.
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>            Priority: Blocker
>              Labels: release-blocker
>         Attachments: Observer profiler.png, Screen Shot 2019-02-14 at 11.50.37 AM.png, observer RPC queue processing time.png
>
>
> Ran a few benchmarks and a profiler (VisualVM) today on an Observer-enabled cluster, and would like to share the results with the community. The cluster has 1 Observer node.
> h2. NNThroughputBenchmark
> Generate 1 million files and send fileStatus RPCs.
> {code:java}
> hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs <namenode> -op fileStatus -threads 100 -files 1000000 -useExisting -keepResults
> {code}
> h3. Kerberos, SSL, RPC encryption, Data Transfer Encryption enabled:
> ||Node||fileStatus (Ops per sec)||
> |Active NameNode|4865|
> |Observer|3996|
> h3. Kerberos, SSL:
> ||Node||fileStatus (Ops per sec)||
> |Active NameNode|7078|
> |Observer|6459|
> Observations:
>  * Due to the edit tailing overhead, the Observer node consumes 30% CPU even when the cluster is idle.
>  * While the Active NN has less than 1 ms RPC processing time, the Observer node has > 5 ms RPC processing time. I am still looking for the source of the longer processing time. The longer RPC processing time may be the cause of the performance degradation compared to the Active NN. Note that the cluster has Cloudera Navigator installed, which adds overhead to RPC processing time.
>  * {{GlobalStateIdContext#isCoordinatedCall()}} pops up as one of the top hotspots in the profiler.
>  
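
For readers weighing the "performance concerns" mentioned in the comment, the gap between the Active NameNode and the Observer can be worked out from the two tables in the quoted description. The sketch below is only illustrative arithmetic on the reported fileStatus numbers; the names `RESULTS`, `relative_drop`, and `summarize` are made up for this example and are not part of Hadoop or of the original benchmark.

```python
# fileStatus throughput (ops/sec) copied from the two tables in the quoted
# issue description. The dictionary layout is an assumption for illustration.
RESULTS = {
    "Kerberos, SSL, RPC encryption, Data Transfer Encryption": {
        "active": 4865,
        "observer": 3996,
    },
    "Kerberos, SSL": {
        "active": 7078,
        "observer": 6459,
    },
}


def relative_drop(baseline, measured):
    """Return the percentage by which `measured` falls short of `baseline`."""
    return (baseline - measured) / baseline * 100.0


def summarize(results):
    """Map each configuration to the Observer's shortfall versus the Active NN."""
    return {
        config: round(relative_drop(r["active"], r["observer"]), 1)
        for config, r in results.items()
    }


if __name__ == "__main__":
    for config, drop in summarize(RESULTS).items():
        print(f"{config}: Observer throughput is {drop}% below Active")
```

On the reported numbers this gives a shortfall of roughly 17.9% with RPC and data transfer encryption enabled and roughly 8.7% with Kerberos and SSL only. These are single-run figures from one small cluster, so they are best read as a rough indication, not a general result.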



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

