hadoop-hdfs-issues mailing list archives

From "Walter Su (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8425) [umbrella] Performance tuning, investigation and optimization for erasure coding
Date Mon, 02 Nov 2015 09:16:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14984928#comment-14984928 ]

Walter Su commented on HDFS-8425:
---------------------------------

Thanks [~tfukudom]! The results look good.

I agree we should test reads with some DNs killed, but I'm afraid the TestDFSIO results
won't be much different.

I've only tested writing. When I ran TestDFSIO, I found the throughput of ec is slightly better
than repl, which matches [~tfukudom]'s tests. I watched the disk and network monitors while
the job ran, and disk utilization often hit 100%. I think that's because the job can use all
the cpus of the NodeManagers, so the bottleneck becomes disk/network io. This is useful because
we can write ec files in batch, for example when converting multiple repl files to ec files.

The speed of a single client writing is constrained by coding speed. In my local test, it's 2.5x
slower than repl. We need a faster codec. I think that's also important, right? But I'm not
sure there's a use case that is bounded by the speed of single-client writing. Usually we write
files using repl and convert them to ec files later.

What do you think?

> [umbrella] Performance tuning, investigation and optimization for erasure coding
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-8425
>                 URL: https://issues.apache.org/jira/browse/HDFS-8425
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7285
>            Reporter: GAO Rui
>         Attachments: testClientWriteReadFile_v1.pdf, testdfsio-read-mbsec.png, testdfsio-write-mbsec.png
>
>
> This {{umbrella}} jira aims to track performance tuning, investigation and optimization
> for erasure coding.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
