hadoop-hdfs-issues mailing list archives

From "Rui Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8968) Erasure coding: a comprehensive I/O throughput benchmark tool
Date Fri, 13 Nov 2015 11:43:11 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15003886#comment-15003886 ]

Rui Li commented on HDFS-8968:

Thanks, Zhe, for the review and the good suggestions. I just filed HDFS-9424 as the follow-on task.

Regarding the code optimizations:
2. I think the number of required DNs depends on the EC policy we use, right? Currently
the tool just uses the default policy. We can make this configurable so that users can choose
a policy to test.
3. Yeah, I can clear the client's cache. However, I don't think this is necessary because, ideally,
no disk I/O is involved on the client side, so clearing the client's cache won't make much difference
(I suppose you're referring to the disk cache; let me know if otherwise). The tool only runs
the benchmark once rather than for multiple rounds. I think it's better and easier to leave it
to the user to take care of the cache.
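
To illustrate point 2, here is a minimal sketch of how the required DN count could follow from the chosen policy. It assumes a striped block group with k data units and m parity units is spread over k + m distinct DNs. The EcPolicy class, check_cluster helper, and the two example policies are hypothetical names for illustration only, not the HDFS API or the code in the patch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EcPolicy:
    """Hypothetical stand-in for an EC policy; not the HDFS class."""
    name: str
    data_units: int    # k: number of data cells per stripe
    parity_units: int  # m: number of parity cells per stripe

    def required_datanodes(self) -> int:
        # Assumption: each of the k + m internal blocks of a striped
        # block group is placed on a distinct DN.
        return self.data_units + self.parity_units


def check_cluster(policy: EcPolicy, live_datanodes: int) -> int:
    """Return the DN count the policy needs, or raise if the cluster is too small."""
    needed = policy.required_datanodes()
    if live_datanodes < needed:
        raise ValueError(
            f"{policy.name} needs {needed} DNs, only {live_datanodes} available"
        )
    return needed


# Example policies, chosen only for illustration.
RS_6_3 = EcPolicy("RS-6-3", data_units=6, parity_units=3)
RS_3_2 = EcPolicy("RS-3-2", data_units=3, parity_units=2)
```

With something like this, a benchmark could take the policy as a parameter and fail fast when the test cluster has too few DNs, instead of hard-coding the count for the default policy.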

> Erasure coding: a comprehensive I/O throughput benchmark tool
> -------------------------------------------------------------
>                 Key: HDFS-8968
>                 URL: https://issues.apache.org/jira/browse/HDFS-8968
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding, test
>    Affects Versions: 3.0.0
>            Reporter: Kai Zheng
>            Assignee: Rui Li
>             Fix For: 3.0.0
>         Attachments: HDFS-8968-HDFS-7285.1.patch, HDFS-8968-HDFS-7285.2.patch, HDFS-8968.3.patch, HDFS-8968.4.patch, HDFS-8968.5.patch
> We need a new benchmark tool to measure the throughput of client writes and reads, considering the following cases or factors:
> * 3-replica or striping;
> * write or read, stateful read or positional read;
> * which erasure coder;
> * striping cell size;
> * concurrent readers/writers using processes or threads.
> The tool should be easy to use and should avoid unnecessary impact from the local environment, such as the local disk.

This message was sent by Atlassian JIRA
