hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12412) Remove ErasureCodingWorker.stripedReadPool
Date Mon, 11 Sep 2017 23:17:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16162208#comment-16162208
] 

Andrew Wang commented on HDFS-12412:
------------------------------------

While I was looking at this a little more, I noticed that "dfs.datanode.ec.reconstruction.stripedblock.threads.size"
is named poorly. Could you do another JIRA to rename this to drop the ".size" suffix, for
consistency with other similar config keys that control the size of a thread pool?

It'd also be good to add a release note to this JIRA to point users at what keys to use for
tuning reconstruction performance. If you want to update the EC user docs as part of this
JIRA too, that'd also be good.

> Remove ErasureCodingWorker.stripedReadPool
> ------------------------------------------
>
>                 Key: HDFS-12412
>                 URL: https://issues.apache.org/jira/browse/HDFS-12412
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: erasure-coding
>    Affects Versions: 3.0.0-alpha3
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>              Labels: hdfs-ec-3.0-nice-to-have
>         Attachments: HDFS-12412.00.patch
>
>
> In {{ErasureCodingWorker}}, {{stripedReconstructionPool}} schedules the EC
recovery tasks, while {{stripedReadPool}} supplies the reader threads within each recovery task.
 We only need one of them to throttle the speed of the recovery process, because each EC recovery
task has a fixed number of source readers (e.g., 3 for RS(3,2)). And because of the findings
in HDFS-12044, the speed of EC recovery can be throttled by {{stripedReconstructionPool}}
with {{xmitsInProgress}}. 
> Moreover, keeping {{stripedReadPool}} makes it difficult for users to understand and calculate
the right balance between {{dfs.datanode.ec.reconstruction.stripedread.threads}}, {{dfs.datanode.ec.reconstruction.stripedblock.threads.size}}
and {{maxReplicationStreams}}.  For example, a small {{stripread.threads}} (compared to what
{{reconstruction.threads.size}} implies) will unnecessarily limit the speed of recovery,
which leads to a larger MTTR. 
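
As a rough illustration of the balance described above (a simplified model, not code from the patch or from {{ErasureCodingWorker}}): assume every recovery task needs a fixed number of reader threads and that all tasks draw them from one shared read pool. The class and method names below are hypothetical, and the pool sizes are made-up example values.

```java
// Simplified model of two thread pools throttling EC recovery.
// All names here are hypothetical; this is not Hadoop code.
class EcThrottleSketch {

    /**
     * Number of recovery tasks that can run with all of their source
     * readers active at once, given the two pool sizes.
     */
    static int tasksAtFullReadSpeed(int reconstructionThreads,
                                    int readThreads,
                                    int sourceReaders) {
        if (reconstructionThreads < 0 || readThreads < 0 || sourceReaders <= 0) {
            throw new IllegalArgumentException("invalid pool or reader count");
        }
        // The reconstruction pool caps the number of tasks; the read pool
        // caps how many of those tasks can read from every source in parallel.
        return Math.min(reconstructionThreads, readThreads / sourceReaders);
    }

    /** True when the read pool, not the reconstruction pool, is the limit. */
    static boolean readPoolIsBottleneck(int reconstructionThreads,
                                        int readThreads,
                                        int sourceReaders) {
        return readThreads < reconstructionThreads * sourceReaders;
    }

    public static void main(String[] args) {
        int sourceReaders = 3; // e.g. RS(3,2), as in the description above

        // 8 reconstruction threads but only 12 read threads:
        // only 4 tasks can read at full speed.
        System.out.println(tasksAtFullReadSpeed(8, 12, sourceReaders)); // 4
        System.out.println(readPoolIsBottleneck(8, 12, sourceReaders)); // true

        // 24 read threads match what 8 reconstruction threads imply.
        System.out.println(tasksAtFullReadSpeed(8, 24, sourceReaders)); // 8
        System.out.println(readPoolIsBottleneck(8, 24, sourceReaders)); // false
    }
}
```

Under this model the read pool only helps when it is at least (reconstruction threads x source readers), at which point it no longer throttles anything, which is the argument for keeping a single pool.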



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org