hadoop-hdfs-issues mailing list archives

From "Kai Zheng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12412) Remove ErasureCodingWorker.stripedReadPool
Date Sat, 09 Sep 2017 02:35:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16159682#comment-16159682 ]

Kai Zheng commented on HDFS-12412:
----------------------------------

Thanks Eddy for the ping. 

The idea of removing the striped read pool and reusing the reconstruction pool sounds good
to me. Given the size of the latter and the most commonly used erasure codec, we can roughly
estimate how many striped read threads are needed. We can also simplify the configuration
and the code.

So, as you said, you probably have an idea of how to reduce the recommended or default
value and how to validate the configured reconstruction pool size, assuming you know how
many concurrent reconstruction tasks are to be performed, and so on.

Less configuration with reasonable defaults would make this brand-new feature easier to use.
When needed, we can fine-tune and add more options later.

> Remove ErasureCodingWorker.stripedReadPool
> ------------------------------------------
>
>                 Key: HDFS-12412
>                 URL: https://issues.apache.org/jira/browse/HDFS-12412
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: erasure-coding
>    Affects Versions: 3.0.0-alpha3
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>
> {{ErasureCodingWorker}} uses {{stripedReconstructionPool}} to schedule the EC recovery
> tasks, and uses {{stripedReadPool}} for the reader threads in each recovery task. We only
> need one of them to throttle the speed of the recovery process, because each EC recovery
> task has a fixed number of source readers (i.e., 3 for RS(3,2)). And because of the findings
> in HDFS-12044, the speed of EC recovery can be throttled by {{stripedReconstructionPool}}
> with {{xmitsInProgress}}.
> Moreover, keeping {{stripedReadPool}} makes it difficult for users to understand and
> calculate the right balance between {{dfs.datanode.ec.reconstruction.stripedread.threads}},
> {{dfs.datanode.ec.reconstruction.stripedblock.threads.size}} and {{maxReplicationStreams}}.
> For example, a {{stripedread.threads}} value that is small compared to what
> {{reconstruction.threads.size}} implies will unnecessarily limit the speed of recovery,
> which leads to a longer MTTR.
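
The balance described in the issue can be illustrated with a little arithmetic. The sketch
below is not Hadoop code: the class name, method names, and the pool sizes 8 and 12 are
hypothetical, and it assumes a simplified model in which every running recovery task needs
all of its source readers at the same time. Only the figure of 3 source readers for RS(3,2)
comes from the issue text.

```java
/**
 * Illustrative model only (not part of Hadoop): relates the size of a
 * reconstruction pool to the size of a separate striped read pool.
 */
public final class EcReadPoolEstimate {
    private EcReadPoolEstimate() {}

    /** Read threads needed if every reconstruction thread runs a task at once. */
    static int requiredReadThreads(int reconstructionThreads, int sourceReadersPerTask) {
        if (reconstructionThreads <= 0 || sourceReadersPerTask <= 0) {
            throw new IllegalArgumentException("pool size and reader count must be positive");
        }
        return Math.multiplyExact(reconstructionThreads, sourceReadersPerTask);
    }

    /** Tasks that can read at full parallelism given both pool sizes. */
    static int effectiveConcurrentTasks(int reconstructionThreads, int readThreads,
                                        int sourceReadersPerTask) {
        if (reconstructionThreads <= 0 || readThreads < 0 || sourceReadersPerTask <= 0) {
            throw new IllegalArgumentException("invalid pool size or reader count");
        }
        return Math.min(reconstructionThreads, readThreads / sourceReadersPerTask);
    }

    public static void main(String[] args) {
        int sourceReaders = 3;          // RS(3,2): 3 source readers per task (from the issue)
        int reconstructionThreads = 8;  // hypothetical pool size, not a Hadoop default
        int readThreads = 12;           // hypothetical pool size, not a Hadoop default

        System.out.println("read threads implied by the reconstruction pool: "
            + requiredReadThreads(reconstructionThreads, sourceReaders));
        System.out.println("tasks that can read at full parallelism: "
            + effectiveConcurrentTasks(reconstructionThreads, readThreads, sourceReaders));
    }
}
```

Under these assumptions, a read pool of 12 lets only 4 of the 8 reconstruction threads read
at full parallelism, which is the kind of unintended bottleneck the issue describes. With a
single pool, the reader count follows from the codec and there is nothing to balance.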



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

