hadoop-hdfs-issues mailing list archives

From "Manoj Govindassamy (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-11412) Maintenance minimum replication config value allowable range should be {0 - DefaultReplication}
Date Thu, 02 Mar 2017 00:35:45 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891363#comment-15891363 ]

Manoj Govindassamy edited comment on HDFS-11412 at 3/2/17 12:34 AM:
--------------------------------------------------------------------

[~mingma],

bq. Maybe we can modify getMinReplicationToBeInMaintenance to return the lesser of {file replication
factor, minReplicationToBeInMaintenance}

This sounds good to me. It covers files whose block replication factor is lower than the
maintenance min, so they will not trigger unnecessary re-replication.  {{BlockManager#getMinMaintenanceStorageNum()}}
is now modified to return the minimum of the two values.

{{BlockManager#getExpectedLiveRedundancyNum()}} is a common routine used for reconstruction
work in addition to DecommissionManager. The current implementation of this routine looks good
to me.
* (A) For general reconstruction of a block, when there are no maintenance operations,
the expected live redundancy for the block should equal its block replication
factor.
* (B) When the blocks are on maintenance nodes, the expected live redundancy for the
block is the min of its block replication factor and maintenance min, that is, BlockManager#getMinMaintenanceStorageNum().
* BlockManager#getExpectedLiveRedundancyNum() should therefore be Max(A, B) to work for both
non-maintenance and maintenance operations. If it were set to Min(A, B), getExpectedLiveRedundancyNum()
would evaluate to Min(A, Min(block_repl, maint_min)), which becomes 0 whenever maintenance
min is 0 and can cause adverse effects.
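To make the Max versus Min argument concrete, here is a minimal, self-contained sketch. It is not the actual BlockManager code: the class and method names are hypothetical, and it models only the two inputs discussed above (block replication factor and maintenance min), ignoring live and maintenance replica counts.

```java
public class MaintenanceRedundancySketch {

    // (B) Hypothetical stand-in for BlockManager#getMinMaintenanceStorageNum():
    // the lesser of the block replication factor and the maintenance min.
    static int minMaintenanceStorageNum(int blockReplication, int maintenanceMin) {
        return Math.min(blockReplication, maintenanceMin);
    }

    // Max(A, B): A is the block replication factor, B is the maintenance value above.
    static int expectedLiveRedundancyUsingMax(int blockReplication, int maintenanceMin) {
        int a = blockReplication;
        int b = minMaintenanceStorageNum(blockReplication, maintenanceMin);
        return Math.max(a, b);
    }

    // Min(A, B): the alternative argued against in the comment.
    static int expectedLiveRedundancyUsingMin(int blockReplication, int maintenanceMin) {
        int a = blockReplication;
        int b = minMaintenanceStorageNum(blockReplication, maintenanceMin);
        return Math.min(a, b);
    }

    public static void main(String[] args) {
        int blockReplication = 3; // assumed example value
        int maintenanceMin = 0;   // maintenance min configured as 0

        // Prints 3: the block is still expected to be fully replicated.
        System.out.println("Max(A, B) = "
            + expectedLiveRedundancyUsingMax(blockReplication, maintenanceMin));
        // Prints 0: the expected live redundancy collapses to zero.
        System.out.println("Min(A, B) = "
            + expectedLiveRedundancyUsingMin(blockReplication, maintenanceMin));
    }
}
```

With a maintenance min of 0, the Min variant reports an expected live redundancy of 0 for every block, which is the adverse case described above.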

Can you please take a look at the latest patch and share your comments?



> Maintenance minimum replication config value allowable range should be {0 - DefaultReplication}
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-11412
>                 URL: https://issues.apache.org/jira/browse/HDFS-11412
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>         Attachments: HDFS-11412.01.patch, HDFS-11412.02.patch
>
>
> Currently the allowed value range for Maintenance Min Replication {{dfs.namenode.maintenance.replication.min}}
is 0 to {{dfs.namenode.replication.min}} (default=1). Users who do not want maintenance to affect
cluster performance may want a Maintenance Min Replication greater than 1, say
2. In the current design this is possible, but only after raising the NameNode-level
Block Min Replication to 2, which could increase
the overall latency of client writes.
> Technically speaking, we should allow Maintenance Min Replication to be in the range
0 to dfs.replication.max.
> * A config value of 0 is always available for users who do not need any availability/performance
during maintenance.
> * Performance-centric workloads can still get maintenance done without major disruption
by using a larger Maintenance Min Replication. Setting the upper limit to dfs.replication.max
could be overkill, since it could trigger the re-replication that Maintenance State is trying
to avoid. So we could allow {{dfs.namenode.maintenance.replication.min}} in the range
{{0 to dfs.replication}}.
> {noformat}
>     if (minMaintenanceR < 0) {
>       throw new IOException("Unexpected configuration parameters: "
>           + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY
>           + " = " + minMaintenanceR + " < 0");
>     }
>     if (minMaintenanceR > minR) {
>       throw new IOException("Unexpected configuration parameters: "
>           + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY
>           + " = " + minMaintenanceR + " > "
>           + DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY
>           + " = " + minR);
>     }
> {noformat}
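The proposed range of 0 to dfs.replication could be checked along the following lines. This is only a sketch of the idea, not the patch itself: the class name, method name, and the `defaultReplication` parameter are hypothetical, and the config keys are written as plain strings rather than DFSConfigKeys constants so that the example is self-contained.

```java
import java.io.IOException;

public class MaintenanceMinRangeCheck {

    // Hypothetical helper: accepts maintenance min values in [0, defaultReplication].
    static void checkMaintenanceMin(int minMaintenanceR, int defaultReplication)
            throws IOException {
        if (minMaintenanceR < 0) {
            throw new IOException("Unexpected configuration parameters: "
                + "dfs.namenode.maintenance.replication.min"
                + " = " + minMaintenanceR + " < 0");
        }
        // Upper bound is the default replication (dfs.replication), not
        // dfs.namenode.replication.min as in the existing check.
        if (minMaintenanceR > defaultReplication) {
            throw new IOException("Unexpected configuration parameters: "
                + "dfs.namenode.maintenance.replication.min"
                + " = " + minMaintenanceR + " > "
                + "dfs.replication"
                + " = " + defaultReplication);
        }
    }

    public static void main(String[] args) {
        int defaultReplication = 3; // assumed example value for dfs.replication
        for (int candidate : new int[] {0, 2, 3, 4}) {
            try {
                checkMaintenanceMin(candidate, defaultReplication);
                System.out.println(candidate + " accepted");
            } catch (IOException e) {
                System.out.println(candidate + " rejected: " + e.getMessage());
            }
        }
    }
}
```

Under this sketch a value of 2 is accepted without touching the NameNode-level block min replication, which is the case the description is trying to enable.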



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

