hadoop-hdfs-issues mailing list archives

From "Rakesh R (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-10885) [SPS]: Mover tool should not be allowed to run when Storage Policy Satisfier is on
Date Tue, 01 Nov 2016 05:06:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15624265#comment-15624265
] 

Rakesh R edited comment on HDFS-10885 at 11/1/16 5:06 AM:
----------------------------------------------------------

Thanks [~zhouwei] for the patch. I have a few comments, please take a look.

# IIUC, presently the StoragePolicySatisfier daemon deletes the path if another instance (Mover) is already running and then continues starting the SPS. I think it should instead respect the already running instance. To do this, SPS could use appending logic similar to {{NameNodeConnector#checkAndMarkRunning()}}; we could probably reuse that code by moving it into a utility class, or in some other way if possible, roughly along these lines:
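This is only a rough sketch of the intent; the class, method, and variable names below are illustrative and not taken verbatim from the patch or from {{NameNodeConnector}}:
{code}
// Sketch only: mark this SPS instance as running by exclusively creating the
// mover ID file, in the spirit of NameNodeConnector#checkAndMarkRunning().
// If Mover (or another SPS) already holds the file, fail instead of deleting it.
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class SpsRunningMarker {  // illustrative holder class
  static OutputStream checkAndMarkRunning(FileSystem fs, Path idPath)
      throws IOException {
    try {
      // Exclusive create (overwrite = false): throws if the ID file exists,
      // i.e. another instance is already running and must be respected.
      FSDataOutputStream out = fs.create(idPath, false);
      out.writeBytes(InetAddress.getLocalHost().getHostName());
      out.hflush();
      return out; // keep the stream open as the "running" marker
    } catch (IOException e) {
      throw new IOException(
          "Another Mover/SPS instance already holds " + idPath, e);
    }
  }
}
{code}
The point is only that SPS should back off when the ID file is already held, rather than deleting it.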
# I can see a {{while (true)}} loop in {{StoragePolicySatisfier#run()}}. Did you mean to iterate until the Mover completes and then have SPS create the mover path and start automatically? If yes, there could be a use case where the admin wants to run the Mover tool multiple times, but now, immediately after the first Mover finishes, SPS will automatically acquire the lock by creating the path and start. The admin then loses the chance to run Mover again, right?
(1) One idea is to simply throw an error and fail SPS startup, OR (2) another idea is to pause SPS until the admin explicitly resumes it (see the sketch below). Later we could provide a facility to dynamically pause/resume the SPS daemon so that the admin can take care of this service. As I mentioned in my previous comment in this jira, I'm OK with implementing dynamic pause/resume via another jira.
[~umamaheswararao], [~drankye], any thoughts are welcome. Thanks!
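Just to illustrate idea (2), a rough sketch; the {{paused}} flag, {{resume()}} hook and {{moverIdPathExists()}} check are hypothetical and only show the intent, they are not part of the current patch:
{code}
// Sketch only: instead of spinning in while (true) until Mover finishes,
// SPS parks itself when it sees the mover ID path and waits for an explicit
// admin resume before it tries to take over.
private volatile boolean paused = false;

@Override
public void run() {
  if (moverIdPathExists()) {     // hypothetical check for the mover ID file
    paused = true;               // do not grab the lock automatically
    LOG.info("Mover is running; SPS is paused until the admin resumes it.");
  }
  synchronized (this) {
    while (paused) {
      try {
        wait();                  // parked until resume() notifies us
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
    }
  }
  // ... now create the mover ID path and start satisfying storage policies
}

/** Hypothetical admin hook (e.g. wired to a dfsadmin command) to resume SPS. */
public synchronized void resume() {
  paused = false;
  notifyAll();
}
{code}
This way the admin can still re-run Mover as many times as needed before explicitly resuming SPS.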
# How about keeping {{MOVER_ID_PATH}} in the {{HdfsServerConstants.java}} file rather than duplicating it?
{code}
     public static final Path MOVER_ID_PATH = new Path("/system/mover.id");
{code}
# In StoragePolicySatisfier.java, please replace the {{private OutputStream out;}} field with an inline local variable, like below:
{code}
     OutputStream out = createMarkRunning();
{code}
# Can we avoid the nested {{if}}?
{code}
+    if (storagePolicyEnabled) {
+      if (spsEnabled) {
{code}
Instead of nesting, how about something like this (with the warn moved into the else branch, since it should only be logged when SPS is not started):
{code}
    if (storagePolicyEnabled && spsEnabled) {
      sps = new StoragePolicySatisfier(namesystem,
          storageMovementNeeded, this, conf);
    } else {
      LOG.warn(
          "Failed to start StoragePolicySatisfier"
              + " since {} set to {} and {} set to {}.",
          DFS_STORAGE_POLICY_ENABLED_KEY, storagePolicyEnabled,
          DFS_NAMENODE_SPS_ENABLED_KEY, spsEnabled);
    }
{code}
# It seems the test case failures are related. I'm listing a few of them; please go through all the failures: {{TestStorageMover.testMigrateFileToArchival}}, {{TestHdfsConfigFields.testCompareConfigurationClassAgainstXml}}, etc. Also, please take care of the checkstyle warnings as well. Thanks!
# Just a suggestion: use the error message {{StoragePolicySatisfier daemon is already running}} instead of {{SPS is already running}} to make it clearer to users.



> [SPS]: Mover tool should not be allowed to run when Storage Policy Satisfier is on
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-10885
>                 URL: https://issues.apache.org/jira/browse/HDFS-10885
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>    Affects Versions: HDFS-10285
>            Reporter: Wei Zhou
>            Assignee: Wei Zhou
>             Fix For: HDFS-10285
>
>         Attachments: HDFS-10800-HDFS-10885-00.patch, HDFS-10800-HDFS-10885-01.patch,
HDFS-10800-HDFS-10885-02.patch, HDFS-10885-HDFS-10285.03.patch, HDFS-10885-HDFS-10285.04.patch,
HDFS-10885-HDFS-10285.05.patch
>
>
> These two cannot run at the same time, to avoid conflicts and fighting with each other.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

