hadoop-hdfs-issues mailing list archives

From "Weiwei Yang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-11740) Ozone: Differentiate time interval for different DatanodeStateMachine state tasks
Date Thu, 04 May 2017 15:06:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996510#comment-15996510 ]

Weiwei Yang edited comment on HDFS-11740 at 5/4/17 3:05 PM:
------------------------------------------------------------

Hi [~anu]

Thanks for your thoughtful comments, I appreciate them. Please see my answers below.

Fixed Heartbeat - Pros:

bq. Simple to understand and write code. We are able to write good error messages like this...

This doesn't change. I tested it on my cluster, and it still shows the same message as before.

bq. Fewer knobs to adjust – Since init, version and register are three states – we are
optimizing the first 90 seconds of a datanodes life. Since datanodes are very long running
processes, does this optimization matter?

I think it matters. There will be more states; if every state transition sleeps for a fixed interval
(which is currently the interval for node heartbeats to SCM), it may slow down the actual work.
For example, if in the future we want to support decommissioning a datanode from SCM, then once
that is done we transition the state to decommissioned. Decommissioning may take some time, and a
client waiting on it won't be happy if it has to wait another 30s just for the state to change.
Right now is a good time to make this change, because there aren't many states yet and it is easy to do.

bq. If that retry is happening, let us say one SCM is dead or network issue – we don't want
the scheduler to be running the next task immediately. We want some quiet period since this
is an admin task – and we should not be consuming too much resources. I am worried that
RPC retry will happen till we time out and then due to this

This is true. If a task fails, I can set the interval to something else and ask the scheduler
to run the next task after that delay. This can be done within the current patch; I will show
it in the v3 patch.
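As a minimal sketch of that idea (the class name {{RetryDelay}} and its constants are made up for illustration; they are not from the patch):

```java
// Hypothetical helper: choose the delay before scheduling the next state
// task. On success the next task runs immediately; after a failure we
// back off for a quiet period so a dead SCM or a network issue does not
// turn into a tight RPC-retry loop.
public class RetryDelay {
    static final long TASK_INTERVAL_MS = 0L;        // normal case: no lag
    static final long FAILURE_BACKOFF_MS = 30_000L; // quiet period after a failure

    static long nextDelayMillis(boolean lastTaskFailed) {
        return lastTaskFailed ? FAILURE_BACKOFF_MS : TASK_INTERVAL_MS;
    }

    public static void main(String[] args) {
        System.out.println(nextDelayMillis(false)); // prints 0
        System.out.println(nextDelayMillis(true));  // prints 30000
    }
}
```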

bq. if you want to support this feature – may I suggest that we make changes in DatanodeStates...

I tried that approach, but it did not work out very well for me. The interval setting is better
kept at the {{end point task}} level, because different tasks may require different intervals
to run. Using {{ScheduledExecutorService}} as the executor service lets the state machine
schedule tasks at the required interval when necessary, which is much more convenient than {{sleep}}.
The behavior change is as follows.

Before the patch:
# Load the state task according to the current datanode state
# Execute this state task
# Wait until the task returns; the result indicates the desired next state
# Transition to the next state if necessary
# Sleep for a fixed interval
# Go back to 1 for the next loop

After the patch:
# Load the state task according to the current datanode state
# Schedule the task to execute either immediately or some time later, according to the task interval
# Wait until the task returns; the result indicates the desired next state
# Transition to the next state if necessary
# Go back to 1 for the next loop
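The loop after the patch can be sketched roughly like this (a simplified illustration with made-up state names and transitions, not the real {{DatanodeStateMachine}} code):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class StateLoopSketch {
    // Hypothetical per-task interval: only the heartbeat task waits the
    // full SCM heartbeat interval; other state tasks run back-to-back.
    static long intervalMillisFor(String state) {
        return "HEARTBEAT".equals(state) ? 30_000L : 0L;
    }

    // Hypothetical transitions: INIT -> VERSION -> REGISTER -> HEARTBEAT.
    static String nextState(String s) {
        switch (s) {
            case "INIT":     return "VERSION";
            case "VERSION":  return "REGISTER";
            case "REGISTER": return "HEARTBEAT";
            default:         return s;
        }
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        String state = "INIT";
        while (!"HEARTBEAT".equals(state)) {
            final String current = state;
            // Schedule the state task after its own interval (0 ms here),
            // instead of sleeping a fixed interval between every state.
            ScheduledFuture<String> next = executor.schedule(
                () -> nextState(current), intervalMillisFor(current), TimeUnit.MILLISECONDS);
            state = next.get(); // wait for the task; the result is the next state
            System.out.println("transitioned to " + state);
        }
        executor.shutdown();
    }
}
```

With per-task intervals of 0 ms, the early states run immediately, so the 90s startup lag disappears while the heartbeat task keeps its 30s cadence.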

Please let me know your thoughts.

Thanks



> Ozone: Differentiate time interval for different DatanodeStateMachine state tasks
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-11740
>                 URL: https://issues.apache.org/jira/browse/HDFS-11740
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>         Attachments: HDFS-11740-HDFS-7240.001.patch, HDFS-11740-HDFS-7240.002.patch, statemachine_1.png, statemachine_2.png
>
>
> Currently the datanode state machine transitions between tasks at a fixed time interval,
> defined by {{ScmConfigKeys#OZONE_SCM_HEARTBEAT_INTERVAL_SECONDS}}; the default value is 30s.
> Once a datanode is started, it needs 90s before transitioning to the {{Heartbeat}} state, and
> such a long lag is not necessary. I propose to improve the time-interval handling: it seems
> only the heartbeat task needs to be scheduled at the {{OZONE_SCM_HEARTBEAT_INTERVAL_SECONDS}}
> interval; the rest should run without any lag.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

