hadoop-common-issues mailing list archives

From "Andras Bokor (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HADOOP-13238) pid handling is failing on secure datanode
Date Fri, 21 Apr 2017 11:57:04 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978504#comment-15978504 ]

Andras Bokor edited comment on HADOOP-13238 at 4/21/17 11:56 AM:
-----------------------------------------------------------------

[~aw]

The root cause here is that JSVC deletes its own pid file, the one passed to it with the {{-pidfile}}
option, so the {{cat}} fails after stop.
Honestly, I feel HADOOP-12364 solves a bug in an external monitoring tool rather than in Hadoop.
That is a pretty rare case (I cannot even imagine how it can happen), so I think it is enough
here to check whether the pid file exists. If it does not, JSVC has already deleted the
file, so we do not need to do the check and delete.
In addition, the error message shows up twice because both {{hadoop_stop_daemon}} and {{hadoop_stop_secure_daemon}}
do the same check and delete the same pid file. The second one can be removed from the code.
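
To illustrate the existence check described above, here is a minimal sketch. It is not the actual patch or Hadoop's real function: the name {{remove_pidfile_if_unchanged}} and its two arguments are hypothetical and exist only for this example.

```shell
# Hypothetical helper (not taken from the Hadoop scripts): delete a pid file
# only if it still exists and still records the pid we expect.
remove_pidfile_if_unchanged() {
  local pidfile="$1"
  local expected_pid="$2"
  local cur_pid

  # JSVC may already have removed the file it was given via -pidfile.
  # In that case there is nothing to compare and nothing to delete,
  # so return quietly instead of letting cat fail.
  if [ ! -f "${pidfile}" ]; then
    return 0
  fi

  cur_pid=$(cat "${pidfile}")
  if [ "${cur_pid}" = "${expected_pid}" ]; then
    rm -f "${pidfile}"
  else
    # The file now belongs to a different process; leave it alone.
    echo "WARNING: pid has changed, skip deleting pid file" >&2
    return 1
  fi
}
```

With a guard like this, a pid file that JSVC already removed produces neither the {{cat}} error nor the misleading "pid has changed" warning.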

After my patch the test still passes. {{hadoop_stop_daemon.bats}} and {{hadoop_stop_secure_daemon.bats}}
run the same test, so the first one seems unnecessary.
Also, I added a new test to prove that the pid file is deleted when everything goes well.
{code}abokor$ bats hadoop_stop_secure_daemon.bats
 ✓ hadoop_stop_secure_daemon_when_pid_file_changes
 ✓ hadoop_stop_secure_daemon_deletes_pid_file

2 tests, 0 failures{code}

Output after patch:
{code}root@abokor-practice-5:/grid/0# hadoop-3.0.0-alpha2/sbin/start-dfs.sh
Starting namenodes on [abokor-practice-2.openstacklocal]
Starting datanodes
Starting secondary namenodes [abokor-practice-5]
root@abokor-practice-5:/grid/0# hadoop-3.0.0-alpha2/sbin/stop-dfs.sh
Stopping namenodes on [abokor-practice-2.openstacklocal]
Stopping datanodes
Stopping secondary namenodes [abokor-practice-5]{code}


was (Author: boky01):
[~aw]

The root cause here is that JSVC deletes the pid file that was passed to it with the {{-pidfile}}
option, so the {{cat}} fails after stop.
Honestly, I feel HADOOP-12364 solves a bug in an external monitoring tool rather than in Hadoop.
That is a pretty rare case (I cannot even imagine how it can happen), so I think it is enough
here to check whether the pid file exists. If it does not, JSVC has already deleted the
file, so we do not need to do the check and delete.
In addition, the error message shows up twice because both {{hadoop_stop_daemon.bats}} and
{{hadoop_stop_secure_daemon.bats}} do the same check and delete the same pid file. The second
one can be removed from the code.

After my patch the test still passes. {{hadoop_stop_daemon.bats}} and {{hadoop_stop_secure_daemon.bats}}
run the same test, so the first one seems unnecessary.
{code}abokor$ bats hadoop_stop_secure_daemon.bats
 ✓ hadoop_stop_secure_daemon

1 test, 0 failures{code}

> pid handling is failing on secure datanode
> ------------------------------------------
>
>                 Key: HADOOP-13238
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13238
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: scripts, security
>            Reporter: Allen Wittenauer
>            Assignee: Andras Bokor
>
> {code}
> hdfs --daemon stop datanode
> cat: /home/hadoop/H/pids/hadoop-hdfs-root-datanode.pid: No such file or directory
> WARNING: pid has changed for datanode, skip deleting pid file
> cat: /home/hadoop/H/pids/hadoop-hdfs-root-datanode.pid: No such file or directory
> WARNING: daemon pid has changed for datanode, skip deleting daemon pid file
> {code}


