ambari-issues mailing list archives

From "Aravindan Vijayan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AMBARI-18191) "Restart all required" services operation failed at Metrics Collector since HDFS was not yet up
Date Thu, 18 Aug 2016 21:12:21 GMT

     [ https://issues.apache.org/jira/browse/AMBARI-18191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aravindan Vijayan updated AMBARI-18191:
---------------------------------------
    Fix Version/s:     (was: 2.4.0)
                   2.5.0

> "Restart all required" services operation failed at Metrics Collector since HDFS was not yet up
> -----------------------------------------------------------------------------------------------
>
>                 Key: AMBARI-18191
>                 URL: https://issues.apache.org/jira/browse/AMBARI-18191
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-metrics
>    Affects Versions: 2.4.0
>            Reporter: Aravindan Vijayan
>            Assignee: Aravindan Vijayan
>            Priority: Blocker
>             Fix For: 2.5.0
>
>         Attachments: AMBARI-18191.patch
>
>
> ambari-server --hash
> 4017036da951a10f519a578de934308cf866ba50
> *Steps*
> # Deploy HDP-2.3.6 cluster with Ambari 2.2.2.0 (AMS is configured in distributed mode)
> # Upgrade Ambari to 2.4.0.0 and let it complete
> # Open Ambari web UI and hit "Restart all required" under Actions menu
> *Result*
> The operation fails while restarting Metrics Collector, because it attempts a WebHDFS call while HDFS is not yet started:
> {code}
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/metrics_collector.py", line 148, in <module>
>     AmsCollector().execute()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
>     method(env)
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 725, in restart
>     self.start(env)
>   File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/metrics_collector.py", line 46, in start
>     self.configure(env, action = 'start') # for security
>   File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/metrics_collector.py", line 41, in configure
>     hbase('master', action)
>   File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
>     return fn(*args, **kwargs)
>   File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/hbase.py", line 213, in hbase
>     dfs_type=params.dfs_type
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 459, in action_create_on_execute
>     self.action_delayed("create")
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 456, in action_delayed
>     self.get_hdfs_resource_executor().action_delayed(action_name, self)
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 256, in action_delayed
>     self._set_mode(self.target_status)
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 363, in _set_mode
>     self.util.run_command(self.main_resource.resource.target, 'SETPERMISSION', method='PUT', permission=self.mode, assertable_result=False)
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 179, in run_command
>     _, out, err = get_user_call_output(cmd, user=self.run_user, logoutput=self.logoutput, quiet=False)
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/get_user_call_output.py", line 61, in get_user_call_output
>     raise Fail(err_msg)
> resource_management.core.exceptions.Fail: Execution of 'curl -sS -L -w '%{http_code}' -X PUT --negotiate -u : 'http://vsharma-eu-mt-5.openstacklocal:50070/webhdfs/v1/user/ams/hbase?op=SETPERMISSION&user.name=hdfs&permission=775' 1>/tmp/tmp8twcZt 2>/tmp/tmpLPih9a' returned 7. curl: (7) couldn't connect to host
> 401
> {code}
> Afterwards, restarting HDFS individually first and then hitting "Restart all required" succeeded.
> It looks like the restart order across hosts is incorrect, so the services that others depend on (here, HDFS) are not brought up first.
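
The ordering problem described above can be illustrated with a small sketch. This is not the content of AMBARI-18191.patch and not Ambari's actual restart logic. It is a standalone example, assuming Python 3.9+ (for `graphlib`) and a hypothetical dependency map `DEPENDS_ON` in which Ambari Metrics depends on HDFS, as it does here because AMS runs in distributed mode. The helper name `restart_order` is likewise made up for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map (an assumption for illustration only):
# each service lists the services that must be running before it starts.
DEPENDS_ON = {
    "HDFS": [],
    "AMBARI_METRICS": ["HDFS"],  # distributed-mode AMS keeps HBase data on HDFS
}


def restart_order(depends_on):
    """Return the services in an order where every dependency comes first."""
    # TopologicalSorter takes a mapping of node -> predecessors and
    # static_order() yields each node only after all of its predecessors.
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for service in restart_order(DEPENDS_ON):
        print("restart", service)
```

With such an ordering, the WebHDFS `SETPERMISSION` call made while starting Metrics Collector would only be attempted after HDFS has been restarted, which matches the manual workaround reported above.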



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
