ambari-dev mailing list archives

From "Jonathan Hurley" <jhur...@hortonworks.com>
Subject Re: Review Request 36645: Unable to Start NameNode in HA Mode On HDP 2.0
Date Wed, 22 Jul 2015 12:38:39 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36645/#review92588
-----------------------------------------------------------

Ship It!

- Jonathan Hurley


On July 21, 2015, 1:12 p.m., Dmitro Lisnichenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36645/
> -----------------------------------------------------------
> 
> (Updated July 21, 2015, 1:12 p.m.)
> 
> 
> Review request for Ambari, Jonathan Hurley and Vitalyi Brodetskyi.
> 
> 
> Bugs: AMBARI-12374
>     https://issues.apache.org/jira/browse/AMBARI-12374
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> When starting an HA NameNode cluster on HDP 2.0, the following error is seen:
> 
> {code}
> 2015-07-07 16:02:56,371 - Getting jmx metrics from NN failed. URL: http://c6401.ambari.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/jmx.py", line 41, in get_value_from_jmx
>     return data_dict["beans"][0][property]
> IndexError: list index out of range
> 2015-07-07 16:02:56,396 - Getting jmx metrics from NN failed. URL: http://c6402.ambari.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/jmx.py", line 41, in get_value_from_jmx
>     return data_dict["beans"][0][property]
> IndexError: list index out of range
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 316, in <module>
>     NameNode().execute()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 216, in execute
>     method(env)
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 81, in start
>     namenode(action="start", rolling_restart=rolling_restart, env=env)
>   File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
>     return fn(*args, **kwargs)
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 141, in namenode
>     create_hdfs_directories(is_active_namenode_cmd)
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 198, in create_hdfs_directories
>     only_if=check
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 157, in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 390, in action_create_on_execute
>     self.action_delayed("create")
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 387, in action_delayed
>     self.get_hdfs_resource_executor().action_delayed(action_name, self)
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 239, in action_delayed
>     main_resource.resource.security_enabled, main_resource.resource.logoutput)
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 126, in __init__
>     security_enabled, run_user)
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/namenode_ha_utils.py", line 113, in get_property_for_active_namenode
>     raise Fail("There is no active namenodes.")
> resource_management.core.exceptions.Fail: There is no active namenodes.
> {code}
> 
> Although the NameNode does actually start, a failure is recorded in the request, stopping the rest of the cluster from coming up. This is probably because the JMX properties for Active and Standby NameNode are different in HDP 2.0 vs HDP 2.1+:
> 
> {code:title=active jmx}
> {
>     "name" : "Hadoop:service=NameNode,name=FSNamesystem",
>     "modelerType" : "FSNamesystem",
>     "tag.Context" : "dfs",
>     "tag.HAState" : "active",
> {code}
> 
> {code:title=standby jmx}
> {
>     "name" : "Hadoop:service=NameNode,name=FSNamesystem",
>     "modelerType" : "FSNamesystem",
>     "tag.Context" : "dfs",
>     "tag.HAState" : "standby",
> {code}
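
A minimal sketch of the idea described above, not the patch under review: if the NameNodeStatus query returns an empty "beans" list (which is what produces the IndexError in the traceback), fall back to the "tag.HAState" field of the FSNamesystem bean shown in the JMX excerpts. The function names mirror the traceback but the bodies are hypothetical, the "State" property name on the NameNodeStatus bean is an assumption, and the functions take the JMX response text directly so the sketch runs without a cluster.

```python
import json


def get_value_from_jmx(jmx_text, prop):
    """Return `prop` from the first bean, or None when no bean matched.

    Hypothetical variant of the helper in the traceback: it checks for an
    empty "beans" list instead of indexing [0] unconditionally.
    """
    beans = json.loads(jmx_text).get("beans", [])
    if not beans:
        return None
    return beans[0].get(prop)


def get_ha_state(status_jmx_text, fsnamesystem_jmx_text):
    """Return 'active', 'standby', or None if neither bean reports a state.

    Assumption: newer stacks expose the state as "State" on the
    NameNodeStatus bean; the FSNamesystem bean's "tag.HAState" (shown in
    the excerpts above) is used as the fallback.
    """
    state = get_value_from_jmx(status_jmx_text, "State")
    if state is None:
        state = get_value_from_jmx(fsnamesystem_jmx_text, "tag.HAState")
    return state
```

With an empty NameNodeStatus response and an FSNamesystem bean tagged "standby", get_ha_state returns "standby" rather than raising.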
> 
> 
> Diffs
> -----
> 
>   ambari-common/src/main/python/resource_management/libraries/functions/namenode_ha_utils.py 247d6e9 
>   ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/journalnode_upgrade.py 5e54593 
>   ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/namenode_ha_state.py f3185bf 
>   ambari-server/src/test/python/stacks/2.0.6/HDFS/test_journalnode.py e5da966 
> 
> Diff: https://reviews.apache.org/r/36645/diff/
> 
> 
> Testing
> -------
> 
> The following test fails, but the same test also fails on trunk; a JIRA has been opened for it.
> 
> ======================================================================
> FAIL: test_attribute_environment_non_root (TestExecuteResource.TestExecuteResource)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/media/development/work/review_ambari/ambari-common/src/test/python/mock/mock.py", line 1199, in patched
>     return func(*args, **keywargs)
>   File "/media/development/work/review_ambari/ambari-agent/src/test/python/resource_management/TestExecuteResource.py", line 199, in test_attribute_environment_non_root
>     self.assertEqual(popen_mock.call_args_list[0][0][0], expected_command)
> AssertionError: Lists differ: ['/bin/bash', '--login', '--no... != ['/bin/bash', '--login', '--no...
> 
> First differing element 4:
> ambari-sudo.sh su test_user -l -s /bin/bash -c 'export  PATH='"'"'/home/i/.pyenv/shims:/home/i/.pyenv/bin:/media/development/environment/alternatives/jdk7/bin:/media/development/environment/alternatives/maven31/bin:/home/i/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/media/development/work/irobot/snippets/bin/:~/.nvm/v0.10.31/bin/:~/.nvm/v0.10.31/lib/node_modules/brunch/bin/:/media/development/environment/android-sdk-linux/tools:/media/development/environment/android-sdk-linux/platform-tools:/home/i/bin:/media/development/environment/alternatives/scala/bin::/bin'"'"' JAVA_HOME=/test/java/home ; echo "1"'
> ambari-sudo.sh su test_user -l -s /bin/bash -c 'export  PATH=/home/i/.pyenv/shims:/home/i/.pyenv/bin:/media/development/environment/alternatives/jdk7/bin:/media/development/environment/alternatives/maven31/bin:/home/i/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/media/development/work/irobot/snippets/bin/:~/.nvm/v0.10.31/bin/:~/.nvm/v0.10.31/lib/node_modules/brunch/bin/:/media/development/environment/android-sdk-linux/tools:/media/development/environment/android-sdk-linux/platform-tools:/home/i/bin:/media/development/environment/alternatives/scala/bin::/bin JAVA_HOME=/test/java/home ; echo "1"'
> 
> Diff is 2026 characters long. Set self.maxDiff to None to see it.
> 
> ----------------------------------------------------------------------
> Ran 410 tests in 6.631s
> 
> FAILED (failures=1)
> [INFO] ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO] 
> [INFO] Ambari Views ...................................... SUCCESS [2.572s]
> [INFO] Ambari Metrics Common ............................. SUCCESS [1.153s]
> [INFO] Ambari Server ..................................... SUCCESS [42.835s]
> [INFO] Ambari Agent ...................................... FAILURE [8.186s]
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD FAILURE
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 55.480s
> [INFO] Finished at: Tue Jul 21 20:05:05 EEST 2015
> [INFO] Final Memory: 61M/1239M
> [INFO] ------------------------------------------------------------------------
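
The two commands in the failing assertion above differ only in whether the PATH value is wrapped in '"'"' sequences. One possible explanation, offered as a guess and not as a diagnosis: the PATH on the build machine contains "~", and a bash quoting helper that only leaves a conservative character set unquoted would then quote the value, which shows up as '"'"' once the whole command is quoted again for "su -c". The sketch below illustrates that nesting with Python 3's shlex.quote as a stand-in; wrap_for_su is a hypothetical name, and Ambari's own quoting helper may behave differently.

```python
import shlex


def wrap_for_su(path_value):
    """Build a nested 'su -c' command line, quoting at both levels.

    shlex.quote is a stand-in (assumption) for the project's own helper.
    It returns the value unchanged when every character is "safe" and
    wraps it in single quotes otherwise; "~" is not in its safe set.
    """
    inner = 'export PATH={0} ; echo "1"'.format(shlex.quote(path_value))
    # Quoting the inner command again turns each inner ' into '"'"'.
    return "su test_user -l -s /bin/bash -c {0}".format(shlex.quote(inner))


if __name__ == "__main__":
    print(wrap_for_su("/usr/bin:/bin"))
    print(wrap_for_su("/usr/bin:~/.nvm/bin"))
```

If this is indeed the cause, the test's expected command would only match on machines whose PATH contains no characters that trigger quoting.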
> 
> 
> Thanks,
> 
> Dmitro Lisnichenko
> 
>

