ambari-dev mailing list archives

From "Dmitro Lisnichenko" <dlysniche...@hortonworks.com>
Subject Review Request 36645: Unable to Start NameNode in HA Mode On HDP 2.0
Date Tue, 21 Jul 2015 17:10:27 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36645/
-----------------------------------------------------------

Review request for Ambari, Jonathan Hurley and Vitalyi Brodetskyi.


Bugs: AMBARI-12374
    https://issues.apache.org/jira/browse/AMBARI-12374


Repository: ambari


Description
-------

When starting an HA NameNode cluster on HDP 2.0, the following error is seen:

{code}
2015-07-07 16:02:56,371 - Getting jmx metrics from NN failed. URL: http://c6401.ambari.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/jmx.py", line 41, in get_value_from_jmx
    return data_dict["beans"][0][property]
IndexError: list index out of range
2015-07-07 16:02:56,396 - Getting jmx metrics from NN failed. URL: http://c6402.ambari.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/jmx.py", line 41, in get_value_from_jmx
    return data_dict["beans"][0][property]
IndexError: list index out of range
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 316, in <module>
    NameNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 216, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 81, in start
    namenode(action="start", rolling_restart=rolling_restart, env=env)
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 141, in namenode
    create_hdfs_directories(is_active_namenode_cmd)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 198, in create_hdfs_directories
    only_if=check
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 157, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 390, in action_create_on_execute
    self.action_delayed("create")
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 387, in action_delayed
    self.get_hdfs_resource_executor().action_delayed(action_name, self)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 239, in action_delayed
    main_resource.resource.security_enabled, main_resource.resource.logoutput)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 126, in __init__
    security_enabled, run_user)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/namenode_ha_utils.py", line 113, in get_property_for_active_namenode
    raise Fail("There is no active namenodes.")
resource_management.core.exceptions.Fail: There is no active namenodes.
{code}
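
The IndexError above shows that the "beans" list in the JMX response was empty for the
NameNodeStatus query on both hosts. As a minimal sketch of a guarded lookup (not the code in
this patch), the helper below returns None instead of raising when no bean matches. The
function name get_value_from_beans and the "State" property name are assumptions made for
illustration.

```python
import json


def get_value_from_beans(jmx_json, prop):
    """Return `prop` from the first bean of a JMX JSON reply, or None.

    Hypothetical helper: it mirrors the data_dict["beans"][0][property]
    lookup from the traceback, but tolerates an empty or missing list.
    """
    data = json.loads(jmx_json)
    beans = data.get("beans") or []
    if not beans:
        # No bean matched the query, so there is nothing to read.
        return None
    return beans[0].get(prop)


# An empty reply no longer raises IndexError.
print(get_value_from_beans('{"beans": []}', "State"))
```

Returning None leaves it to the caller to decide whether a missing bean is fatal or whether
another bean should be queried instead.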

Although the NameNode does actually start, a failure is recorded in the request, which stops
the rest of the cluster from coming up. This is probably because the JMX properties for the
Active and Standby NameNodes differ between HDP 2.0 and HDP 2.1+:

{code:title=active jmx}
{
    "name" : "Hadoop:service=NameNode,name=FSNamesystem",
    "modelerType" : "FSNamesystem",
    "tag.Context" : "dfs",
    "tag.HAState" : "active",
{code}

{code:title=standby jmx}
{
    "name" : "Hadoop:service=NameNode,name=FSNamesystem",
    "modelerType" : "FSNamesystem",
    "tag.Context" : "dfs",
    "tag.HAState" : "standby",
{code}
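
One way to cope with both layouts is to try the NameNodeStatus bean first and fall back to
the "tag.HAState" attribute of the FSNamesystem bean shown above. The following is only a
sketch under that assumption, not the change made in namenode_ha_utils.py. The function
names, the "State" property name and the sample dictionaries are hypothetical, and the
responses are passed in as already-parsed dictionaries so no HTTP call is involved.

```python
def first_bean_value(jmx_response, prop):
    """Return `prop` from the first bean, or None if there is no bean."""
    beans = jmx_response.get("beans") or []
    return beans[0].get(prop) if beans else None


def resolve_ha_state(status_response, fsnamesystem_response):
    """Pick the HA state from whichever bean exposes it.

    status_response:       parsed reply for the NameNodeStatus query
    fsnamesystem_response: parsed reply for the FSNamesystem query
    """
    # Assumed property name on the NameNodeStatus bean.
    state = first_bean_value(status_response, "State")
    if state is None:
        # Fall back to the attribute shown in the JMX excerpts above.
        state = first_bean_value(fsnamesystem_response, "tag.HAState")
    return state


# Sample data shaped like the failing case: NameNodeStatus returns no beans.
empty_status = {"beans": []}
fsnamesystem = {
    "beans": [
        {
            "name": "Hadoop:service=NameNode,name=FSNamesystem",
            "modelerType": "FSNamesystem",
            "tag.Context": "dfs",
            "tag.HAState": "active",
        }
    ]
}
print(resolve_ha_state(empty_status, fsnamesystem))  # active
```

If neither bean yields a value the function returns None, so the caller can still raise
the "no active namenodes" failure once every host has been checked.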


Diffs
-----

  ambari-common/src/main/python/resource_management/libraries/functions/namenode_ha_utils.py
247d6e9 
  ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/journalnode_upgrade.py
5e54593 
  ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/namenode_ha_state.py
f3185bf 
  ambari-server/src/test/python/stacks/2.0.6/HDFS/test_journalnode.py e5da966 

Diff: https://reviews.apache.org/r/36645/diff/


Testing
-------

======================================================================
FAIL: test_attribute_environment_non_root (TestExecuteResource.TestExecuteResource)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/media/development/work/review_ambari/ambari-common/src/test/python/mock/mock.py", line 1199, in patched
    return func(*args, **keywargs)
  File "/media/development/work/review_ambari/ambari-agent/src/test/python/resource_management/TestExecuteResource.py", line 199, in test_attribute_environment_non_root
    self.assertEqual(popen_mock.call_args_list[0][0][0], expected_command)
AssertionError: Lists differ: ['/bin/bash', '--login', '--no... != ['/bin/bash', '--login', '--no...

First differing element 4:
ambari-sudo.sh su test_user -l -s /bin/bash -c 'export  PATH='"'"'/home/i/.pyenv/shims:/home/i/.pyenv/bin:/media/development/environment/alternatives/jdk7/bin:/media/development/environment/alternatives/maven31/bin:/home/i/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/media/development/work/irobot/snippets/bin/:~/.nvm/v0.10.31/bin/:~/.nvm/v0.10.31/lib/node_modules/brunch/bin/:/media/development/environment/android-sdk-linux/tools:/media/development/environment/android-sdk-linux/platform-tools:/home/i/bin:/media/development/environment/alternatives/scala/bin::/bin'"'"'
JAVA_HOME=/test/java/home ; echo "1"'
ambari-sudo.sh su test_user -l -s /bin/bash -c 'export  PATH=/home/i/.pyenv/shims:/home/i/.pyenv/bin:/media/development/environment/alternatives/jdk7/bin:/media/development/environment/alternatives/maven31/bin:/home/i/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/media/development/work/irobot/snippets/bin/:~/.nvm/v0.10.31/bin/:~/.nvm/v0.10.31/lib/node_modules/brunch/bin/:/media/development/environment/android-sdk-linux/tools:/media/development/environment/android-sdk-linux/platform-tools:/home/i/bin:/media/development/environment/alternatives/scala/bin::/bin
JAVA_HOME=/test/java/home ; echo "1"'

Diff is 2026 characters long. Set self.maxDiff to None to see it.

----------------------------------------------------------------------
Ran 410 tests in 6.631s

FAILED (failures=1)
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Ambari Views ...................................... SUCCESS [2.572s]
[INFO] Ambari Metrics Common ............................. SUCCESS [1.153s]
[INFO] Ambari Server ..................................... SUCCESS [42.835s]
[INFO] Ambari Agent ...................................... FAILURE [8.186s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 55.480s
[INFO] Finished at: Tue Jul 21 20:05:05 EEST 2015
[INFO] Final Memory: 61M/1239M
[INFO] ------------------------------------------------------------------------


Thanks,

Dmitro Lisnichenko

