ambari-dev mailing list archives

From "Vitalyi Brodetskyi" <vbrodets...@hortonworks.com>
Subject Re: Review Request 19589: Add host fails after upgrade from 1.4.4 to 1.5.0 as datanode install fails
Date Mon, 24 Mar 2014 16:53:34 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19589/
-----------------------------------------------------------

(Updated March 24, 2014, 4:53 p.m.)


Review request for Ambari, Dmitro Lisnichenko and Sumit Mohanty.


Bugs: AMBARI-5190
    https://issues.apache.org/jira/browse/AMBARI-5190


Repository: ambari


Description (updated)
-------

Install 1.4.4, upgrade to 1.5.0, and then add a host.

{noformat}
2014-03-22 01:31:04,089 - Error while executing command 'start':
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py",
line 95, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/datanode.py",
line 36, in start
    datanode(action="start")
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/hdfs_datanode.py",
line 44, in datanode
    create_log_dir=True
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/utils.py",
line 63, in service
    not_if=service_is_up
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149,
in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115,
in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line
239, in action_run
    raise ex
Fail: Execution of 'ulimit -c unlimited;  export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec
&& /usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode'
returned 1. starting datanode, logging to /var/log/hadoop/hdfs2/hadoop-hdfs2-datanode-c6403.ambari.apache.org.out
/usr/lib/hadoop-hdfs/bin/hdfs: line 201: /usr/jdk64/jdk1.6.0_31/bin/java: No such file or
directory
/usr/lib/hadoop-hdfs/bin/hdfs: line 201: exec: /usr/jdk64/jdk1.6.0_31/bin/java: cannot execute:
No such file or directory
{noformat}

Long output:
{noformat}
2014-03-22 01:30:59,754 - File['/etc/snmp/snmpd.conf'] {'content': Template('snmpd.conf.j2')}
2014-03-22 01:30:59,755 - Writing File['/etc/snmp/snmpd.conf'] because contents don't match
2014-03-22 01:30:59,755 - Service['snmpd'] {'action': ['restart']}
2014-03-22 01:30:59,780 - Service['snmpd'] command 'start'
2014-03-22 01:30:59,840 - Execute['/bin/echo 0 > /selinux/enforce'] {'only_if': 'test -f
/selinux/enforce'}
2014-03-22 01:30:59,852 - Skipping Execute['/bin/echo 0 > /selinux/enforce'] due to only_if
2014-03-22 01:30:59,854 - Execute['mkdir -p /usr/lib/hadoop/lib/native/Linux-i386-32; ln -sf
/usr/lib/libsnappy.so /usr/lib/hadoop/lib/native/Linux-i386-32/libsnappy.so'] {}
2014-03-22 01:30:59,866 - Execute['mkdir -p /usr/lib/hadoop/lib/native/Linux-amd64-64; ln
-sf /usr/lib64/libsnappy.so /usr/lib/hadoop/lib/native/Linux-amd64-64/libsnappy.so'] {}
2014-03-22 01:30:59,879 - Directory['/etc/hadoop/conf'] {'owner': 'root', 'group': 'root',
'recursive': True}
2014-03-22 01:30:59,880 - Directory['/var/log/hadoop'] {'owner': 'root', 'group': 'root',
'recursive': True}
2014-03-22 01:30:59,880 - Creating directory Directory['/var/log/hadoop']
2014-03-22 01:30:59,880 - Directory['/var/run/hadoop'] {'owner': 'root', 'group': 'root',
'recursive': True}
2014-03-22 01:30:59,880 - Creating directory Directory['/var/run/hadoop']
2014-03-22 01:30:59,881 - Directory['/tmp'] {'owner': 'hdfs2', 'recursive': True}
2014-03-22 01:30:59,881 - Changing owner for /tmp from 0 to hdfs2
2014-03-22 01:30:59,884 - File['/etc/security/limits.d/hdfs.conf'] {'content': Template('hdfs.conf.j2'),
'owner': 'root', 'group': 'root', 'mode': 0644}
2014-03-22 01:30:59,885 - Writing File['/etc/security/limits.d/hdfs.conf'] because contents
don't match
2014-03-22 01:30:59,887 - File['/etc/hadoop/conf/taskcontroller.cfg'] {'content': Template('taskcontroller.cfg.j2'),
'owner': 'hdfs2'}
2014-03-22 01:30:59,888 - Writing File['/etc/hadoop/conf/taskcontroller.cfg'] because it doesn't
exist
2014-03-22 01:30:59,888 - Changing owner for /etc/hadoop/conf/taskcontroller.cfg from 0 to
hdfs2
2014-03-22 01:30:59,896 - File['/etc/hadoop/conf/hadoop-env.sh'] {'content': Template('hadoop-env.sh.j2'),
'owner': 'hdfs2'}
2014-03-22 01:30:59,896 - Writing File['/etc/hadoop/conf/hadoop-env.sh'] because contents
don't match
2014-03-22 01:30:59,897 - Changing owner for /etc/hadoop/conf/hadoop-env.sh from 0 to hdfs2
2014-03-22 01:30:59,898 - File['/etc/hadoop/conf/commons-logging.properties'] {'content':
Template('commons-logging.properties.j2'), 'owner': 'hdfs2'}
2014-03-22 01:30:59,898 - Writing File['/etc/hadoop/conf/commons-logging.properties'] because
it doesn't exist
2014-03-22 01:30:59,898 - Changing owner for /etc/hadoop/conf/commons-logging.properties from
0 to hdfs2
2014-03-22 01:30:59,901 - File['/etc/hadoop/conf/slaves'] {'content': Template('slaves.j2'),
'owner': 'hdfs2'}
2014-03-22 01:30:59,901 - Writing File['/etc/hadoop/conf/slaves'] because contents don't match
2014-03-22 01:30:59,901 - Changing owner for /etc/hadoop/conf/slaves from 0 to hdfs2
2014-03-22 01:30:59,903 - File['/etc/hadoop/conf/health_check'] {'content': Template('health_check-v2.j2'),
'owner': 'hdfs2'}
2014-03-22 01:30:59,903 - Writing File['/etc/hadoop/conf/health_check'] because it doesn't
exist
2014-03-22 01:30:59,903 - Changing owner for /etc/hadoop/conf/health_check from 0 to hdfs2
2014-03-22 01:30:59,903 - File['/etc/hadoop/conf/log4j.properties'] {'content': '...', 'owner':
'hdfs2', 'group': 'hadoop2', 'mode': 0644}
2014-03-22 01:30:59,904 - Writing File['/etc/hadoop/conf/log4j.properties'] because contents
don't match
2014-03-22 01:30:59,904 - Changing owner for /etc/hadoop/conf/log4j.properties from 0 to hdfs2
2014-03-22 01:30:59,904 - Changing group for /etc/hadoop/conf/log4j.properties from 0 to hadoop2
2014-03-22 01:30:59,907 - File['/etc/hadoop/conf/hadoop-metrics2.properties'] {'content':
Template('hadoop-metrics2.properties.j2'), 'owner': 'hdfs2'}
2014-03-22 01:30:59,908 - Writing File['/etc/hadoop/conf/hadoop-metrics2.properties'] because
contents don't match
2014-03-22 01:30:59,908 - Changing owner for /etc/hadoop/conf/hadoop-metrics2.properties from
0 to hdfs2
2014-03-22 01:30:59,908 - XmlConfig['core-site.xml'] {'owner': 'hdfs2', 'group': 'hadoop2',
'conf_dir': '/etc/hadoop/conf', 'configurations': ...}
2014-03-22 01:30:59,914 - Generating config: /etc/hadoop/conf/core-site.xml
2014-03-22 01:30:59,914 - File['/etc/hadoop/conf/core-site.xml'] {'owner': 'hdfs2', 'content':
InlineTemplate(...), 'group': 'hadoop2', 'mode': None}
2014-03-22 01:30:59,915 - Writing File['/etc/hadoop/conf/core-site.xml'] because contents
don't match
2014-03-22 01:30:59,915 - Changing owner for /etc/hadoop/conf/core-site.xml from 0 to hdfs2
2014-03-22 01:30:59,915 - Changing group for /etc/hadoop/conf/core-site.xml from 0 to hadoop2
2014-03-22 01:30:59,915 - XmlConfig['mapred-site.xml'] {'owner': 'mapred2', 'group': 'hadoop2',
'conf_dir': '/etc/hadoop/conf', 'configurations': ...}
2014-03-22 01:30:59,919 - Generating config: /etc/hadoop/conf/mapred-site.xml
2014-03-22 01:30:59,919 - File['/etc/hadoop/conf/mapred-site.xml'] {'owner': 'mapred2', 'content':
InlineTemplate(...), 'group': 'hadoop2', 'mode': None}
2014-03-22 01:30:59,920 - Writing File['/etc/hadoop/conf/mapred-site.xml'] because it doesn't
exist
2014-03-22 01:30:59,920 - Changing owner for /etc/hadoop/conf/mapred-site.xml from 0 to mapred2
2014-03-22 01:30:59,920 - Changing group for /etc/hadoop/conf/mapred-site.xml from 0 to hadoop2
2014-03-22 01:30:59,920 - File['/etc/hadoop/conf/task-log4j.properties'] {'content': StaticFile('task-log4j.properties'),
'mode': 0755}
2014-03-22 01:30:59,921 - Writing File['/etc/hadoop/conf/task-log4j.properties'] because it
doesn't exist
2014-03-22 01:30:59,921 - Changing permission for /etc/hadoop/conf/task-log4j.properties from
644 to 755
2014-03-22 01:30:59,921 - XmlConfig['capacity-scheduler.xml'] {'owner': 'hdfs2', 'group':
'hadoop2', 'conf_dir': '/etc/hadoop/conf', 'configurations': ...}
2014-03-22 01:30:59,924 - Generating config: /etc/hadoop/conf/capacity-scheduler.xml
2014-03-22 01:30:59,925 - File['/etc/hadoop/conf/capacity-scheduler.xml'] {'owner': 'hdfs2',
'content': InlineTemplate(...), 'group': 'hadoop2', 'mode': None}
2014-03-22 01:30:59,925 - Writing File['/etc/hadoop/conf/capacity-scheduler.xml'] because
contents don't match
2014-03-22 01:30:59,925 - Changing owner for /etc/hadoop/conf/capacity-scheduler.xml from
0 to hdfs2
2014-03-22 01:30:59,925 - Changing group for /etc/hadoop/conf/capacity-scheduler.xml from
0 to hadoop2
2014-03-22 01:30:59,925 - XmlConfig['hdfs-site.xml'] {'owner': 'hdfs2', 'group': 'hadoop2',
'conf_dir': '/etc/hadoop/conf', 'configurations': ...}
2014-03-22 01:30:59,928 - Generating config: /etc/hadoop/conf/hdfs-site.xml
2014-03-22 01:30:59,929 - File['/etc/hadoop/conf/hdfs-site.xml'] {'owner': 'hdfs2', 'content':
InlineTemplate(...), 'group': 'hadoop2', 'mode': None}
2014-03-22 01:30:59,930 - Writing File['/etc/hadoop/conf/hdfs-site.xml'] because contents
don't match
2014-03-22 01:30:59,930 - Changing owner for /etc/hadoop/conf/hdfs-site.xml from 0 to hdfs2
2014-03-22 01:30:59,930 - Changing group for /etc/hadoop/conf/hdfs-site.xml from 0 to hadoop2
2014-03-22 01:30:59,931 - File['/etc/hadoop/conf/configuration.xsl'] {'owner': 'hdfs2', 'group':
'hadoop2'}
2014-03-22 01:30:59,931 - Changing owner for /etc/hadoop/conf/configuration.xsl from 0 to
hdfs2
2014-03-22 01:30:59,931 - Changing group for /etc/hadoop/conf/configuration.xsl from 0 to
hadoop2
2014-03-22 01:30:59,931 - File['/etc/hadoop/conf/ssl-client.xml.example'] {'owner': 'mapred2',
'group': 'hadoop2'}
2014-03-22 01:30:59,931 - Changing owner for /etc/hadoop/conf/ssl-client.xml.example from
0 to mapred2
2014-03-22 01:30:59,931 - Changing group for /etc/hadoop/conf/ssl-client.xml.example from
0 to hadoop2
2014-03-22 01:30:59,931 - File['/etc/hadoop/conf/ssl-server.xml.example'] {'owner': 'mapred2',
'group': 'hadoop2'}
2014-03-22 01:30:59,932 - Changing owner for /etc/hadoop/conf/ssl-server.xml.example from
0 to mapred2
2014-03-22 01:30:59,932 - Changing group for /etc/hadoop/conf/ssl-server.xml.example from
0 to hadoop2
2014-03-22 01:31:00,014 - Directory['/var/lib/hadoop-hdfs'] {'owner': 'hdfs2', 'group': 'hadoop2',
'mode': 0751, 'recursive': True}
2014-03-22 01:31:00,015 - Changing permission for /var/lib/hadoop-hdfs from 755 to 751
2014-03-22 01:31:00,015 - Changing owner for /var/lib/hadoop-hdfs from 495 to hdfs2
2014-03-22 01:31:00,015 - Changing group for /var/lib/hadoop-hdfs from 496 to hadoop2
2014-03-22 01:31:00,015 - Directory['/hadoop/hdfs/data'] {'owner': 'hdfs2', 'group': 'hadoop2',
'mode': 0755, 'recursive': True}
2014-03-22 01:31:00,016 - Creating directory Directory['/hadoop/hdfs/data']
2014-03-22 01:31:00,016 - Changing owner for /hadoop/hdfs/data from 0 to hdfs2
2014-03-22 01:31:00,016 - Changing group for /hadoop/hdfs/data from 0 to hadoop2
2014-03-22 01:31:00,017 - Directory['/var/run/hadoop/hdfs2'] {'owner': 'hdfs2', 'recursive':
True}
2014-03-22 01:31:00,017 - Creating directory Directory['/var/run/hadoop/hdfs2']
2014-03-22 01:31:00,018 - Changing owner for /var/run/hadoop/hdfs2 from 0 to hdfs2
2014-03-22 01:31:00,018 - Directory['/var/log/hadoop/hdfs2'] {'owner': 'hdfs2', 'recursive':
True}
2014-03-22 01:31:00,018 - Creating directory Directory['/var/log/hadoop/hdfs2']
2014-03-22 01:31:00,018 - Changing owner for /var/log/hadoop/hdfs2 from 0 to hdfs2
2014-03-22 01:31:00,018 - File['/var/run/hadoop/hdfs2/hadoop-hdfs2-datanode.pid'] {'action':
['delete'], 'not_if': 'ls /var/run/hadoop/hdfs2/hadoop-hdfs2-datanode.pid >/dev/null 2>&1
&& ps `cat /var/run/hadoop/hdfs2/hadoop-hdfs2-datanode.pid` >/dev/null 2>&1',
'ignore_failures': True}
2014-03-22 01:31:00,029 - Execute['ulimit -c unlimited;  export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec
&& /usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode']
{'not_if': 'ls /var/run/hadoop/hdfs2/hadoop-hdfs2-datanode.pid >/dev/null 2>&1 &&
ps `cat /var/run/hadoop/hdfs2/hadoop-hdfs2-datanode.pid` >/dev/null 2>&1', 'user':
'hdfs2'}
2014-03-22 01:31:04,089 - Error while executing command 'start':
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py",
line 95, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/datanode.py",
line 36, in start
    datanode(action="start")
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/hdfs_datanode.py",
line 44, in datanode
    create_log_dir=True
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/utils.py",
line 63, in service
    not_if=service_is_up
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149,
in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115,
in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line
239, in action_run
    raise ex
Fail: Execution of 'ulimit -c unlimited;  export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec
&& /usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode'
returned 1. starting datanode, logging to /var/log/hadoop/hdfs2/hadoop-hdfs2-datanode-c6403.ambari.apache.org.out
/usr/lib/hadoop-hdfs/bin/hdfs: line 201: /usr/jdk64/jdk1.6.0_31/bin/java: No such file or
directory
/usr/lib/hadoop-hdfs/bin/hdfs: line 201: exec: /usr/jdk64/jdk1.6.0_31/bin/java: cannot execute:
No such file or directory
{noformat}

Upgrading from 1.4.4 to 1.5.0 has some issues. Contact [~smohanty]/[~mkonar] on how to upgrade.

FIX: As a fix, I decided to add the properties that Ambari 1.5.0 requires: jdk_name and
jce_name.
These properties record which JDK and JCE the current cluster uses.
Until Ambari 1.5.0 the default JDK was 1.6.0_31 and the default JCE version was 6. My fix
therefore checks whether the cluster uses the default Java home; if it does, it adds the
properties with the default jdk_name and jce_name.
If the user installed a custom JDK, this code does not apply, and the user must manage the
JDK and JCE manually.
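
The check described above could look roughly like the sketch below. It is not the
actual patch: the function name, the "java_home" key, and the two default file names
are assumptions made for illustration. Only jdk_name, jce_name, and the
/usr/jdk64/jdk1.6.0_31 path come from this request.

```python
# Illustrative sketch only; names marked "assumed" do not come from the patch.
DEFAULT_JAVA_HOME = "/usr/jdk64/jdk1.6.0_31"  # path seen in the failing log
DEFAULT_JDK_NAME = "jdk-6u31-linux-x64.bin"   # assumed default JDK file name
DEFAULT_JCE_NAME = "jce_policy-6.zip"         # assumed default JCE file name


def add_default_jdk_properties(properties):
    """Return a copy of properties with jdk_name and jce_name filled in.

    The defaults are added only when the (assumed) "java_home" key points at
    the pre-1.5.0 default JDK location. Custom JDK setups are left untouched.
    """
    updated = dict(properties)
    java_home = updated.get("java_home", "").rstrip("/")
    if java_home != DEFAULT_JAVA_HOME:
        # Custom JDK: the user manages jdk_name and jce_name manually.
        return updated
    # Keep any values that are already present; add only the missing ones.
    updated.setdefault("jdk_name", DEFAULT_JDK_NAME)
    updated.setdefault("jce_name", DEFAULT_JCE_NAME)
    return updated
```

Returning a copy instead of mutating the input keeps the upgrade step easy to unit
test, and setdefault means existing values are not overwritten.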


Diffs
-----

  ambari-server/src/main/python/ambari-server.py 79eead7 
  ambari-server/src/test/python/TestAmbariServer.py 54e4077 

Diff: https://reviews.apache.org/r/19589/diff/


Testing
-------

----------------------------------------------------------------------
Ran 187 tests in 1.020s

OK
----------------------------------------------------------------------
Total run:501
Total errors:0
Total failures:0
OK


Thanks,

Vitalyi Brodetskyi

