ambari-dev mailing list archives

From "Sumit Mohanty" <smoha...@hortonworks.com>
Subject Re: Review Request 19589: Add host fails after upgrade from 1.4.4 to 1.5.0 as datanode install fails
Date Mon, 24 Mar 2014 18:31:18 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19589/#review38334
-----------------------------------------------------------

Ship it!


Looks good. Thanks for the explanation. I was not clear about what would happen when a
custom JDK is used.

- Sumit Mohanty


On March 24, 2014, 5:13 p.m., Vitalyi Brodetskyi wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19589/
> -----------------------------------------------------------
> 
> (Updated March 24, 2014, 5:13 p.m.)
> 
> 
> Review request for Ambari, Dmitro Lisnichenko and Sumit Mohanty.
> 
> 
> Bugs: AMBARI-5190
>     https://issues.apache.org/jira/browse/AMBARI-5190
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Install 1.4.4, upgrade to 1.5.0, and then add a host.
> 
> {noformat}
> 2014-03-22 01:31:04,089 - Error while executing command 'start':
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 95, in execute
>     method(env)
>   File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/datanode.py", line 36, in start
>     datanode(action="start")
>   File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/hdfs_datanode.py", line 44, in datanode
>     create_log_dir=True
>   File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/utils.py", line 63, in service
>     not_if=service_is_up
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 239, in action_run
>     raise ex
> Fail: Execution of 'ulimit -c unlimited;  export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec && /usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode' returned 1. starting datanode, logging to /var/log/hadoop/hdfs2/hadoop-hdfs2-datanode-c6403.ambari.apache.org.out
> /usr/lib/hadoop-hdfs/bin/hdfs: line 201: /usr/jdk64/jdk1.6.0_31/bin/java: No such file or directory
> /usr/lib/hadoop-hdfs/bin/hdfs: line 201: exec: /usr/jdk64/jdk1.6.0_31/bin/java: cannot execute: No such file or directory
> {noformat}
> 
> Long output:
> {noformat}
> 2014-03-22 01:30:59,754 - File['/etc/snmp/snmpd.conf'] {'content': Template('snmpd.conf.j2')}
> 2014-03-22 01:30:59,755 - Writing File['/etc/snmp/snmpd.conf'] because contents don't match
> 2014-03-22 01:30:59,755 - Service['snmpd'] {'action': ['restart']}
> 2014-03-22 01:30:59,780 - Service['snmpd'] command 'start'
> 2014-03-22 01:30:59,840 - Execute['/bin/echo 0 > /selinux/enforce'] {'only_if': 'test -f /selinux/enforce'}
> 2014-03-22 01:30:59,852 - Skipping Execute['/bin/echo 0 > /selinux/enforce'] due to only_if
> 2014-03-22 01:30:59,854 - Execute['mkdir -p /usr/lib/hadoop/lib/native/Linux-i386-32; ln -sf /usr/lib/libsnappy.so /usr/lib/hadoop/lib/native/Linux-i386-32/libsnappy.so'] {}
> 2014-03-22 01:30:59,866 - Execute['mkdir -p /usr/lib/hadoop/lib/native/Linux-amd64-64; ln -sf /usr/lib64/libsnappy.so /usr/lib/hadoop/lib/native/Linux-amd64-64/libsnappy.so'] {}
> 2014-03-22 01:30:59,879 - Directory['/etc/hadoop/conf'] {'owner': 'root', 'group': 'root', 'recursive': True}
> 2014-03-22 01:30:59,880 - Directory['/var/log/hadoop'] {'owner': 'root', 'group': 'root', 'recursive': True}
> 2014-03-22 01:30:59,880 - Creating directory Directory['/var/log/hadoop']
> 2014-03-22 01:30:59,880 - Directory['/var/run/hadoop'] {'owner': 'root', 'group': 'root', 'recursive': True}
> 2014-03-22 01:30:59,880 - Creating directory Directory['/var/run/hadoop']
> 2014-03-22 01:30:59,881 - Directory['/tmp'] {'owner': 'hdfs2', 'recursive': True}
> 2014-03-22 01:30:59,881 - Changing owner for /tmp from 0 to hdfs2
> 2014-03-22 01:30:59,884 - File['/etc/security/limits.d/hdfs.conf'] {'content': Template('hdfs.conf.j2'), 'owner': 'root', 'group': 'root', 'mode': 0644}
> 2014-03-22 01:30:59,885 - Writing File['/etc/security/limits.d/hdfs.conf'] because contents don't match
> 2014-03-22 01:30:59,887 - File['/etc/hadoop/conf/taskcontroller.cfg'] {'content': Template('taskcontroller.cfg.j2'), 'owner': 'hdfs2'}
> 2014-03-22 01:30:59,888 - Writing File['/etc/hadoop/conf/taskcontroller.cfg'] because it doesn't exist
> 2014-03-22 01:30:59,888 - Changing owner for /etc/hadoop/conf/taskcontroller.cfg from 0 to hdfs2
> 2014-03-22 01:30:59,896 - File['/etc/hadoop/conf/hadoop-env.sh'] {'content': Template('hadoop-env.sh.j2'), 'owner': 'hdfs2'}
> 2014-03-22 01:30:59,896 - Writing File['/etc/hadoop/conf/hadoop-env.sh'] because contents don't match
> 2014-03-22 01:30:59,897 - Changing owner for /etc/hadoop/conf/hadoop-env.sh from 0 to hdfs2
> 2014-03-22 01:30:59,898 - File['/etc/hadoop/conf/commons-logging.properties'] {'content': Template('commons-logging.properties.j2'), 'owner': 'hdfs2'}
> 2014-03-22 01:30:59,898 - Writing File['/etc/hadoop/conf/commons-logging.properties'] because it doesn't exist
> 2014-03-22 01:30:59,898 - Changing owner for /etc/hadoop/conf/commons-logging.properties from 0 to hdfs2
> 2014-03-22 01:30:59,901 - File['/etc/hadoop/conf/slaves'] {'content': Template('slaves.j2'), 'owner': 'hdfs2'}
> 2014-03-22 01:30:59,901 - Writing File['/etc/hadoop/conf/slaves'] because contents don't match
> 2014-03-22 01:30:59,901 - Changing owner for /etc/hadoop/conf/slaves from 0 to hdfs2
> 2014-03-22 01:30:59,903 - File['/etc/hadoop/conf/health_check'] {'content': Template('health_check-v2.j2'), 'owner': 'hdfs2'}
> 2014-03-22 01:30:59,903 - Writing File['/etc/hadoop/conf/health_check'] because it doesn't exist
> 2014-03-22 01:30:59,903 - Changing owner for /etc/hadoop/conf/health_check from 0 to hdfs2
> 2014-03-22 01:30:59,903 - File['/etc/hadoop/conf/log4j.properties'] {'content': '...', 'owner': 'hdfs2', 'group': 'hadoop2', 'mode': 0644}
> 2014-03-22 01:30:59,904 - Writing File['/etc/hadoop/conf/log4j.properties'] because contents don't match
> 2014-03-22 01:30:59,904 - Changing owner for /etc/hadoop/conf/log4j.properties from 0 to hdfs2
> 2014-03-22 01:30:59,904 - Changing group for /etc/hadoop/conf/log4j.properties from 0 to hadoop2
> 2014-03-22 01:30:59,907 - File['/etc/hadoop/conf/hadoop-metrics2.properties'] {'content': Template('hadoop-metrics2.properties.j2'), 'owner': 'hdfs2'}
> 2014-03-22 01:30:59,908 - Writing File['/etc/hadoop/conf/hadoop-metrics2.properties'] because contents don't match
> 2014-03-22 01:30:59,908 - Changing owner for /etc/hadoop/conf/hadoop-metrics2.properties from 0 to hdfs2
> 2014-03-22 01:30:59,908 - XmlConfig['core-site.xml'] {'owner': 'hdfs2', 'group': 'hadoop2', 'conf_dir': '/etc/hadoop/conf', 'configurations': ...}
> 2014-03-22 01:30:59,914 - Generating config: /etc/hadoop/conf/core-site.xml
> 2014-03-22 01:30:59,914 - File['/etc/hadoop/conf/core-site.xml'] {'owner': 'hdfs2', 'content': InlineTemplate(...), 'group': 'hadoop2', 'mode': None}
> 2014-03-22 01:30:59,915 - Writing File['/etc/hadoop/conf/core-site.xml'] because contents don't match
> 2014-03-22 01:30:59,915 - Changing owner for /etc/hadoop/conf/core-site.xml from 0 to hdfs2
> 2014-03-22 01:30:59,915 - Changing group for /etc/hadoop/conf/core-site.xml from 0 to hadoop2
> 2014-03-22 01:30:59,915 - XmlConfig['mapred-site.xml'] {'owner': 'mapred2', 'group': 'hadoop2', 'conf_dir': '/etc/hadoop/conf', 'configurations': ...}
> 2014-03-22 01:30:59,919 - Generating config: /etc/hadoop/conf/mapred-site.xml
> 2014-03-22 01:30:59,919 - File['/etc/hadoop/conf/mapred-site.xml'] {'owner': 'mapred2', 'content': InlineTemplate(...), 'group': 'hadoop2', 'mode': None}
> 2014-03-22 01:30:59,920 - Writing File['/etc/hadoop/conf/mapred-site.xml'] because it doesn't exist
> 2014-03-22 01:30:59,920 - Changing owner for /etc/hadoop/conf/mapred-site.xml from 0 to mapred2
> 2014-03-22 01:30:59,920 - Changing group for /etc/hadoop/conf/mapred-site.xml from 0 to hadoop2
> 2014-03-22 01:30:59,920 - File['/etc/hadoop/conf/task-log4j.properties'] {'content': StaticFile('task-log4j.properties'), 'mode': 0755}
> 2014-03-22 01:30:59,921 - Writing File['/etc/hadoop/conf/task-log4j.properties'] because it doesn't exist
> 2014-03-22 01:30:59,921 - Changing permission for /etc/hadoop/conf/task-log4j.properties from 644 to 755
> 2014-03-22 01:30:59,921 - XmlConfig['capacity-scheduler.xml'] {'owner': 'hdfs2', 'group': 'hadoop2', 'conf_dir': '/etc/hadoop/conf', 'configurations': ...}
> 2014-03-22 01:30:59,924 - Generating config: /etc/hadoop/conf/capacity-scheduler.xml
> 2014-03-22 01:30:59,925 - File['/etc/hadoop/conf/capacity-scheduler.xml'] {'owner': 'hdfs2', 'content': InlineTemplate(...), 'group': 'hadoop2', 'mode': None}
> 2014-03-22 01:30:59,925 - Writing File['/etc/hadoop/conf/capacity-scheduler.xml'] because contents don't match
> 2014-03-22 01:30:59,925 - Changing owner for /etc/hadoop/conf/capacity-scheduler.xml from 0 to hdfs2
> 2014-03-22 01:30:59,925 - Changing group for /etc/hadoop/conf/capacity-scheduler.xml from 0 to hadoop2
> 2014-03-22 01:30:59,925 - XmlConfig['hdfs-site.xml'] {'owner': 'hdfs2', 'group': 'hadoop2', 'conf_dir': '/etc/hadoop/conf', 'configurations': ...}
> 2014-03-22 01:30:59,928 - Generating config: /etc/hadoop/conf/hdfs-site.xml
> 2014-03-22 01:30:59,929 - File['/etc/hadoop/conf/hdfs-site.xml'] {'owner': 'hdfs2', 'content': InlineTemplate(...), 'group': 'hadoop2', 'mode': None}
> 2014-03-22 01:30:59,930 - Writing File['/etc/hadoop/conf/hdfs-site.xml'] because contents don't match
> 2014-03-22 01:30:59,930 - Changing owner for /etc/hadoop/conf/hdfs-site.xml from 0 to hdfs2
> 2014-03-22 01:30:59,930 - Changing group for /etc/hadoop/conf/hdfs-site.xml from 0 to hadoop2
> 2014-03-22 01:30:59,931 - File['/etc/hadoop/conf/configuration.xsl'] {'owner': 'hdfs2', 'group': 'hadoop2'}
> 2014-03-22 01:30:59,931 - Changing owner for /etc/hadoop/conf/configuration.xsl from 0 to hdfs2
> 2014-03-22 01:30:59,931 - Changing group for /etc/hadoop/conf/configuration.xsl from 0 to hadoop2
> 2014-03-22 01:30:59,931 - File['/etc/hadoop/conf/ssl-client.xml.example'] {'owner': 'mapred2', 'group': 'hadoop2'}
> 2014-03-22 01:30:59,931 - Changing owner for /etc/hadoop/conf/ssl-client.xml.example from 0 to mapred2
> 2014-03-22 01:30:59,931 - Changing group for /etc/hadoop/conf/ssl-client.xml.example from 0 to hadoop2
> 2014-03-22 01:30:59,931 - File['/etc/hadoop/conf/ssl-server.xml.example'] {'owner': 'mapred2', 'group': 'hadoop2'}
> 2014-03-22 01:30:59,932 - Changing owner for /etc/hadoop/conf/ssl-server.xml.example from 0 to mapred2
> 2014-03-22 01:30:59,932 - Changing group for /etc/hadoop/conf/ssl-server.xml.example from 0 to hadoop2
> 2014-03-22 01:31:00,014 - Directory['/var/lib/hadoop-hdfs'] {'owner': 'hdfs2', 'group': 'hadoop2', 'mode': 0751, 'recursive': True}
> 2014-03-22 01:31:00,015 - Changing permission for /var/lib/hadoop-hdfs from 755 to 751
> 2014-03-22 01:31:00,015 - Changing owner for /var/lib/hadoop-hdfs from 495 to hdfs2
> 2014-03-22 01:31:00,015 - Changing group for /var/lib/hadoop-hdfs from 496 to hadoop2
> 2014-03-22 01:31:00,015 - Directory['/hadoop/hdfs/data'] {'owner': 'hdfs2', 'group': 'hadoop2', 'mode': 0755, 'recursive': True}
> 2014-03-22 01:31:00,016 - Creating directory Directory['/hadoop/hdfs/data']
> 2014-03-22 01:31:00,016 - Changing owner for /hadoop/hdfs/data from 0 to hdfs2
> 2014-03-22 01:31:00,016 - Changing group for /hadoop/hdfs/data from 0 to hadoop2
> 2014-03-22 01:31:00,017 - Directory['/var/run/hadoop/hdfs2'] {'owner': 'hdfs2', 'recursive': True}
> 2014-03-22 01:31:00,017 - Creating directory Directory['/var/run/hadoop/hdfs2']
> 2014-03-22 01:31:00,018 - Changing owner for /var/run/hadoop/hdfs2 from 0 to hdfs2
> 2014-03-22 01:31:00,018 - Directory['/var/log/hadoop/hdfs2'] {'owner': 'hdfs2', 'recursive': True}
> 2014-03-22 01:31:00,018 - Creating directory Directory['/var/log/hadoop/hdfs2']
> 2014-03-22 01:31:00,018 - Changing owner for /var/log/hadoop/hdfs2 from 0 to hdfs2
> 2014-03-22 01:31:00,018 - File['/var/run/hadoop/hdfs2/hadoop-hdfs2-datanode.pid'] {'action': ['delete'], 'not_if': 'ls /var/run/hadoop/hdfs2/hadoop-hdfs2-datanode.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop/hdfs2/hadoop-hdfs2-datanode.pid` >/dev/null 2>&1', 'ignore_failures': True}
> 2014-03-22 01:31:00,029 - Execute['ulimit -c unlimited;  export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec && /usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode'] {'not_if': 'ls /var/run/hadoop/hdfs2/hadoop-hdfs2-datanode.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop/hdfs2/hadoop-hdfs2-datanode.pid` >/dev/null 2>&1', 'user': 'hdfs2'}
> 2014-03-22 01:31:04,089 - Error while executing command 'start':
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 95, in execute
>     method(env)
>   File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/datanode.py", line 36, in start
>     datanode(action="start")
>   File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/hdfs_datanode.py", line 44, in datanode
>     create_log_dir=True
>   File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/utils.py", line 63, in service
>     not_if=service_is_up
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 239, in action_run
>     raise ex
> Fail: Execution of 'ulimit -c unlimited;  export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec && /usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode' returned 1. starting datanode, logging to /var/log/hadoop/hdfs2/hadoop-hdfs2-datanode-c6403.ambari.apache.org.out
> /usr/lib/hadoop-hdfs/bin/hdfs: line 201: /usr/jdk64/jdk1.6.0_31/bin/java: No such file or directory
> /usr/lib/hadoop-hdfs/bin/hdfs: line 201: exec: /usr/jdk64/jdk1.6.0_31/bin/java: cannot execute: No such file or directory
> {noformat}
> 
> Upgrade from 1.4.4 to 1.5.0 has some issues. Contact [~smohanty]/[~mkonar] on how to upgrade.
> 
> FIX: This issue appears because the required properties (jdk.name and jce.name) are missing
> after the upgrade to 1.5.0.
> These properties are added to the config, and the agents then use them to install the JDK
> and JCE.
> As a fix, I add the properties that Ambari 1.5.0 needs: jdk.name and jce.name.
> These properties record which JDK and JCE the current cluster uses.
> Before Ambari 1.5.0, the default JDK was 1.6.0_31 and the JCE version was 6. That is why
> the fix checks whether the default java home is in use and, if it is, adds the properties
> with the default jdk.name and jce.name.
> If the user installed a custom JDK, this code does not apply, and the user has to manage
> the JDK and JCE themselves.
> 
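The check described in the FIX paragraph can be sketched roughly as follows. This is a minimal illustration and not the code from the diff: the function name, the `java.home` key, and the two archive file names are assumptions made for the example. Only the property names jdk.name and jce.name and the default path /usr/jdk64/jdk1.6.0_31 come from the text and log above.

```python
# Hypothetical sketch of the upgrade-time check; not the actual patch.
# Path taken from the error log above.
DEFAULT_JAVA_HOME = "/usr/jdk64/jdk1.6.0_31"
# Assumed archive names for the pre-1.5.0 default JDK and JCE (illustrative only).
DEFAULT_JDK_NAME = "jdk-6u31-linux-x64.bin"
DEFAULT_JCE_NAME = "jce_policy-6.zip"


def add_default_jdk_properties(properties):
    """Return a copy of `properties` with jdk.name and jce.name filled in
    when the cluster still uses the default java home.

    With a custom java home the properties are returned unchanged, because
    the server cannot know which JDK/JCE archives match that install.
    """
    updated = dict(properties)
    # "java.home" is an assumed key name for the configured Java location.
    if updated.get("java.home") != DEFAULT_JAVA_HOME:
        return updated
    # setdefault keeps any value that is already present.
    updated.setdefault("jdk.name", DEFAULT_JDK_NAME)
    updated.setdefault("jce.name", DEFAULT_JCE_NAME)
    return updated
```

With the default java home both properties get filled in; with a custom java home the result has neither, which matches the caveat that users of a custom JDK manage the JDK and JCE on their own.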
> 
> Diffs
> -----
> 
>   ambari-server/src/main/python/ambari-server.py 79eead7 
>   ambari-server/src/test/python/TestAmbariServer.py 54e4077 
> 
> Diff: https://reviews.apache.org/r/19589/diff/
> 
> 
> Testing
> -------
> 
> ----------------------------------------------------------------------
> Ran 187 tests in 1.020s
> 
> OK
> ----------------------------------------------------------------------
> Total run:501
> Total errors:0
> Total failures:0
> OK
> 
> 
> Thanks,
> 
> Vitalyi Brodetskyi
> 
>

