ambari-dev mailing list archives

From "Andrew Onischuk (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AMBARI-6261) agent reg failed with timeout but didn't error out. installer stuck
Date Tue, 24 Jun 2014 16:24:24 GMT
Andrew Onischuk created AMBARI-6261:
---------------------------------------

             Summary: agent reg failed with timeout but didn't error out. installer stuck
                 Key: AMBARI-6261
                 URL: https://issues.apache.org/jira/browse/AMBARI-6261
             Project: Ambari
          Issue Type: Bug
            Reporter: Andrew Onischuk
            Assignee: Andrew Onischuk
             Fix For: 1.6.1


Name : ambari-server  
Arch : noarch  
Version : 1.6.1  
Release : 72

2 out of 3 agents registered fine. The remaining one timed out.

Problem #1: The error message didn't include the timeout value; it showed the
unreplaced format placeholder instead: "timeout = {0} seconds".
Problem #2: Even though registration has failed, the UI still says
"Installing...", so I can't go back, remove the host, or try again.
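
To make Problem #2 concrete: in the bootstrap response that follows, the overall "status" is "ERROR" while the timed-out host is still reported as "RUNNING" with no "statusCode". A minimal sketch of a check for that inconsistency is below. The helper name find_stuck_hosts is hypothetical and not part of Ambari; it only assumes the field names visible in the response.

```python
def find_stuck_hosts(response):
    """Return names of hosts still RUNNING although the overall status is ERROR.

    `response` is assumed to be the parsed bootstrap status JSON, with the
    "status" and "hostsStatus" fields seen in this report.
    """
    if response.get("status") != "ERROR":
        return []
    return [
        host.get("hostName")
        for host in response.get("hostsStatus", [])
        if host.get("status") == "RUNNING"
    ]
```

For the response in this report, such a check would single out ip-10-164-165-204.ec2.internal, the host the UI keeps showing as installing.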
  
{"status":"ERROR","hostsStatus":[{"hostName":"ip-10-164-165-204.ec2.internal",
"status":"RUNNING","log":"==========================\nCopying common functions
script...\n==========================\n\nCommand start time 2014-06-24
08:48:01\n\nWarning: Permanently added
'ip-10-164-165-204.ec2.internal,10.164.165.204' (RSA) to the list of known
hosts.\nscp /usr/lib/python2.6/site-
packages/ambari_commons\nhost=ip-10-164-165-204.ec2.internal,
exitcode=0\nCommand end time 2014-06-24
08:48:26\n\n==========================\nCopying OS type check
script...\n==========================\n\nCommand start time 2014-06-24
08:48:26\n\nscp /usr/lib/python2.6/site-
packages/ambari_server/os_check_type.py\nhost=ip-10-164-165-204.ec2.internal,
exitcode=0\nCommand end time 2014-06-24
08:49:11\n\n==========================\nRunning OS type
check...\n==========================\n\nCommand start time 2014-06-24
08:49:11\nCluster primary/cluster OS type is redhat6 and local/current OS type
is redhat6\n\nConnection to ip-10-164-165-204.ec2.internal closed.\nSSH
command execution finished\nhost=ip-10-164-165-204.ec2.internal,
exitcode=0\nCommand end time 2014-06-24
08:49:52\n\n==========================\nChecking 'sudo' package on remote
host...\n==========================\n\nCommand start time 2014-06-24
08:49:52\nsudo-1.8.6p3-12.el6.x86_64\n\nConnection to
ip-10-164-165-204.ec2.internal closed.\nSSH command execution
finished\nhost=ip-10-164-165-204.ec2.internal, exitcode=0\nCommand end time
2014-06-24 08:50:27\n\n==========================\nCopying repo file to 'tmp'
folder...\n==========================\n\nCommand start time 2014-06-24
08:50:27\n\nscp
/etc/yum.repos.d/ambari.repo\nhost=ip-10-164-165-204.ec2.internal,
exitcode=0\nCommand end time 2014-06-24
08:50:58\n\n==========================\nMoving file to repo
dir...\n==========================\n\nCommand start time 2014-06-24
08:50:58\n\nConnection to ip-10-164-165-204.ec2.internal closed.\nSSH command
execution finished\nhost=ip-10-164-165-204.ec2.internal, exitcode=0\nCommand
end time 2014-06-24 08:51:33\n\n==========================\nCopying setup
script file...\n==========================\n\nCommand start time 2014-06-24
08:51:33\n\nscp /usr/lib/python2.6/site-
packages/ambari_server/setupAgent.py\nhost=ip-10-164-165-204.ec2.internal,
exitcode=0\nCommand end time 2014-06-24
08:52:13\n\n==========================\nRunning setup agent
script...\n==========================\n\nCommand start time 2014-06-24
08:52:13\nAutomatic Agent registration timed out (timeout = {0} seconds).
Check your network connectivity and retry registration, or use
manual agent registration."},

{"hostName":"ip-10-136-91-58.ec2.internal","status":"DONE","statusCode":"0",
"log":"==========================\nCopying common functions
script...\n==========================\n\nCommand start time 2014-06-24
08:48:01\n\nWarning: Permanently added
'ip-10-136-91-58.ec2.internal,10.136.91.58' (RSA) to the list of known
hosts.\nscp /usr/lib/python2.6/site-
packages/ambari_commons\nhost=ip-10-136-91-58.ec2.internal,
exitcode=0\nCommand end time 2014-06-24
08:48:37\n\n==========================\nCopying OS type check
script...\n==========================\n\nCommand start time 2014-06-24
08:48:37\n\nscp /usr/lib/python2.6/site-
packages/ambari_server/os_check_type.py\nhost=ip-10-136-91-58.ec2.internal,
exitcode=0\nCommand end time 2014-06-24
08:49:22\n\n==========================\nRunning OS type
check...\n==========================\n\nCommand start time 2014-06-24
08:49:22\nCluster primary/cluster OS type is redhat6 and local/current OS type
is redhat6\n\nConnection to ip-10-136-91-58.ec2.internal closed.\nSSH command
execution finished\nhost=ip-10-136-91-58.ec2.internal, exitcode=0\nCommand end
time 2014-06-24 08:49:42\n\n==========================\nChecking 'sudo'
package on remote host...\n==========================\n\nCommand start time
2014-06-24 08:49:42\nsudo-1.8.6p3-12.el6.x86_64\n\nConnection to
ip-10-136-91-58.ec2.internal closed.\nSSH command execution
finished\nhost=ip-10-136-91-58.ec2.internal, exitcode=0\nCommand end time
2014-06-24 08:50:19\n\n==========================\nCopying repo file to 'tmp'
folder...\n==========================\n\nCommand start time 2014-06-24
08:50:19\n\nscp
/etc/yum.repos.d/ambari.repo\nhost=ip-10-136-91-58.ec2.internal,
exitcode=0\nCommand end time 2014-06-24
08:50:54\n\n==========================\nMoving file to repo
dir...\n==========================\n\nCommand start time 2014-06-24
08:50:54\n\nConnection to ip-10-136-91-58.ec2.internal closed.\nSSH command
execution finished\nhost=ip-10-136-91-58.ec2.internal, exitcode=0\nCommand end
time 2014-06-24 08:51:35\n\n==========================\nCopying setup script
file...\n==========================\n\nCommand start time 2014-06-24
08:51:35\n\nscp /usr/lib/python2.6/site-
packages/ambari_server/setupAgent.py\nhost=ip-10-136-91-58.ec2.internal,
exitcode=0\nCommand end time 2014-06-24
08:52:00\n\n==========================\nRunning setup agent
script...\n==========================\n\nCommand start time 2014-06-24
08:52:00\n/bin/sh: /usr/sbin/ambari-agent: No such file or
directory\nRestarting ambari-agent\nVerifying Python version
compatibility...\nUsing python /usr/bin/python2.6\nambari-agent is not
running. No PID found at /var/run/ambari-agent/ambari-agent.pid\nVerifying
Python version compatibility...\nUsing python /usr/bin/python2.6\nChecking for
previously running Ambari Agent...\nStarting ambari-agent\nVerifying ambari-
agent process status...\nAmbari Agent successfully started\nAgent PID at:
/var/run/ambari-agent/ambari-agent.pid\nAgent out at: /var/log/ambari-agent
/ambari-agent.out\nAgent log at: /var/log/ambari-agent/ambari-
agent.log\n('INFO 2014-06-24 08:52:46,109 main.py:83 -
loglevel=logging.INFO\nINFO 2014-06-24 08:52:46,110 DataCleaner.py:36 - Data
cleanup thread started\nINFO 2014-06-24 08:52:46,111 DataCleaner.py:71 - Data
cleanup started\nINFO 2014-06-24 08:52:46,111 DataCleaner.py:73 - Data cleanup
finished\nINFO 2014-06-24 08:52:46,235 PingPortListener.py:51 - Ping port
listener started on port: 8670\nINFO 2014-06-24 08:52:46,236 main.py:227 -
Connecting to the server at: https://ip-10-164-165-204.ec2.internal:8440\nINFO
2014-06-24 08:52:46,236 NetUtil.py:72 - DEBUG: Trying to connect to the server
at https://ip-10-164-165-204.ec2.internal:8440\nINFO 2014-06-24 08:52:46,236
NetUtil.py:42 - Connecting to the following url
https://ip-10-164-165-204.ec2.internal:8440/cert/ca\n', None)\n\nConnection to
ip-10-136-91-58.ec2.internal closed.\nSSH command execution
finished\nhost=ip-10-136-91-58.ec2.internal, exitcode=0\nCommand end time
2014-06-24 08:52:48\n"},

{"hostName":"ip-10-95-170-54.ec2.internal","status":"DONE","statusCode":"0",
"log":"==========================\nCopying common functions
script...\n==========================\n\nCommand start time 2014-06-24
08:48:01\n\nWarning: Permanently added
'ip-10-95-170-54.ec2.internal,10.95.170.54' (RSA) to the list of known
hosts.\nscp /usr/lib/python2.6/site-
packages/ambari_commons\nhost=ip-10-95-170-54.ec2.internal,
exitcode=0\nCommand end time 2014-06-24
08:48:17\n\n==========================\nCopying OS type check
script...\n==========================\n\nCommand start time 2014-06-24
08:48:17\n\nscp /usr/lib/python2.6/site-
packages/ambari_server/os_check_type.py\nhost=ip-10-95-170-54.ec2.internal,
exitcode=0\nCommand end time 2014-06-24
08:48:47\n\n==========================\nRunning OS type
check...\n==========================\n\nCommand start time 2014-06-24
08:48:47\nCluster primary/cluster OS type is redhat6 and local/current OS type
is redhat6\n\nConnection to ip-10-95-170-54.ec2.internal closed.\nSSH command
execution finished\nhost=ip-10-95-170-54.ec2.internal, exitcode=0\nCommand end
time 2014-06-24 08:49:22\n\n==========================\nChecking 'sudo'
package on remote host...\n==========================\n\nCommand start time
2014-06-24 08:49:22\nsudo-1.8.6p3-12.el6.x86_64\n\nConnection to
ip-10-95-170-54.ec2.internal closed.\nSSH command execution
finished\nhost=ip-10-95-170-54.ec2.internal, exitcode=0\nCommand end time
2014-06-24 08:49:58\n\n==========================\nCopying repo file to 'tmp'
folder...\n==========================\n\nCommand start time 2014-06-24
08:49:58\n\nscp
/etc/yum.repos.d/ambari.repo\nhost=ip-10-95-170-54.ec2.internal,
exitcode=0\nCommand end time 2014-06-24
08:50:34\n\n==========================\nMoving file to repo
dir...\n==========================\n\nCommand start time 2014-06-24
08:50:34\n\nConnection to ip-10-95-170-54.ec2.internal closed.\nSSH command
execution finished\nhost=ip-10-95-170-54.ec2.internal, exitcode=0\nCommand end
time 2014-06-24 08:50:49\n\n==========================\nCopying setup script
file...\n==========================\n\nCommand start time 2014-06-24
08:50:49\n\nscp /usr/lib/python2.6/site-
packages/ambari_server/setupAgent.py\nhost=ip-10-95-170-54.ec2.internal,
exitcode=0\nCommand end time 2014-06-24
08:51:20\n\n==========================\nRunning setup agent
script...\n==========================\n\nCommand start time 2014-06-24
08:51:20\n/bin/sh: /usr/sbin/ambari-agent: No such file or
directory\nRestarting ambari-agent\nVerifying Python version
compatibility...\nUsing python /usr/bin/python2.6\nambari-agent is not
running. No PID found at /var/run/ambari-agent/ambari-agent.pid\nVerifying
Python version compatibility...\nUsing python /usr/bin/python2.6\nChecking for
previously running Ambari Agent...\nStarting ambari-agent\nVerifying ambari-
agent process status...\nAmbari Agent successfully started\nAgent PID at:
/var/run/ambari-agent/ambari-agent.pid\nAgent out at: /var/log/ambari-agent
/ambari-agent.out\nAgent log at: /var/log/ambari-agent/ambari-
agent.log\n('INFO 2014-06-24 08:51:45,593 main.py:83 -
loglevel=logging.INFO\nINFO 2014-06-24 08:51:45,593 DataCleaner.py:36 - Data
cleanup thread started\nINFO 2014-06-24 08:51:45,594 DataCleaner.py:71 - Data
cleanup started\nINFO 2014-06-24 08:51:45,595 DataCleaner.py:73 - Data cleanup
finished\nINFO 2014-06-24 08:51:45,722 PingPortListener.py:51 - Ping port
listener started on port: 8670\nINFO 2014-06-24 08:51:45,723 main.py:227 -
Connecting to the server at: https://ip-10-164-165-204.ec2.internal:8440\nINFO
2014-06-24 08:51:45,723 NetUtil.py:72 - DEBUG: Trying to connect to the server
at https://ip-10-164-165-204.ec2.internal:8440\nINFO 2014-06-24 08:51:45,723
NetUtil.py:42 - Connecting to the following url
https://ip-10-164-165-204.ec2.internal:8440/cert/ca\n', None)\n\nConnection to
ip-10-95-170-54.ec2.internal closed.\nSSH command execution
finished\nhost=ip-10-95-170-54.ec2.internal, exitcode=0\nCommand end time
2014-06-24 08:51:48\n"}

],"log":"\n\nINFO:root:BootStrapping hosts
['ip-10-164-165-204.ec2.internal',\n 'ip-10-136-91-58.ec2.internal',\n
'ip-10-95-170-54.ec2.internal'] using /usr/lib/python2.6/site-
packages/ambari_server cluster primary OS: redhat6 with user 'ec2-user' sshKey
File /var/run/ambari-server/bootstrap/1/sshKey password File null using tmp
dir /var/run/ambari-server/bootstrap/1 ambari: ip-10-164-165-204.ec2.internal;
server_port: 8080; ambari version: 1.6.1\nINFO:root:Executing parallel
bootstrap\nWARNING:root:Bootstrap at host ip-10-164-165-204.ec2.internal timed
out and will be interrupted\nTraceback (most recent call last):\n File
\"/usr/lib/python2.6/site-packages/ambari_server/bootstrap.py\", line 660, in
<module>\n main(sys.argv)\n File \"/usr/lib/python2.6/site-
packages/ambari_server/bootstrap.py\", line 655, in main\n pbootstrap.run()\n
File \"/usr/lib/python2.6/site-packages/ambari_server/bootstrap.py\", line
582, in run\n bootstrap.interruptBootstrap()\n File \"/usr/lib/python2.6/site-
packages/ambari_server/bootstrap.py\", line 544, in interruptBootstrap\n
self.host_log.write(\"Automatic Agent registration timed out (timeout = {0}
seconds). \" \\\\\nAttributeError: 'NoneType' object has no attribute
'format'\n"}
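
The traceback is consistent with .format() being applied to the return value of write() rather than to the message string: the message is written with a literal {0}, write() returns None, and the call then fails with the AttributeError above. The sketch below reproduces that pattern under stated assumptions and is not the actual bootstrap.py code. HostLog is a hypothetical stand-in for the per-host log object, and the function names are illustrative only.

```python
TIMEOUT_MESSAGE = (
    "Automatic Agent registration timed out (timeout = {0} seconds). "
    "Check your network connectivity and retry registration, "
    "or use manual agent registration."
)


class HostLog(object):
    """Hypothetical stand-in for the per-host log; write() returns None."""

    def __init__(self):
        self.lines = []

    def write(self, text):
        self.lines.append(text)


def write_timeout_buggy(host_log, timeout_seconds):
    # write() runs first with the raw template, then .format() is called
    # on its return value (None), which raises AttributeError.
    host_log.write(TIMEOUT_MESSAGE).format(timeout_seconds)


def write_timeout_fixed(host_log, timeout_seconds):
    # Format the message first, then write the finished string.
    host_log.write(TIMEOUT_MESSAGE.format(timeout_seconds))
```

With the buggy variant the log ends up containing the unreplaced {0} and the interrupt handler dies with an AttributeError, which would explain both the message in Problem #1 and why the host never leaves the RUNNING state.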





--
This message was sent by Atlassian JIRA
(v6.2#6252)
