From: Dustine Rene Bernasor
To: ambari-user@incubator.apache.org
Date: Tue, 05 Mar 2013 16:15:19 +0800
Subject: Re: Trouble during deploy
Message-ID: <5135A997.8060108@thecyberguardian.com>

Hello,

I did a reset again because I didn't see any progress after two hours had passed.

This time, the installation looks good without anything strange (so far).

Thanks.

Dustine

On 3/5/2013 3:14 PM, Yusaku Sako wrote:
Hi Dustine,

That's a strange place for the install process to get stuck at.
Can you try refreshing the page in your browser?  Does it continue making progress?
If something fails, you would see the progress bar turn red (fatal error) or orange (warning).

Yusaku

On Mon, Mar 4, 2013 at 10:18 PM, Dustine Rene Bernasor <dustine@thecyberguardian.com> wrote:
Hello,

I tried stopping the Ambari server, then resetting, then starting it.
Did everything from scratch and this time, after clicking the Deploy button,
I am redirected to the Install, Start and Test page. Installation proceeds
but after a certain point, I am stuck.

Crawler51 9% Installing JobTracker
Crawler52 11% Installing HDFS Client
Crawler53 16% Installing MapReduce Client

I am getting the following from stdout:

warning: Could not retrieve fact fqdn
warning: Host is missing hostname and/or domain: crawler51
warning: Dynamic lookup of $service_state at /var/lib/ambari-agent/puppet/modules/hdp-hadoop/manifests/init.pp:161 is deprecated.  Support will be removed in Puppet 2.8.  Use a fully-qualified variable name (e.g., $classname::variable) or parameterized classes.
warning: Dynamic lookup of $service_state at /var/lib/ambari-agent/puppet/modules/hdp-hadoop/manifests/service.pp:74 is deprecated.  Support will be removed in Puppet 2.8.  Use a fully-qualified variable name (e.g., $classname::variable) or parameterized classes.
warning: Dynamic lookup of $service_state at /var/lib/ambari-agent/puppet/modules/hdp-hadoop/manifests/service.pp:83 is deprecated.  Support will be removed in Puppet 2.8.  Use a fully-qualified variable name (e.g., $classname::variable) or parameterized classes.
warning: Dynamic lookup of $ambari_db_server_host is deprecated.  Support will be removed in Puppet 2.8.  Use a fully-qualified variable name (e.g., $classname::variable) or parameterized classes.
notice: /Stage[1]/Hdp::Snappy::Package/Hdp::Snappy::Package::Ln[32]/Hdp::Exec[hdp::snappy::package::ln 32]/Exec[hdp::snappy::package::ln 32]/returns: executed successfully
notice: /Stage[2]/Hdp-hadoop::Initialize/Configgenerator::Configfile[core-site]/File[/etc/hadoop/conf/core-site.xml]/content: content changed '{md5}aa21ba6ff20cc6766211e37e4f364395' to '{md5}4a8180bd03474a5be7e13a3530ab641a'
notice: /Stage[2]/Hdp-hadoop::Initialize/Configgenerator::Configfile[mapred-site]/File[/etc/hadoop/conf/mapred-site.xml]/content: content changed '{md5}864fa2060a7271cca6769742fdf00b16' to '{md5}ae167014591c96734bba8a438f805548'
notice: Finished catalog run in 1.55 seconds
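
A quick way to check whether output like this contains an actual failure is to count lines by severity. The sketch below is only illustrative: it assumes failures would carry an "err:" prefix in this Puppet version, and the sample is a shortened copy of the output quoted above.

```python
from collections import Counter

# Severity prefixes seen in the quoted output, plus "err", which is an
# assumption about how this Puppet version labels failures.
LEVELS = ("err", "warning", "notice")


def count_levels(output: str) -> Counter:
    """Count output lines by their leading severity prefix."""
    counts = Counter()
    for line in output.splitlines():
        prefix, sep, _ = line.strip().partition(":")
        if sep and prefix in LEVELS:
            counts[prefix] += 1
    return counts


# Shortened sample taken from the stdout quoted in the message.
sample = """\
warning: Could not retrieve fact fqdn
warning: Host is missing hostname and/or domain: crawler51
notice: Finished catalog run in 1.55 seconds
"""

counts = count_levels(sample)
print(dict(counts))
```

With only warnings and notices present, the output by itself does not point to a failed step on this host.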

My nodes do not have an FQDN since I have no other IP I can use for the domain.
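
The two fqdn warnings suggest the host name has no domain part. A minimal sketch of that check follows; `has_domain` and the `example.com` name are hypothetical, introduced only for illustration, and the rule used here (at least one dot-separated domain label) is a simplification.

```python
import socket


def has_domain(name: str) -> bool:
    """Return True if the name has at least one dot-separated domain part."""
    labels = name.rstrip(".").split(".")
    return len(labels) > 1 and all(labels)


def local_name_has_domain() -> bool:
    """Apply the same check to what this machine reports for itself.

    The result depends on the local resolver and /etc/hosts configuration.
    """
    return has_domain(socket.getfqdn())


# The bare host name from the warning, and a made-up FQDN for contrast.
print(has_domain("crawler51"))
print(has_domain("crawler51.example.com"))
```

Running `local_name_has_domain()` on each node would show whether that node reports a name with a domain part.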

Thanks.

Dustine




On 3/5/2013 11:20 AM, Dustine Rene Bernasor wrote:
Hello Yusaku,

When I click the Deploy button, a loader gif appears (sometimes) but I am stuck on the same screen.
I am not redirected to the Install, Start and Test page.

I will try "ambari-server stop" first, then reset, then start, and see if I still get the same problem.
If I still get it, I might have to switch to 1.2.1 as you suggested.

By the way, I have attached the ambari-server log.

Thanks.

Dustine

On 3/5/2013 11:01 AM, Yusaku Sako wrote:
Hi Dustine,

What happens after you click on the Deploy button?  Does it just get stuck on the same screen?  Or does it go to the "Install, Start and Test" page with progress bars?
If you can post /var/log/ambari-server/ambari-server.log, it would be helpful to troubleshoot.

Also, it sounds like you are using Ambari 1.2.0?
With 1.2.0, if deploy gets stuck, you should run "ambari-server stop", followed by "ambari-server reset", then "ambari-server start". Then clear the browser cache and go to http://<ambari-server>:8080.
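
The stop, reset, start order can be sketched as a small script. This is a sketch under assumptions, not an official procedure: `run_sequence` and its `runner` parameter are hypothetical names, and it assumes `ambari-server` is on the PATH and exits non-zero on failure.

```python
import subprocess

# The three commands named in the message, in the order given.
RESET_SEQUENCE = (
    ("ambari-server", "stop"),
    ("ambari-server", "reset"),
    ("ambari-server", "start"),
)


def run_sequence(commands=RESET_SEQUENCE, runner=subprocess.run):
    """Run each command in order, stopping at the first failure.

    `runner` is injectable so the order can be checked without a real
    Ambari installation; by default it is subprocess.run.
    """
    for command in commands:
        # check=True raises CalledProcessError on a non-zero exit status,
        # so a failed stop is not followed by a reset.
        runner(list(command), check=True)
```

Calling `run_sequence()` on the Ambari host needs the same privileges as the commands themselves. The reset step may ask for confirmation, which this sketch leaves to the terminal.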

BTW, Ambari 1.2.1 handles retrying deploy much better than 1.2.0.
If deploy gets stuck for whatever reason, you can hit refresh on the browser and hit "Deploy" again (no need to do "ambari-server reset", etc).
You will not get a message saying you already have a cluster with the same name, etc.
I highly recommend trying out 1.2.1, rather than 1.2.0 (if you are not already).  In addition to handling retries better, it has 136 fixes over 1.2.0: https://issues.apache.org/jira/issues/?jql=fixVersion%20%3D%20%221.2.1%22%20AND%20project%20%3D%20AMBARI

Yusaku

On Mon, Mar 4, 2013 at 6:31 PM, Dustine Rene Bernasor <dustine@thecyberguardian.com> wrote:
Hello,

I am trying to deploy a Hadoop cluster with 3 nodes using Ambari.

This is my set-up:

HDFS
  NameNode: NodeA
  SecondaryNameNode: NodeA
  DataNodes: 2 hosts

MapReduce
  JobTracker: NodeA
  TaskTracker: 2 hosts

Nagios
  Server: NodeA

Ganglia
  Server: NodeA

However, after clicking the deploy button, the process seems to be stuck.

I got something like this on the server log:

\"component\":\"JOBTRACKER\",\"hostName\":\"Crawler51\",\"serviceId\":\"MAPREDUCE\",\"isInstalled\":false},{\"display_name\":\"Nagios Server\",\"component\":\"NAGIOS_SERVER\",\"hostName\":\"Crawler51\",\"serviceId\":\"NAGIOS\",\"isInstalled\":false},{\"display_name\":\"Ganglia Collector\",\"component\":\"GANGLIA_SERVER\",\"hostName\":\"Crawler51\",\"serviceId\":\"GANGLIA\",\"isInstalled\":false}],\"slaveComponentHosts\":[{\"componentName\":\"DATANODE\",\"displayName\":\"DataNode\",\"hosts\":[{\"hostName\":\"Crawler52\",\"group\":\"Default\",\"isInstalled\":false},{\"hostName\":\"Crawler53\",\"group\":\"Default\",\"isInstalled\":false}]},{\"componentName\":\"TASKTRACKER\",\"displayName\":\"TaskTracker\",\"hosts\":[{\"hostName\":\"Crawler52\",\"group\":\"Default\",\"isInstalled\":false},{\"hostName\":\"Crawler53\",\"group\":\"Default\",\"isInstalled\":false}]},{\"componentName\":\"CLIENT\",\"displayName\":\"client\",\"hosts\":[{\"hostName\":\"Crawler52\",\"group\":\"Default\",\"isInstalled\":false},{\"hostName\":\"Crawler53\",\"group\":\"Default\",\"isInstalled\":false}]}]},\"AddHost\":{},\"AddService\":{}}}"}

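The escaped quotes in the fragment suggest a JSON document stored as a string inside another JSON document. A minimal sketch of reading such a line follows, assuming that structure: the outer key `data` and the sample values are made up, and only the inner field names come from the fragment.

```python
import json

# Hypothetical sample: the outer key "data" is an assumption; the inner
# field names are taken from the log fragment in the message.
inner = {
    "slaveComponentHosts": [
        {
            "componentName": "DATANODE",
            "hosts": [
                {"hostName": "Crawler52", "isInstalled": False},
                {"hostName": "Crawler53", "isInstalled": False},
            ],
        }
    ]
}
log_line = json.dumps({"data": json.dumps(inner)})


def pending_hosts(line: str, key: str = "data") -> list:
    """Return (component, host) pairs whose isInstalled flag is false."""
    outer = json.loads(line)          # first pass: the outer document
    payload = json.loads(outer[key])  # second pass: the embedded string
    return [
        (group["componentName"], host["hostName"])
        for group in payload["slaveComponentHosts"]
        for host in group["hosts"]
        if not host["isInstalled"]
    ]


print(pending_hosts(log_line))
```

The fragment in the message is cut off at the start, so it cannot be parsed as-is; the complete log line would be needed.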

So after waiting for hours and hours, I tried to do it all over again. First I did a reset (ambari-server reset) on the Ambari host
then did everything from scratch. When I reach the Deploy part, this time, I get a message that a cluster with the same name already exists.

Here are my questions:
1. What should I do about the stuck deploy?
2. How do I remove the cluster that supposedly exists already? When I log in to Ambari, I am redirected to the install wizard.


Thanks.

Dustine