ambari-user mailing list archives

From Artem Ervits <artemerv...@gmail.com>
Subject Re: Any way to reset Ambari Install Wizard?
Date Wed, 11 Nov 2015 11:58:34 GMT
Here's a recent troubleshooting guide for Ambari:
http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.1.0/bk_ambari_troubleshooting/content/index.html
On Nov 10, 2015 7:00 PM, "Ken Barclay" <kbarclay@ancestry.com> wrote:

> Hi Artem,
>
> We did find some references to kafka in some of the tables you mentioned
> even after we had deleted kafka and kafka_broker from the cluster. So we
> deleted them all, and tried again to install kafka through the Ambari UI,
> but we still get stuck on the same ‘recommended configurations’ page.
> Actually this is true for any service we try to install, not just kafka.
>
> We’d like to better understand the schema for Ambari: do you have some
> documentation on that we could go over?
>
> Thanks
> Ken
>
> From: <dbist13@gmail.com> on behalf of Artem Ervits <artemervits@gmail.com>
> Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
> Date: Friday, October 30, 2015 at 5:48 PM
> To: "user@ambari.apache.org" <user@ambari.apache.org>
> Cc: David Sidlo <DSidlo@ancestry.com>
> Subject: Re: Any way to reset Ambari Install Wizard?
>
> yep, I'd go to database and start deleting those records, check hostcomponentstate,
> hostdesiredstate, servicedesiredstate and I believe servicecomponentstate.
> You can take a backup of the database if you're concerned.
>
> On Fri, Oct 30, 2015 at 7:43 PM, Ken Barclay <kbarclay@ancestry.com>
> wrote:
>
>> Hi Artem/Adam,
>>
>> Thanks very much for your input on this issue: we tried a few other
>> things today.
>>
>> We were finally able to install Spark through the Ambari UI: it was
>> listed in a failed state in the UI, so after doing the upgrade to 2.1.2, I
>> just tried the install again, and this time the install was successful and
>> I was able to start the service.
>>
>> Next we tried to install Kafka.
>> When we install Kafka through the Ambari GUI for the first time, we get stuck
>> in that weird state I mentioned last time, where it won’t proceed beyond
>> “recommended configurations”. Ambari shows it has a Kafka service, but the
>> Broker doesn’t get installed, and there’s no configuration on the file
>> system.
>>
>> If we delete Kafka through the API and re-install through the API, we
>> are able to install the Kafka service and component, then install the
>> component to a node, and it installs successfully. Configuration files were
>> also created on the file system in /etc/kafka, but the configurations were
>> blank in the Ambari UI. (I tried restarting ambari-server but there was no
>> change.) Kafka would not start, however, probably because configurations
>> were missing, and Ambari would not allow us to add or set up the
>> configuration through the UI: it’s just blank.
>>
>> We can see there’s some duplicate key issue in tables (pasted below) when
>> we try to perform these INSERT and DELETE operations. We’re tailing the
>> postgres log.
>>
>> At this point we’ve deleted the service components from the cluster and
>> we’re trying to track down the entries in the tables so we can delete
>> entries associated with the kafka service.
>> We’ll attempt to re-install if we find records that prove to be in the
>> way.
>>
>> We noticed in the logs that when we install Kafka we get the message “kafka-env
>> not found in dictionary” (below), which seems to show there’s a disconnect
>> between service configuration templates and actual service configurations.
>> When we had trouble installing Spark a while ago we saw this same message,
>> except it was “spark-env” that was not found.
>>
>>     raise Fail("Configuration parameter '" + self.name + "' was not found in configurations dictionary!")
>> resource_management.core.exceptions.Fail: Configuration parameter 'kafka-env' was not found in configurations dictionary!
>>
>>
>> ERROR:  update or delete on table "servicecomponentdesiredstate" violates
>> foreign key constraint "hstcomponentstatecomponentname" on table
>> "hostcomponentstate"
>> DETAIL:  Key
>> (component_name,cluster_id,service_name)=(KAFKA_BROKER,2,KAFKA) is still
>> referenced from table "hostcomponentstate".
>> STATEMENT:  DELETE FROM servicecomponentdesiredstate WHERE
>> (((component_name = $1) AND (cluster_id = $2)) AND (service_name = $3))
>> ERROR:  current transaction is aborted, commands ignored until end of
>> transaction block
>> STATEMENT:  SELECT 1
>> ERROR:  duplicate key value violates unique constraint
>> "servicecomponentdesiredstate_pkey"
>> STATEMENT:  INSERT INTO servicecomponentdesiredstate (component_name,
>> desired_state, cluster_id, service_name, desired_stack_id) VALUES ($1, $2,
>> $3, $4, $5)
>> ERROR:  current transaction is aborted, commands ignored until end of
>> transaction block
>> STATEMENT:  SELECT 1
>> ERROR:  update or delete on table "clusterservices" violates foreign key
>> constraint "srvccmponentdesiredstatesrvcnm" on table
>> "servicecomponentdesiredstate"
>> DETAIL:  Key (service_name,cluster_id)=(KAFKA,2) is still referenced from
>> table "servicecomponentdesiredstate".
>> STATEMENT:  DELETE FROM clusterservices WHERE ((cluster_id = $1) AND
>> (service_name = $2))
>> ERROR:  current transaction is aborted, commands ignored until end of
>> transaction block
>> STATEMENT:  SELECT 1
>> ERROR:  duplicate key value violates unique constraint
>> "clusterservices_pkey"
>> STATEMENT:  INSERT INTO clusterservices (service_name, service_enabled,
>> cluster_id) VALUES ($1, $2, $3)
>> ERROR:  current transaction is aborted, commands ignored until end of
>> transaction block
>> STATEMENT:  SELECT 1
>> ERROR:  duplicate key value violates unique constraint
>> "servicecomponentdesiredstate_pkey"
>> STATEMENT:  INSERT INTO servicecomponentdesiredstate (component_name,
>> desired_state, cluster_id, service_name, desired_stack_id) VALUES ($1, $2,
>> $3, $4, $5)
>> ERROR:  current transaction is aborted, commands ignored until end of
>> transaction block
>> STATEMENT:  SELECT 1
>> LOG:  received SIGHUP, reloading configuration files
>>
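The two foreign-key errors in the log above suggest an order for manual cleanup: rows in a referencing table have to go before the rows they point to. The following is a minimal sketch of that ordering, assuming Python 3.9+ for `graphlib`. The table names come from the log; the `WHERE` columns and the `%s` placeholder style are assumptions, and the code only prints statements rather than touching a database.

```python
from graphlib import TopologicalSorter

# Parent table -> set of child tables whose rows reference it, as reported by
# the two foreign-key errors in the postgres log above.
REFERENCED_BY = {
    "servicecomponentdesiredstate": {"hostcomponentstate"},
    "clusterservices": {"servicecomponentdesiredstate"},
}


def deletion_order(referenced_by):
    """Return table names ordered so referencing (child) tables come first."""
    # TopologicalSorter emits a node only after all of its predecessors, so
    # listing the children as predecessors yields a child-before-parent order.
    return list(TopologicalSorter(referenced_by).static_order())


def delete_statements(tables):
    """Build parameterised DELETEs; the WHERE columns are an assumption."""
    return [
        f"DELETE FROM {table} WHERE cluster_id = %s AND service_name = %s"
        for table in tables
    ]


if __name__ == "__main__":
    for statement in delete_statements(deletion_order(REFERENCED_BY)):
        print(statement)
```

Running the statements inside a single transaction, after taking a database backup, would keep a failed cleanup from leaving the tables half-edited.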
>> We’ll let you know if we make more progress.
>>
>> Cheers
>> Ken
>>
>> From: <dbist13@gmail.com> on behalf of Artem Ervits <
>> artemervits@gmail.com>
>> Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
>> Date: Friday, October 30, 2015 at 9:32 AM
>>
>> To: "user@ambari.apache.org" <user@ambari.apache.org>
>> Subject: Re: Any way to reset Ambari Install Wizard?
>>
>> I am guessing his issues are with the Ambari database, and he's concerned
>> about making any kind of changes in the database directly. I'm trying to
>> narrow the issue down so that we delete just the bad row. In that sense,
>> upgrading Ambari is not a big deal. Resetting the database, creating a new
>> cluster and importing data is a big deal. What I would do is take account
>> of all the services he has
>> running. Once he knows what should be in Ambari and what shouldn't, go
>> through every table in the Ambari database and see if that service or any
>> reference to it exists. Purge that row and see where that takes you. I
>> personally had similar issues in Ambari with earlier
>> releases; 2.1.2 addressed many issues in the UI and in configuration.
>>
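One way to "go through every table" is to let PostgreSQL list the tables that carry a service column and then count matching rows in each. This is only a sketch: `information_schema.columns` is a standard PostgreSQL catalog view, but the `public` schema, the `service_name` column (seen in the log for some tables only) and the `%s` placeholder style are assumptions. The queries are read-only.

```python
# Standard PostgreSQL catalog query: list every table in a schema that has a
# given column. Intended parameters: ("public", "service_name"), both assumed.
TABLES_WITH_COLUMN_SQL = (
    "SELECT table_name FROM information_schema.columns "
    "WHERE table_schema = %s AND column_name = %s ORDER BY table_name"
)


def count_query(table):
    """Build a read-only query counting rows that mention one service."""
    # Table names cannot be passed as query parameters, so accept only plain
    # identifiers before interpolating them into the statement.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    return f"SELECT COUNT(*) FROM {table} WHERE service_name = %s"


def audit_queries(tables):
    """Return one count query per table returned by the catalog query."""
    return [count_query(table) for table in tables]


if __name__ == "__main__":
    print(TABLES_WITH_COLUMN_SQL)
    for query in audit_queries(["clusterservices", "servicecomponentdesiredstate"]):
        print(query)
```

Each count query would be run with the service name (for example `KAFKA`) as its single parameter; a non-zero count marks a table worth inspecting by hand.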
>> On Fri, Oct 30, 2015 at 11:51 AM, Adam Gover <agover@blackberry.com>
>> wrote:
>>
>>> Hi Artem,
>>>
>>> Valid point.  I was surprised you suggested he update to 2.1.2 in the
>>> midst of this, however. Doesn’t that increase the risk of further problems?
>>>
>>> Thanks
>>>
>>> Adam
>>>
>>>
>>> From: <dbist13@gmail.com> on behalf of Artem Ervits <
>>> artemervits@gmail.com>
>>> Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
>>> Date: Friday, October 30, 2015 at 11:25 AM
>>>
>>> To: "user@ambari.apache.org" <user@ambari.apache.org>
>>> Subject: Re: Any way to reset Ambari Install Wizard?
>>>
>>> Note that if a bad config is included in your JSON, which may happen when
>>> you gather the configs, it may come back once you reset and reapply, and
>>> all these steps will be useless. We need to figure out what the issue is. I
>>> want him to avoid going the reset route until we exhaust every other option.
>>>
>>> On Fri, Oct 30, 2015 at 10:47 AM, Adam Gover <agover@blackberry.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> Hi There Ken,
>>>>
>>>> Let’s try this again… now actually complete
>>>>
>>>> So I’ve been following along on this thread hoping someone would come
>>>> back with a better solution than the one I have.  Since I haven’t seen any,
>>>> I’ll provide the details of my solution.
>>>>
>>>> Prereqs/comments:
>>>>
>>>>    - tested only on Ambari 2.1.2 – but should work on Ambari 2+ (also
>>>>    will work with some tweaks on 1.6, but won’t work on 1.7)
>>>>    - Tested using external postgres database but should also work with
>>>>    mysql
>>>>    - Test this on your own as it tends to have issues under some
>>>>    circumstances
>>>>
>>>>
>>>> I can’t provide the code I use to accomplish all this – but I’ll provide
>>>> an outline which should allow you to do the same thing.
>>>>
>>>> General info:
>>>> Base path for access to rest api is:
>>>> http://<ambari host>:8080/api/v1
>>>>
>>>> This can be accessed using a standard curl call similar to:
>>>> curl -u admin:admin -H 'X-Requested-By: ambari' http://<ambari host>:8080/api/v1
>>>>
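For scripting, the same call can be prepared with the Python standard library. This is a minimal sketch, not the author's code: `build_request` is a hypothetical helper, the `admin:admin` credentials and port 8080 are taken from the curl example, and the request is only built, not sent.

```python
import base64
import json
import urllib.request


def build_request(host, path, method="GET", body=None,
                  user="admin", password="admin"):
    """Build (but do not send) a request against the Ambari REST base path."""
    url = f"http://{host}:8080/api/v1/{path.lstrip('/')}"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {
        "Authorization": f"Basic {token}",  # equivalent of curl -u user:password
        "X-Requested-By": "ambari",         # equivalent of curl -H
    }
    # Serialise an optional JSON body, the equivalent of curl -d '...'.
    data = json.dumps(body).encode() if body is not None else None
    return urllib.request.Request(url, data=data, headers=headers, method=method)


if __name__ == "__main__":
    request = build_request("ambari.example.com", "/clusters")
    print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would actually perform the call; keeping construction separate makes it easy to review what will be sent first.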
>>>> Where I need to point to a path, I’ll just say “goto rest” and provide the
>>>> additional path info (any options needed must be inserted before
>>>> the URL).  Also note that in some cases I’m copying parts of the scripts I’m
>>>> using, so the values of the variables need to be populated.
>>>>
>>>>
>>>>    1. Back up all external databases (hive/oozie/ambari)
>>>>    2. Back up the filesystem after forcing a checkpoint
>>>>    3. Before taking ambari down, collect a complete set of configs:
>>>>       1. Get list of all configs available
>>>>       Goto rest:
>>>>       clusters/${cluster_name}/?fields=Clusters/desired_configs
>>>>       2. Using the list retrieve ALL the json config files for the
>>>>       cluster
>>>>       Goto rest:
>>>>       http://${ambari_host}:8080/api/v1/clusters/${cluster_name}/configurations?(type=${config_type}&tag=${tag})
>>>>
>>>>
>>>>       So cluster_name=your defined cluster name,
>>>>       config_type=config_filename, tag=the most recent version of this config
>>>>       file (this is provided by the first rest call)
>>>>
>>>>       Note that the output here is NOT usable directly – you will need
>>>>       to slightly reformat these files prior to reimporting them
>>>>
>>>>
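The config collection step can be sketched as two small functions: one that turns the `desired_configs` listing into one URL per config type, and one that trims a returned config down for re-import. Both response shapes used here are assumptions, since the message does not show them, and `to_desired_config` is only one plausible reading of "slightly reformat"; the function names are hypothetical.

```python
def config_urls(ambari_host, cluster_name, desired_configs_response):
    """Map each config type to the URL that returns its current version."""
    # Assumed shape: {"Clusters": {"desired_configs": {type: {"tag": ...}}}}
    desired = desired_configs_response["Clusters"]["desired_configs"]
    base = f"http://{ambari_host}:8080/api/v1/clusters/{cluster_name}"
    return {
        config_type: f"{base}/configurations?(type={config_type}&tag={info['tag']})"
        for config_type, info in desired.items()
    }


def to_desired_config(item):
    """Reduce one returned config item to its type, tag and properties."""
    # Assumed shape of item: {"type": ..., "tag": ..., "properties": {...}, ...}
    return {
        "Clusters": {
            "desired_config": {
                "type": item["type"],
                "tag": item["tag"],
                "properties": item.get("properties", {}),
            }
        }
    }


if __name__ == "__main__":
    listing = {"Clusters": {"desired_configs": {"kafka-env": {"tag": "version1"}}}}
    for config_type, url in config_urls("ambari.example.com", "c1", listing).items():
        print(config_type, url)
```

Saving both the raw response and the trimmed payload per config type keeps the original available if the reformatting turns out to need adjustment.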
>    1. Next, shut down ambari
>    2. On the command line as root execute “ambari-server reset”
>    3. Set up the base cluster name:
>    Goto rest: OPTION -d '{"Clusters":{"version":"HDP-2.2"}}'
>     /clusters/${cluster_name}
>    4. For each host on the cluster – add it to the cluster
>    Goto rest: OPTION -X POST /clusters/${cluster_name}/hosts/${hostname}
>    5. Push ALL your configs captured earlier (part 1, step 3) to the
>    cluster.
>    You may want to use this:
>    /var/lib/ambari-server/resources/scripts/configs.sh
>
>    NOTE I do this using perl – it’s basically a raw read that pushes (using
>    PUT) into
>    Goto rest: OPTION -X PUT /clusters/${cluster_name}
>    6. Next add each service & its associated components
>
>    To add service:
>    Goto rest: OPTION -X POST
>    /clusters/${cluster_name}/services/${service_name}
>
>    To add component:
>    Goto rest: OPTION -X POST
>    /clusters/${cluster_name}/services/${service_name}/components/${component}
>    7. Next, for each host apply the required components using the following 2
>    rest calls
>    Goto rest: OPTION -X POST
>    /clusters/${cluster_name}/hosts/${hostname}/host_components/${component}
>    Goto rest: OPTION -X PUT OPTION -d
>    '{"HostRoles":{"state":"INSTALLED"}}'
>    /clusters/${cluster_name}/hosts/${hostname}/host_components/${component}
>    8. Next set cluster status
>    Goto rest: OPTION -X POST OPTION -d '{"CLUSTER_CURRENT_STATUS":
>    "{\"clusterState\":\"CLUSTER_STARTED_5\"}"}'  /persist
>    9. Now set each service to an installed state
>    Goto rest: OPTION -X PUT OPTION -d
>    '{"ServiceInfo":{"state":"INSTALLED"}}'
>     /clusters/${cluster_name}/services/${service_name}
>    10. Finally set the cluster itself to INSTALLED – this (as far as I
>    know) is best done using SQL - I’m sure there is a rest call
>    but I haven’t found it yet
>    UPDATE clusters SET
>    provisioning_state='INSTALLED', security_type='NONE' WHERE
>    cluster_name='${cluster_name}'
>
>
> ALL of this can be automated, and unless you have a 2-node cluster I would automate it.
>
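As a starting point for that automation, the REST calls of the second list (steps 3, 4 and 6 to 9) can be generated as an ordered plan and reviewed before anything is sent. This is a sketch under assumptions: `rebuild_plan` and its two input mappings are hypothetical, step 3 is emitted as POST because curl uses POST when `-d` is given without `-X`, and the config push (step 5) and the SQL update (step 10) are left out.

```python
import json


def rebuild_plan(cluster, hosts, services):
    """Return the REST calls as ordered (method, path, body) tuples.

    hosts maps hostname -> list of components on that host; services maps
    service name -> list of its components. Both inputs are hypothetical.
    """
    base = f"/clusters/{cluster}"
    plan = [("POST", base, {"Clusters": {"version": "HDP-2.2"}})]   # step 3
    plan += [("POST", f"{base}/hosts/{host}", None) for host in hosts]  # step 4
    for service, components in services.items():                   # step 6
        plan.append(("POST", f"{base}/services/{service}", None))
        for component in components:
            path = f"{base}/services/{service}/components/{component}"
            plan.append(("POST", path, None))
    for host, components in hosts.items():                         # step 7
        for component in components:
            path = f"{base}/hosts/{host}/host_components/{component}"
            plan.append(("POST", path, None))
            plan.append(("PUT", path, {"HostRoles": {"state": "INSTALLED"}}))
    # Step 8: the persisted value is itself a JSON document stored as a string.
    state = json.dumps({"clusterState": "CLUSTER_STARTED_5"})
    plan.append(("POST", "/persist", {"CLUSTER_CURRENT_STATUS": state}))
    for service in services:                                       # step 9
        body = {"ServiceInfo": {"state": "INSTALLED"}}
        plan.append(("PUT", f"{base}/services/{service}", body))
    return plan


if __name__ == "__main__":
    demo = rebuild_plan("c1", {"node1": ["KAFKA_BROKER"]}, {"KAFKA": ["KAFKA_BROKER"]})
    for method, path, body in demo:
        print(method, path, json.dumps(body) if body is not None else "")
```

Printing the plan first gives a dry run; each tuple can then be sent with whatever HTTP client the script already uses.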
> NOTES – this process will work with HA & kerberized clusters but will need
> additional steps (especially for kerberos)
>
> Anyways I hope this helps – it’s complicated but doable and will save you
> copying/rebuilding, which in larger clusters is really not doable.
>
> Cheers
> Adam
>
>
> From: Ken Barclay <kbarclay@ancestry.com>
> Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
> Date: Friday, October 30, 2015 at 2:34 AM
> To: "user@ambari.apache.org" <user@ambari.apache.org>
> Subject: Re: Any way to reset Ambari Install Wizard?
>
> Hi Artem,
>
> I upgraded all Ambari components to 2.1.2, restarted everything, and after
> logging in, restarted all components where it was indicated.
>
> I tried the Add Service wizard for Kafka, and got to the page that allows
> me to assign masters and such, but clicking Next after that takes me to
> Customize Services, which gets stuck because the Next button on that page
> is never sensitized.  It just freezes there, saying it has recommended
> configurations, with the update icon spinning in the middle. All I can do
> is click Back at that point.
>
> Anything else I can try?
>
> Thanks
> Ken
>
> From: Artem Ervits <dbist13@gmail.com>
> Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
> Date: Thursday, October 29, 2015 at 1:55 PM
> To: "user@ambari.apache.org" <user@ambari.apache.org>
> Subject: Re: Any way to reset Ambari Install Wizard?
>
> Please upgrade to the latest 2.1.2 and restart all agents and the Ambari server.
> Press Ctrl-Shift-R in the browser after you navigate to the Ambari URL. Log in
> and let me know if it still shows the same problem.
> On Oct 29, 2015 10:19 AM, "Ken Barclay" <kbarclay@ancestry.com> wrote:
>
>> Hi Artem,
>>
>> I started with 2.0.1, and upgraded it to 2.1 back in August.
>>
>> From: Artem Ervits <dbist13@gmail.com>
>> Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
>> Date: Thursday, October 29, 2015 at 2:09 AM
>> To: "user@ambari.apache.org" <user@ambari.apache.org>
>> Subject: Re: Any way to reset Ambari Install Wizard?
>>
>> What version of Ambari are you running?
>> On Oct 27, 2015 6:51 PM, "Ken Barclay" <kbarclay@ancestry.com> wrote:
>>
>>> Hello,
>>>
>>> I’m returning to an issue we’ve left hanging since July – we now have to
>>> fix Ambari on this cluster or take the whole cluster down and reinstall
>>> from scratch.
>>>
>>> Our situation is that although our HDP 2.2 cluster is running well,
>>> Ambari cannot be used to install anything because the wizard is broken.
>>>
>>> I did a restart of Ambari server and agents per Artem, but without
>>> knowing exactly what changes to make to the postgres tables I’m reluctant
>>> to try that part. We also tried to add a new component (Spark) using the
>>> Ambari API instead of the wizard, but that also failed, as did trying to
>>> remove the Spark (again via the API) that had failed to install.
>>>
>>> We have 1.5T of monitoring data on this 4-node cluster that we want to
>>> preserve. The cluster is dedicated to storing metrics in HBase via OpenTSDB
>>> and that is all it is used for.
>>>
>>> I just want to confirm with the group that since Ambari can only be used
>>> to manage a cluster that it installed itself, our best option in this
>>> scenario would be to:
>>>
>>> Shut down monitoring
>>> Copy all the data to another cluster
>>> Completely remove Ambari and HDP per
>>> https://cwiki.apache.org/confluence/display/AMBARI/Host+Cleanup+for+Ambari+and+Stack
>>> Do a fresh install of HDP 2.2 using the latest Ambari, and
>>> Copy the data back to the new cluster.
>>>
>>> Please let us know if this is a valid approach
>>> Thanks
>>>
>>> Ken
>>>
>>>
>>>
>>> From: <dbist13@gmail.com> on behalf of Artem Ervits <
>>> artemervits@gmail.com>
>>> Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
>>> Date: Tuesday, July 28, 2015 at 12:48 PM
>>> To: "user@ambari.apache.org" <user@ambari.apache.org>
>>> Subject: Re: Any way to reset Ambari Install Wizard?
>>>
>>> Try restarting the Ambari server and agents, then stop and start the
>>> services; sometimes services need to announce to Ambari that they're
>>> installed. Always refer to the ambari-server log. Worst case scenario,
>>> delete the Ambari_metrics service with the API and clean up the postgres DB
>>> manually; the tables to concentrate on are hostservicedesiredstate,
>>> servicedesiredstate etc. This should be a last resort.
>>>
>>> On Tue, Jul 28, 2015 at 3:11 PM, Benoit Perroud <benoit@noisette.ch>
>>> wrote:
>>>
>>>> Some manual update in DB is most likely needed.
>>>>
>>>> *WARNING* use this at your own risk
>>>>
>>>> The table that needs to be updated is cluster_version.
>>>>
>>>> As far as I tested 2.1, it required less manual intervention than
>>>> 2.0.1. Upgrade has a retry button for most of the steps, and this is really
>>>> cool.
>>>>
>>>> Hope this helps.
>>>>
>>>> Benoit
>>>>
>>>>
>>>>
>>>> 2015-07-28 20:01 GMT+02:00 Ken Barclay <kbarclay@ancestry.com>:
>>>>
>>>>> Hello,
>>>>>
>>>>> I upgraded a small test cluster from HDP 2.1 to HDP 2.2 and Ambari
>>>>> 2.0.1. In following the steps to replace Nagios + Ganglia with the Ambari
>>>>> Metrics System using the Ambari Wizard, an install failure occurred on one
>>>>> node due to an outdated glibc library. I updated glibc and verified the
>>>>> Metrics packages could be installed, but couldn’t go back and finish the
>>>>> installation through the wizard. The problem is: it flags some of the
>>>>> default settings, saying they need to be changed, but it skips past the
>>>>> screen very quickly that enables those settings to be changed, without
>>>>> allowing anything to be entered. So the button that allows you to proceed
>>>>> with the installation never becomes enabled.
>>>>>
>>>>> I subsequently manually finished the Metrics installation using the
>>>>> Ambari API and have it running in Distributed mode. But Ambari’s wizard
>>>>> cannot be used for anything now: the same problem described above
>>>>> occurs for every service I try to install.
>>>>>
>>>>> Can Ambari be reset somehow in this situation, or do I need to
>>>>> reinstall it?
>>>>> Or do you recommend installing 2.1?
>>>>>
>>>>> Thanks
>>>>> Ken
>>>>>
>>>>
>>>>
>>>
>
>
>
