ambari-dev mailing list archives

From "Andrew Onischuk (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (AMBARI-6702) Ambari detects RPM DB corruption
Date Tue, 12 Aug 2014 11:15:11 GMT

     [ https://issues.apache.org/jira/browse/AMBARI-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Onischuk reassigned AMBARI-6702:
---------------------------------------

    Assignee: Andrew Onischuk  (was: Alejandro Fernandez)

> Ambari detects RPM DB corruption
> --------------------------------
>
>                 Key: AMBARI-6702
>                 URL: https://issues.apache.org/jira/browse/AMBARI-6702
>             Project: Ambari
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 1.5.0
>            Reporter: Alejandro Fernandez
>            Assignee: Andrew Onischuk
>             Fix For: 1.7.0
>
>
> Users have described scenarios in which the RPM DB becomes corrupt, usually after stopping all services, rebooting all hosts (including the server), and restarting all services.
> http://hortonworks.com/community/forums/topic/cant-restart-cluster-ambari-not-proving-useful/
> http://hortonworks.com/community/forums/topic/ambari-corrupts-rpmdb/
> * Problem: yum commands fail to run because the RPM database is corrupt.
> * Symptom: The Ambari agent log shows entries like the following, and rpm commands run on the host fail:
> {code}
> INFO 2014-04-24 05:30:11,051 Controller.py:186 - RegistrationCommand received - repeat agent registration
> ERROR 2014-04-24 05:33:22,669 PackagesAnalyzer.py:43 - Task timed out and will be killed
> INFO 2014-04-24 05:35:12,815 HostCheckReportFileHandler.py:43 - Host check report at /var/lib/ambari-agent/data/hostcheck.result
> INFO 2014-04-24 05:35:12,845 HostCheckReportFileHandler.py:104 - Removing old host check file at /var/lib/ambari-agent/data/hostcheck.result
> INFO 2014-04-24 05:35:12,845 HostCheckReportFileHandler.py:109 - Creating host check file at /var/lib/ambari-agent/data/hostcheck.result
> root@xhadoopm32p rpm# rpm -qa
> rpmdb: Thread/process 30282/xx failed: Thread died in Berkeley DB library
> error: db3 error(30974) from dbenv>failchk: DB_RUNRECOVERY: Fatal error, run database recovery
> error: cannot open Packages index using db3 - (-30974)
> error: cannot open Packages database in /var/lib/rpm
> rpmdb: Thread/process 30282/xx failed: Thread died in Berkeley DB library
> error: db3 error(30974) from dbenv>failchk: DB_RUNRECOVERY: Fatal error, run database recovery
> error: cannot open Packages database in /var/lib/rpm
> {code}
> * Fix:
> Run the following commands:
> {code}
> rm /var/lib/rpm/__db*
> rpm --rebuilddb
> {code}
> This appears to be an underlying issue with yum (either a lock is not released, or multiple yum commands are run in parallel). To reduce how often it occurs, the agent's PackagesAnalyzer will increase the time it waits for the "yum list available" and "yum list installed" commands from 10 seconds to 20 seconds.
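
The timeout change described in the issue can be illustrated with a minimal sketch. This is not the agent's actual PackagesAnalyzer code: the function name `run_with_timeout` and its return shape are assumptions made for illustration, and it uses the modern Python 3 `subprocess` API rather than whatever the 2014 agent used. It shows only the general idea of running a command with a wait limit (20 seconds here, matching the new value in the issue) and killing it if the limit is exceeded.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds=20):
    """Run cmd and return (timed_out, returncode, output).

    Hypothetical helper: the name and return shape are illustrative only.
    """
    try:
        completed = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the captured output
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child and waits for it before re-raising.
        return True, None, ""
    return False, completed.returncode, completed.stdout


if __name__ == "__main__":
    # Harmless demonstration using the current Python interpreter as the child.
    # On a yum-based host the command could instead be ["yum", "list", "installed"].
    print(run_with_timeout([sys.executable, "-c", "print('ok')"]))
```

A longer limit does not remove the underlying cause; it only makes it less likely that a slow but healthy command is killed partway through.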


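The symptom quoted in the issue can also be recognized programmatically. The sketch below is an assumption-laden illustration, not part of Ambari: the helper name `looks_like_rpmdb_corruption` and the marker list are hypothetical, and the markers are taken only from the error text quoted in the report above.

```python
# Substrings copied from the rpm error output quoted in the issue.
CORRUPTION_MARKERS = (
    "DB_RUNRECOVERY",
    "cannot open Packages database",
)


def looks_like_rpmdb_corruption(output):
    """Return True if the text contains one of the quoted error markers.

    Hypothetical helper: it inspects text only and does not run rpm itself.
    """
    return any(marker in output for marker in CORRUPTION_MARKERS)
```

The text passed in would typically be the combined stdout and stderr of a command such as `rpm -qa`; a match suggests the recovery steps from the issue are needed before yum commands are retried.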

--
This message was sent by Atlassian JIRA
(v6.2#6252)
