ambari-user mailing list archives

From JOAQUIN GUANTER GONZALBEZ <x...@tid.es>
Subject Re: Initializaton errors on Ambari 1.4.1
Date Tue, 04 Mar 2014 12:25:08 GMT
Hi Siddharth,

I have opened https://issues.apache.org/jira/browse/AMBARI-4930 to track this issue. I cannot
provide a thread dump yet: the problem reproduces in our production environment, where we
cannot stop or restart ambari-server outside a maintenance window. I will try to reproduce it
in another environment and attach the thread dump to the JIRA.

Thanks!
Ximo

From: Siddharth Wagle <swagle@hortonworks.com>
Reply-To: user@ambari.apache.org
Date: Friday, 28 February 2014 19:05
To: user@ambari.apache.org
Subject: Re: Initializaton errors on Ambari 1.4.1

Hi Ximo,

Could you please provide the thread dump when you notice this? (jstack command output)
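
For reference, capturing the requested jstack output can be scripted. The following is a minimal sketch, not an official Ambari procedure: the PID file path and the output file name are assumptions and should be adjusted to the actual installation. jstack ships with the JDK and generally has to be run as the same user that owns the ambari-server JVM.

```python
import subprocess
from pathlib import Path
from typing import List

# Assumption: location of the ambari-server PID file; adjust if your install differs.
PID_FILE = Path("/var/run/ambari-server/ambari-server.pid")


def jstack_command(pid: int) -> List[str]:
    """Build the jstack invocation; -l adds information about held locks."""
    return ["jstack", "-l", str(pid)]


def capture_thread_dump(pid_file: Path = PID_FILE,
                        out_file: Path = Path("ambari-threads.txt")) -> Path:
    """Run jstack against the server JVM and save the dump to out_file."""
    pid = int(pid_file.read_text().strip())
    result = subprocess.run(jstack_command(pid), capture_output=True,
                            text=True, check=True)
    out_file.write_text(result.stdout)
    return out_file
```

Calling capture_thread_dump() while the server is running (or hanging during startup) writes a dump that can be attached to the JIRA; it does not stop the JVM.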

I would be glad to open a Jira to track this issue; if you would rather open it yourself,
that's fine too.

To clean up the Postgres DB, look at the execution_command and host_role_command tables,
which correspond to the requests and tasks.
You could delete just the byte array fields in these tables and then reclaim the disk space
with the VACUUM command.
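
The cleanup described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the bytea column names (command, std_out, std_error) are assumed rather than taken from this thread, so check them against the actual schema (for example with \d host_role_command in psql) and back up the database first. The helper only generates SQL text; it does not connect to the database.

```python
from typing import Dict, List

# Assumption: names of the byte array (bytea) columns in each table.
# Verify these against your Ambari schema before running anything.
BYTEA_COLUMNS: Dict[str, List[str]] = {
    "execution_command": ["command"],
    "host_role_command": ["std_out", "std_error"],
}


def cleanup_statements(columns: Dict[str, List[str]] = BYTEA_COLUMNS,
                       full: bool = False) -> List[str]:
    """Return SQL that empties the bytea columns and then vacuums each table.

    Plain VACUUM makes the freed space reusable by PostgreSQL; VACUUM FULL
    rewrites the table and returns space to the OS, but takes an exclusive lock.
    VACUUM cannot run inside a transaction block, so execute it in autocommit mode.
    """
    vacuum = "VACUUM FULL" if full else "VACUUM"
    statements: List[str] = []
    for table, cols in columns.items():
        assignments = ", ".join(f"{col} = ''" for col in cols)
        statements.append(f"UPDATE {table} SET {assignments};")
        statements.append(f"{vacuum} {table};")
    return statements


if __name__ == "__main__":
    print("\n".join(cleanup_statements()))
```

The printed statements can be reviewed and then fed to psql. Emptying these fields also discards the stored command payloads and task output, so it is probably safest to do this with ambari-server stopped and no requests in progress.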

Best Regards,
Sid


On Thu, Feb 27, 2014 at 10:06 PM, JOAQUIN GUANTER GONZALBEZ <ximo@tid.es> wrote:
Hello,

Since we upgraded to Ambari 1.4.1, we see the following initialization error from time to
time when trying to start ambari-server:


04:44:56,972  INFO [main] Configuration:511 - Web App DIR test /usr/lib/ambari-server/web
04:44:56,975  INFO [main] CertificateManager:70 - Initialization of root certificate
04:44:56,975  INFO [main] CertificateManager:72 - Certificate exists:true
04:44:57,003  INFO [main] AmbariServer:338 - ********* Initializing Clusters **********
04:44:57,285  WARN [Thread-2] HeartbeatMonitor:123 - Heartbeat lost from host andromeda-compute02.hi.inet
04:44:57,295  WARN [Thread-2] HeartbeatMonitor:123 - Heartbeat lost from host andromeda-compute03.hi.inet
04:44:57,296  WARN [Thread-2] HeartbeatMonitor:123 - Heartbeat lost from host andromeda-compute06.hi.inet
04:44:57,296  WARN [Thread-2] HeartbeatMonitor:123 - Heartbeat lost from host andromeda-compute04.hi.inet
04:44:57,297  WARN [Thread-2] HeartbeatMonitor:123 - Heartbeat lost from host andromeda-data99.hi.inet
04:44:57,318 ERROR [main] AmbariServer:461 - Failed to run the Ambari Server
Local Exception Stack:
Exception [EclipseLink-2004] (Eclipse Persistence Services - 2.4.0.v20120608-r11652): org.eclipse.persistence.exceptions.ConcurrencyException
Exception Description: A signal was attempted before wait() on ConcurrencyManager. This normally means that an attempt was made to commit or rollback a transaction before it was started, or to rollback a transaction twice.
        at org.eclipse.persistence.exceptions.ConcurrencyException.signalAttemptedBeforeWait(ConcurrencyException.java:84)
        at org.eclipse.persistence.internal.helper.ConcurrencyManager.releaseReadLock(ConcurrencyManager.java:489)
        at org.eclipse.persistence.internal.identitymaps.CacheKey.releaseReadLock(CacheKey.java:392)
        at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.cloneAndRegisterObject(UnitOfWorkImpl.java:1022)
        at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.cloneAndRegisterObject(UnitOfWorkImpl.java:933)
        at org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor.getAndCloneCacheKeyFromParent(UnitOfWorkIdentityMapAccessor.java:193)
        at org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor.getFromIdentityMap(UnitOfWorkIdentityMapAccessor.java:121)
        at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.registerExistingObject(UnitOfWorkImpl.java:3906)
        at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.registerExistingObject(UnitOfWorkImpl.java:3861)
        at org.eclipse.persistence.mappings.CollectionMapping.buildElementUnitOfWorkClone(CollectionMapping.java:296)
        at org.eclipse.persistence.mappings.CollectionMapping.buildElementClone(CollectionMapping.java:309)
        at org.eclipse.persistence.internal.queries.ContainerPolicy.addNextValueFromIteratorInto(ContainerPolicy.java:214)
        at org.eclipse.persistence.mappings.CollectionMapping.buildCloneForPartObject(CollectionMapping.java:222)
        at org.eclipse.persistence.internal.indirection.UnitOfWorkQueryValueHolder.buildCloneFor(UnitOfWorkQueryValueHolder.java:56)
        at org.eclipse.persistence.internal.indirection.UnitOfWorkValueHolder.instantiateImpl(UnitOfWorkValueHolder.java:161)
        at org.eclipse.persistence.internal.indirection.UnitOfWorkValueHolder.instantiate(UnitOfWorkValueHolder.java:222)
        at org.eclipse.persistence.internal.indirection.DatabaseValueHolder.getValue(DatabaseValueHolder.java:88)
        at org.eclipse.persistence.indirection.IndirectList.buildDelegate(IndirectList.java:244)
        at org.eclipse.persistence.indirection.IndirectList.getDelegate(IndirectList.java:415)
        at org.eclipse.persistence.indirection.IndirectList.isEmpty(IndirectList.java:490)
        at org.apache.ambari.server.state.ServiceImpl.<init>(ServiceImpl.java:125)
        at org.apache.ambari.server.state.ServiceImpl$$EnhancerByGuice$$807a405e.<init>(<generated>)
        at org.apache.ambari.server.state.ServiceImpl$$EnhancerByGuice$$807a405e$$FastClassByGuice$$1c1221ad.newInstance(<generated>)
        at com.google.inject.internal.cglib.reflect.$FastConstructor.newInstance(FastConstructor.java:40)
        at com.google.inject.internal.ProxyFactory$ProxyConstructor.newInstance(ProxyFactory.java:260)
        at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:85)
        at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:254)
        at com.google.inject.internal.InjectorImpl$4$1.call(InjectorImpl.java:978)
        at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1024)
        at com.google.inject.internal.InjectorImpl$4.get(InjectorImpl.java:974)
        at com.google.inject.assistedinject.FactoryProvider2.invoke(FactoryProvider2.java:632)
        at $Proxy12.createExisting(Unknown Source)
        at org.apache.ambari.server.state.cluster.ClusterImpl.loadServices(ClusterImpl.java:218)
        at org.apache.ambari.server.state.cluster.ClusterImpl.debugDump(ClusterImpl.java:808)
        at org.apache.ambari.server.state.cluster.ClustersImpl.debugDump(ClustersImpl.java:566)
        at org.apache.ambari.server.controller.AmbariServer.run(AmbariServer.java:341)
        at org.apache.ambari.server.controller.AmbariServer.main(AmbariServer.java:458)

Is this a known issue? It seems to be related to the amount of data in the PostgreSQL DB.
In one of our environments, the PostgreSQL DB dump is around 1 GB and we are having serious
trouble launching ambari-server (around 60-70% of the “ambari-server start” commands
fail with the above exception).

Thanks,
Ximo.

________________________________

Este mensaje se dirige exclusivamente a su destinatario. Puede consultar nuestra política
de envío y recepción de correo electrónico en el enlace situado más abajo.
This message is intended exclusively for its addressee. We only send and receive email on
the basis of the terms set out at:
http://www.tid.es/ES/PAGINAS/disclaimer.aspx


CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed
and may contain information that is confidential, privileged and exempt from disclosure under
applicable law. If the reader of this message is not the intended recipient, you are hereby
notified that any printing, copying, dissemination, distribution, disclosure or forwarding
of this communication is strictly prohibited. If you have received this communication in error,
please contact the sender immediately and delete it from your system. Thank You.

