ambari-user mailing list archives

From Siddharth Wagle <swa...@hortonworks.com>
Subject Re: Initializaton errors on Ambari 1.4.1
Date Fri, 28 Feb 2014 18:05:31 GMT
Hi Ximo,

Could you please provide the thread dump when you notice this? (jstack
command output)
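
As a rough sketch of how the dump could be captured, the snippet below wraps the jstack call in Python. The PID file path and the output file name are assumptions, not something confirmed in this thread, so adjust them for your installation. jstack normally has to be run as the same user that owns the server JVM.

```python
import subprocess
from pathlib import Path

# Assumption: default PID file location; verify it on your host.
PID_FILE = Path("/var/run/ambari-server/ambari-server.pid")


def build_jstack_command(pid):
    """Return the jstack invocation for the given JVM process id."""
    # -l adds lock details, which helps when chasing concurrency problems.
    return ["jstack", "-l", str(pid)]


def capture_thread_dump(pid, out_path):
    """Run jstack against a live JVM and write the thread dump to out_path."""
    result = subprocess.run(
        build_jstack_command(pid),
        capture_output=True,
        text=True,
        check=True,  # raise if jstack fails, e.g. the process has already exited
    )
    Path(out_path).write_text(result.stdout)


# Example usage (needs a running ambari-server and a JDK on the PATH):
#   pid = int(PID_FILE.read_text().strip())
#   capture_thread_dump(pid, "ambari-server-threads.txt")
```

The dump has to be taken while the server process is still alive, so it may need to be triggered right after "ambari-server start".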

I would be glad to open a Jira to track this issue. If you would rather
open it yourself, that's fine too.

The tables to look at when cleaning up the Postgres DB are
execution_command and host_role_command, which correspond to the requests
and tasks.
You could just clear the byte array fields in these tables and then reclaim
the disk space with the VACUUM command.
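
A minimal sketch of that cleanup, written as a Python helper that only generates the SQL so it can be reviewed before it is run through psql. The column names below are assumptions about the schema and are not confirmed in this thread, so check them (for example with \d in psql) and take a backup first. The tables may also need a schema prefix.

```python
# Assumption: which columns hold byte arrays (bytea) in each table.
BYTE_ARRAY_COLUMNS = {
    "execution_command": ["command"],
    "host_role_command": ["std_out", "std_error"],
}


def build_cleanup_statements(columns_by_table, full=False):
    """Return SQL that clears the byte array columns, then vacuums each table."""
    statements = []
    # First null out the large columns in every table.
    for table, columns in columns_by_table.items():
        assignments = ", ".join(f"{col} = NULL" for col in columns)
        statements.append(f"UPDATE {table} SET {assignments};")
    # Then vacuum each table so the freed space can be reclaimed.
    vacuum = "VACUUM FULL" if full else "VACUUM"
    for table in columns_by_table:
        statements.append(f"{vacuum} {table};")
    return statements


# Print the statements for review instead of executing them directly.
for statement in build_cleanup_statements(BYTE_ARRAY_COLUMNS):
    print(statement)
```

Plain VACUUM makes the freed space reusable inside the table files. VACUUM FULL (full=True here) rewrites the table and returns the space to the operating system, but it holds an exclusive lock while it runs, so it is safer with ambari-server stopped.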

Best Regards,
Sid


On Thu, Feb 27, 2014 at 10:06 PM, JOAQUIN GUANTER GONZALBEZ <ximo@tid.es> wrote:

>  Hello,
>
>  Since we upgraded to Ambari 1.4.1, we see the following initialization
> error from time to time when trying to start ambari-server:
>
> 04:44:56,972  INFO [main] Configuration:511 - Web App DIR test /usr/lib/ambari-server/web
> 04:44:56,975  INFO [main] CertificateManager:70 - Initialization of root certificate
> 04:44:56,975  INFO [main] CertificateManager:72 - Certificate exists:true
> 04:44:57,003  INFO [main] AmbariServer:338 - ********* Initializing Clusters **********
> 04:44:57,285  WARN [Thread-2] HeartbeatMonitor:123 - Heartbeat lost from host andromeda-compute02.hi.inet
> 04:44:57,295  WARN [Thread-2] HeartbeatMonitor:123 - Heartbeat lost from host andromeda-compute03.hi.inet
> 04:44:57,296  WARN [Thread-2] HeartbeatMonitor:123 - Heartbeat lost from host andromeda-compute06.hi.inet
> 04:44:57,296  WARN [Thread-2] HeartbeatMonitor:123 - Heartbeat lost from host andromeda-compute04.hi.inet
> 04:44:57,297  WARN [Thread-2] HeartbeatMonitor:123 - Heartbeat lost from host andromeda-data99.hi.inet
> 04:44:57,318 ERROR [main] AmbariServer:461 - Failed to run the Ambari Server
>
> Local Exception Stack:
> Exception [EclipseLink-2004] (Eclipse Persistence Services - 2.4.0.v20120608-r11652): org.eclipse.persistence.exceptions.ConcurrencyException
> Exception Description: A signal was attempted before wait() on ConcurrencyManager. This normally means that an attempt was made to commit or rollback a transaction before it was started, or to rollback a transaction twice.
>         at org.eclipse.persistence.exceptions.ConcurrencyException.signalAttemptedBeforeWait(ConcurrencyException.java:84)
>         at org.eclipse.persistence.internal.helper.ConcurrencyManager.releaseReadLock(ConcurrencyManager.java:489)
>         at org.eclipse.persistence.internal.identitymaps.CacheKey.releaseReadLock(CacheKey.java:392)
>         at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.cloneAndRegisterObject(UnitOfWorkImpl.java:1022)
>         at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.cloneAndRegisterObject(UnitOfWorkImpl.java:933)
>         at org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor.getAndCloneCacheKeyFromParent(UnitOfWorkIdentityMapAccessor.java:193)
>         at org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor.getFromIdentityMap(UnitOfWorkIdentityMapAccessor.java:121)
>         at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.registerExistingObject(UnitOfWorkImpl.java:3906)
>         at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.registerExistingObject(UnitOfWorkImpl.java:3861)
>         at org.eclipse.persistence.mappings.CollectionMapping.buildElementUnitOfWorkClone(CollectionMapping.java:296)
>         at org.eclipse.persistence.mappings.CollectionMapping.buildElementClone(CollectionMapping.java:309)
>         at org.eclipse.persistence.internal.queries.ContainerPolicy.addNextValueFromIteratorInto(ContainerPolicy.java:214)
>         at org.eclipse.persistence.mappings.CollectionMapping.buildCloneForPartObject(CollectionMapping.java:222)
>         at org.eclipse.persistence.internal.indirection.UnitOfWorkQueryValueHolder.buildCloneFor(UnitOfWorkQueryValueHolder.java:56)
>         at org.eclipse.persistence.internal.indirection.UnitOfWorkValueHolder.instantiateImpl(UnitOfWorkValueHolder.java:161)
>         at org.eclipse.persistence.internal.indirection.UnitOfWorkValueHolder.instantiate(UnitOfWorkValueHolder.java:222)
>         at org.eclipse.persistence.internal.indirection.DatabaseValueHolder.getValue(DatabaseValueHolder.java:88)
>         at org.eclipse.persistence.indirection.IndirectList.buildDelegate(IndirectList.java:244)
>         at org.eclipse.persistence.indirection.IndirectList.getDelegate(IndirectList.java:415)
>         at org.eclipse.persistence.indirection.IndirectList.isEmpty(IndirectList.java:490)
>         at org.apache.ambari.server.state.ServiceImpl.<init>(ServiceImpl.java:125)
>         at org.apache.ambari.server.state.ServiceImpl$$EnhancerByGuice$$807a405e.<init>(<generated>)
>         at org.apache.ambari.server.state.ServiceImpl$$EnhancerByGuice$$807a405e$$FastClassByGuice$$1c1221ad.newInstance(<generated>)
>         at com.google.inject.internal.cglib.reflect.$FastConstructor.newInstance(FastConstructor.java:40)
>         at com.google.inject.internal.ProxyFactory$ProxyConstructor.newInstance(ProxyFactory.java:260)
>         at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:85)
>         at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:254)
>         at com.google.inject.internal.InjectorImpl$4$1.call(InjectorImpl.java:978)
>         at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1024)
>         at com.google.inject.internal.InjectorImpl$4.get(InjectorImpl.java:974)
>         at com.google.inject.assistedinject.FactoryProvider2.invoke(FactoryProvider2.java:632)
>         at $Proxy12.createExisting(Unknown Source)
>         at org.apache.ambari.server.state.cluster.ClusterImpl.loadServices(ClusterImpl.java:218)
>         at org.apache.ambari.server.state.cluster.ClusterImpl.debugDump(ClusterImpl.java:808)
>         at org.apache.ambari.server.state.cluster.ClustersImpl.debugDump(ClustersImpl.java:566)
>         at org.apache.ambari.server.controller.AmbariServer.run(AmbariServer.java:341)
>         at org.apache.ambari.server.controller.AmbariServer.main(AmbariServer.java:458)
>
>  Is this a known issue? It seems to be related to the amount of data in
> the PostgreSQL DB. In one of our environments, the PSQL DB dump is
> around 1 GB and we are having serious problems launching ambari-server
> (around 60-70% of the "ambari-server start" commands cause the above
> exception).
>
>  Thanks,
> Ximo.
