ambari-dev mailing list archives

From JOAQUIN GUANTER GONZALBEZ <x...@tid.es>
Subject Re: Initializaton errors on Ambari 1.4.1
Date Thu, 06 Mar 2014 06:39:29 GMT
Hi Sid,

The exception only happens at initialization (so a few seconds after I run
"ambari-server start") and it actually kills the Ambari Server process (so
when I run "ambari-server status" it tells me the server is not running but
there's a stale pid file). So once we are able to start the server
successfully, this doesn't repro anymore. I'll add that info to the JIRA
ticket.
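
Since the server dies within seconds of startup, jstack would have to run inside that short window. A minimal sketch in Python of polling for the pid and taking dumps, assuming the pid file lives at /var/run/ambari-server/ambari-server.pid (an assumption, the path is not given in this thread); the function names are illustrative only:

```python
import subprocess
import time

# Assumed location of the pid file; not stated anywhere in this thread.
PID_FILE = "/var/run/ambari-server/ambari-server.pid"


def read_pid(pid_file):
    """Return the pid stored in pid_file, or None if it is missing or malformed."""
    try:
        with open(pid_file) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def jstack_command(java_home, pid):
    """Build the command suggested in this thread: <JAVA_HOME>/bin/jstack -l <pid>."""
    return [f"{java_home}/bin/jstack", "-l", str(pid)]


def capture_dumps(java_home, pid_file=PID_FILE, attempts=20, interval=0.5):
    """Try `attempts` thread dumps, `interval` seconds apart; return those that succeeded."""
    dumps = []
    for _ in range(attempts):
        pid = read_pid(pid_file)
        if pid is not None:
            try:
                result = subprocess.run(
                    jstack_command(java_home, pid),
                    capture_output=True,
                    text=True,
                )
            except OSError:
                result = None  # jstack binary not found or not executable
            # A stale pid file makes jstack exit non-zero; skip that attempt.
            if result is not None and result.returncode == 0:
                dumps.append(result.stdout)
        time.sleep(interval)
    return dumps
```

Calling capture_dumps(java_home) right after "ambari-server start" would collect whatever dumps succeed before the process exits; the attempt count and interval are arbitrary starting values.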

Thanks,
Ximo.

On 05/03/14 23:22, "Siddharth Wagle" <swagle@hortonworks.com> wrote:

>Hi Ximo,
>
>Thanks for opening the Jira.
>
>You do not need to stop the ambari-server to take the thread dump. You can
>use the following command to take one when the exception reoccurs.
>
>${JAVA_HOME}/bin/jstack -l <ambari-server-pid>
>
>Hope this helps.
>
>Best Regards,
>Sid
>
>
>On Tue, Mar 4, 2014 at 4:25 AM, JOAQUIN GUANTER GONZALBEZ
><ximo@tid.es> wrote:
>
>> Hi Siddharth,
>>
>> I have opened https://issues.apache.org/jira/browse/AMBARI-4930 to track
>> this issue. I cannot provide you with a thread dump because this reproduces
>> in our production environment, so we cannot stop or restart the
>> ambari-server without a maintenance period. I will try to reproduce it in
>> another environment to get the thread dump and I will attach it to the JIRA.
>>
>> Thanks!
>> Ximo
>>
>> From: Siddharth Wagle <swagle@hortonworks.com>
>> Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
>> Date: Friday, 28 February 2014 19:05
>> To: "user@ambari.apache.org" <user@ambari.apache.org>
>> Subject: Re: Initializaton errors on Ambari 1.4.1
>>
>> Hi Ximo,
>>
>> Could you please provide the thread dump (jstack command output) when you
>> notice this?
>>
>> I would be glad to open a Jira to track this issue. If you would rather
>> open it yourself, that's fine too.
>>
>> The tables to look at when cleaning up the Postgres DB are execution_command
>> and host_role_command, which correspond to the requests and tasks.
>> You could just delete the byte array fields in these tables and reclaim
>> disk space by using the VACUUM command.
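
The cleanup described above could be sketched roughly as follows. This is only an illustration in Python that prints SQL for review and executes nothing. The column names are hypothetical (the thread only says "byte array fields"), and setting them to NULL is an assumption that fails if a column is declared NOT NULL, so check the real schema and take a backup first. Note that in PostgreSQL a plain VACUUM makes the freed space reusable inside the table, while VACUUM FULL rewrites the table, returns space to the operating system, and holds an exclusive lock while it runs.

```python
# Hypothetical column names: the thread only mentions "byte array fields".
# Verify against the actual schema before using any generated statement.
BYTE_ARRAY_COLUMNS = {
    "execution_command": ["command"],
    "host_role_command": ["std_out", "std_error"],
}


def cleanup_statements(columns_by_table, full=False):
    """Return SQL that nulls the given columns and then vacuums each table."""
    vacuum = "VACUUM FULL" if full else "VACUUM"
    statements = []
    for table, columns in columns_by_table.items():
        assignments = ", ".join(f"{column} = NULL" for column in columns)
        statements.append(f"UPDATE {table} SET {assignments};")
        statements.append(f"{vacuum} {table};")
    return statements


if __name__ == "__main__":
    # Print the statements so they can be reviewed before being run by hand.
    for statement in cleanup_statements(BYTE_ARRAY_COLUMNS):
        print(statement)
```

Generating the statements rather than executing them keeps the destructive step manual, which seems prudent for a production database.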
>>
>> Best Regards,
>> Sid
>>
>>
>> On Thu, Feb 27, 2014 at 10:06 PM, JOAQUIN GUANTER GONZALBEZ <ximo@tid.es>
>> wrote:
>> Hello,
>>
>> Since we upgraded to Ambari 1.4.1, we see the following initialization
>> error from time to time when trying to start ambari-server:
>>
>>
>> 04:44:56,972  INFO [main] Configuration:511 - Web App DIR test /usr/lib/ambari-server/web
>> 04:44:56,975  INFO [main] CertificateManager:70 - Initialization of root certificate
>> 04:44:56,975  INFO [main] CertificateManager:72 - Certificate exists:true
>> 04:44:57,003  INFO [main] AmbariServer:338 - ********* Initializing Clusters **********
>> 04:44:57,285  WARN [Thread-2] HeartbeatMonitor:123 - Heartbeat lost from host andromeda-compute02.hi.inet
>> 04:44:57,295  WARN [Thread-2] HeartbeatMonitor:123 - Heartbeat lost from host andromeda-compute03.hi.inet
>> 04:44:57,296  WARN [Thread-2] HeartbeatMonitor:123 - Heartbeat lost from host andromeda-compute06.hi.inet
>> 04:44:57,296  WARN [Thread-2] HeartbeatMonitor:123 - Heartbeat lost from host andromeda-compute04.hi.inet
>> 04:44:57,297  WARN [Thread-2] HeartbeatMonitor:123 - Heartbeat lost from host andromeda-data99.hi.inet
>> 04:44:57,318 ERROR [main] AmbariServer:461 - Failed to run the Ambari Server
>>
>> Local Exception Stack:
>> Exception [EclipseLink-2004] (Eclipse Persistence Services - 2.4.0.v20120608-r11652): org.eclipse.persistence.exceptions.ConcurrencyException
>> Exception Description: A signal was attempted before wait() on ConcurrencyManager. This normally means that an attempt was made to commit or rollback a transaction before it was started, or to rollback a transaction twice.
>>         at org.eclipse.persistence.exceptions.ConcurrencyException.signalAttemptedBeforeWait(ConcurrencyException.java:84)
>>         at org.eclipse.persistence.internal.helper.ConcurrencyManager.releaseReadLock(ConcurrencyManager.java:489)
>>         at org.eclipse.persistence.internal.identitymaps.CacheKey.releaseReadLock(CacheKey.java:392)
>>         at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.cloneAndRegisterObject(UnitOfWorkImpl.java:1022)
>>         at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.cloneAndRegisterObject(UnitOfWorkImpl.java:933)
>>         at org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor.getAndCloneCacheKeyFromParent(UnitOfWorkIdentityMapAccessor.java:193)
>>         at org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor.getFromIdentityMap(UnitOfWorkIdentityMapAccessor.java:121)
>>         at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.registerExistingObject(UnitOfWorkImpl.java:3906)
>>         at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.registerExistingObject(UnitOfWorkImpl.java:3861)
>>         at org.eclipse.persistence.mappings.CollectionMapping.buildElementUnitOfWorkClone(CollectionMapping.java:296)
>>         at org.eclipse.persistence.mappings.CollectionMapping.buildElementClone(CollectionMapping.java:309)
>>         at org.eclipse.persistence.internal.queries.ContainerPolicy.addNextValueFromIteratorInto(ContainerPolicy.java:214)
>>         at org.eclipse.persistence.mappings.CollectionMapping.buildCloneForPartObject(CollectionMapping.java:222)
>>         at org.eclipse.persistence.internal.indirection.UnitOfWorkQueryValueHolder.buildCloneFor(UnitOfWorkQueryValueHolder.java:56)
>>         at org.eclipse.persistence.internal.indirection.UnitOfWorkValueHolder.instantiateImpl(UnitOfWorkValueHolder.java:161)
>>         at org.eclipse.persistence.internal.indirection.UnitOfWorkValueHolder.instantiate(UnitOfWorkValueHolder.java:222)
>>         at org.eclipse.persistence.internal.indirection.DatabaseValueHolder.getValue(DatabaseValueHolder.java:88)
>>         at org.eclipse.persistence.indirection.IndirectList.buildDelegate(IndirectList.java:244)
>>         at org.eclipse.persistence.indirection.IndirectList.getDelegate(IndirectList.java:415)
>>         at org.eclipse.persistence.indirection.IndirectList.isEmpty(IndirectList.java:490)
>>         at org.apache.ambari.server.state.ServiceImpl.<init>(ServiceImpl.java:125)
>>         at org.apache.ambari.server.state.ServiceImpl$$EnhancerByGuice$$807a405e.<init>(<generated>)
>>         at org.apache.ambari.server.state.ServiceImpl$$EnhancerByGuice$$807a405e$$FastClassByGuice$$1c1221ad.newInstance(<generated>)
>>         at com.google.inject.internal.cglib.reflect.$FastConstructor.newInstance(FastConstructor.java:40)
>>         at com.google.inject.internal.ProxyFactory$ProxyConstructor.newInstance(ProxyFactory.java:260)
>>         at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:85)
>>         at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:254)
>>         at com.google.inject.internal.InjectorImpl$4$1.call(InjectorImpl.java:978)
>>         at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1024)
>>         at com.google.inject.internal.InjectorImpl$4.get(InjectorImpl.java:974)
>>         at com.google.inject.assistedinject.FactoryProvider2.invoke(FactoryProvider2.java:632)
>>         at $Proxy12.createExisting(Unknown Source)
>>         at org.apache.ambari.server.state.cluster.ClusterImpl.loadServices(ClusterImpl.java:218)
>>         at org.apache.ambari.server.state.cluster.ClusterImpl.debugDump(ClusterImpl.java:808)
>>         at org.apache.ambari.server.state.cluster.ClustersImpl.debugDump(ClustersImpl.java:566)
>>         at org.apache.ambari.server.controller.AmbariServer.run(AmbariServer.java:341)
>>         at org.apache.ambari.server.controller.AmbariServer.main(AmbariServer.java:458)
>>
>> Is this a known issue? It seems to be related to the amount of data in
>> the PostgreSQL DB. In one of our environments, the PSQL DB dump's size is
>> around 1 GB and we are having serious trouble launching ambari-server
>> (around 60-70% of the "ambari-server start" commands cause the above
>> exception).
>>
>> Thanks,
>> Ximo.
>>
>> ________________________________
>>
>> This message is intended exclusively for its addressee. We only send and
>> receive email on the basis of the terms set out at:
>> http://www.tid.es/ES/PAGINAS/disclaimer.aspx
>>
>>
>> CONFIDENTIALITY NOTICE
>> NOTICE: This message is intended for the use of the individual or entity
>> to which it is addressed and may contain information that is confidential,
>> privileged and exempt from disclosure under applicable law. If the reader
>> of this message is not the intended recipient, you are hereby notified that
>> any printing, copying, dissemination, distribution, disclosure or
>> forwarding of this communication is strictly prohibited. If you have
>> received this communication in error, please contact the sender immediately
>> and delete it from your system. Thank You.
>>
>>
>

