incubator-s4-user mailing list archives

From Davide Simoncelli <Davide.Simonce...@neclab.eu>
Subject RE: Zookeeper problem
Date Thu, 06 Sep 2012 07:00:17 GMT
Hi,

Does it happen on the adapter node?

I have experienced this behavior when the application starts, not after a period of time
(I assume JVM memory usage only grows after a while).
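
One rough way to test the GC theory from the quoted messages is to measure how long the JVM stalls and compare that with the ZooKeeper session timeout. The sketch below is only an illustration: the class `GcPauseCheck`, its method names, and the 30-second timeout are assumptions, not part of S4 or ZooKeeper. It sleeps for a fixed interval, measures how far the sleep overshoots, and reads the cumulative GC time from the standard `GarbageCollectorMXBean` interface.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper for illustration; not part of S4 or ZooKeeper. */
class GcPauseCheck {

    /** Cumulative time (ms) spent in GC by all collectors since JVM start. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** True if a stall of this length is longer than the session timeout. */
    static boolean exceedsTimeout(long pauseMillis, long sessionTimeoutMillis) {
        return pauseMillis > sessionTimeoutMillis;
    }

    /**
     * Sleeps for intervalMillis and returns how much longer than requested
     * the sleep took. A large overshoot suggests the whole JVM was stalled,
     * for example by a full GC.
     */
    static long measureStallMillis(long intervalMillis) throws InterruptedException {
        long start = System.nanoTime();
        Thread.sleep(intervalMillis);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return Math.max(0, elapsedMillis - intervalMillis);
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed value; use the timeout your ZooKeeper client is configured with.
        long sessionTimeoutMillis = 30_000;
        for (int i = 0; i < 10; i++) {
            long gcBefore = totalGcMillis();
            long stall = measureStallMillis(1_000);
            long gcDelta = totalGcMillis() - gcBefore;
            System.out.printf("stall=%d ms, gc time in interval=%d ms%n", stall, gcDelta);
            if (exceedsTimeout(stall, sessionTimeoutMillis)) {
                System.out.println("Stall longer than the session timeout: expiry is likely");
            }
        }
    }
}
```

To be useful, a loop like this would have to run in a thread inside the adapter process itself. Starting the JVM with `-verbose:gc` is a simpler alternative that logs each collection without any code changes.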

Regards

- Davide

On Wednesday, September 05, 2012 06:25:43 PM Flavio Junqueira wrote:
> Yes, GC is a candidate.
> 
> -Flavio
> 
> On Sep 5, 2012, at 5:51 PM, Aimee Cheng wrote:
> > Hi,
> > 
> > We ran into a similar case: the application was still running, but its JVM
> > memory usage was very high, which caused long full GC pauses. During those
> > pauses the other threads (e.g. zkClient) hung. If a full GC takes longer
> > than the session timeout, ZooKeeper considers the session expired.
> > 
> > Maybe you can check your adapter application.
> > 
> > Hope this helps.
> > 
> > -Aimee
> > 
> > On Sep 5, 2012, at 9:52 PM, Davide Simoncelli wrote:
> >> Hello,
> >> 
> >> I'm trying to run an application on a cluster with 10 nodes. There is
> >> also an adapter cluster with only one node. What I noticed is that the
> >> node in the adapter cluster sends events and is running (the top command
> >> shows that its java process is using the CPU). The other 10 nodes (all
> >> of them) don't receive anything, and the java process on each of them
> >> doesn't even use the CPU. After a while the following exception is
> >> thrown:
> >> 
> >> [ZkClient-EventThread-27-localhost:2181] ERROR
> >> o.a.s4.comm.topology.ClustersFromZK - Zookeeper session expired,
> >> possibly due to a network partition for cluster [cluster1_adapter]. This
> >> node is considered as dead by Zookeeper. Proceeding to stop this node.
> >> 
> >> There is no error when the clusters are created and the nodes are
> >> started. Also, the status command shows the following output, which
> >> leads me to assume everything is OK:
> >> 
> >> App Status
> >> -------------------------------------------------------------------------------------------
> >> Name            Cluster           URI
> >> -------------------------------------------------------------------------------------------
> >> testAppAdapter  cluster1_adapter  file:/home/s4-piper/testApp/build/libs/testAppAdapter.s4r
> >> testApp         cluster1          file:/tmp/testApp.s4r
> >> -------------------------------------------------------------------------------------------
> >> 
> >> 
> >> Cluster Status
> >> ---------------------------------------------------------------------------
> >>                                          Active nodes
> >> Name              App             Tasks  Number  Task id  Host        Port
> >> ---------------------------------------------------------------------------
> >> cluster1_adapter  testAppAdapter  1      1       Task-0   computer1   13000
> >> cluster1          testApp         10     10      Task-6   computer2   12006
> >>                                                  Task-7   computer4   12007
> >>                                                  Task-4   computer6   12004
> >>                                                  Task-5   computer7   12005
> >>                                                  Task-2   computer9   12002
> >>                                                  Task-3   computer11  12003
> >>                                                  Task-0   computer17  12000
> >>                                                  Task-1   computer18  12001
> >>                                                  Task-9   computer23  12009
> >>                                                  Task-8   computer37  12008
> >> ---------------------------------------------------------------------------
> >> 
> >> 
> >> 
> >> Stream Status
> >> -------------------------------------------------------------
> >> Name      Producers                         Consumers
> >> -------------------------------------------------------------
> >> RawlData  cluster1_adapter(testAppAdapter)  cluster1(testApp)
> >> -------------------------------------------------------------
> >> 
> >> Could you help me?
> >> 
> >> Thank you
> >> 
> >> Regards
> >> 
> >> - Davide
