incubator-s4-user mailing list archives

From Matthieu Morel <matthieu.mo...@gmail.com>
Subject Re: No ClusterNode exists for partition 1 (Deploying S4 app with Yarn)
Date Fri, 09 Nov 2012 11:17:54 GMT
On Fri, Nov 9, 2012 at 10:48 AM, Frank Zheng <bearzheng2011@gmail.com> wrote:
> Dear All,
>
> I followed the instructions in "Deploying S4 applications with Yarn" step
> by step to run the Twitter example with Yarn.
> I succeeded in setting up 2 active nodes.

From the status info, it looks like you have 2 partitions for the
"counter" logical cluster, but only 1 node started.

This might be due to a memory allocation issue: YARN may not have
enough resources to start a second node. Can you check the
"yarn.scheduler.minimum-allocation-mb" setting, as mentioned on the wiki page?

Thanks,

Matthieu
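
The two checks suggested in the reply above can be sketched in a few lines of
Python. This is only an illustration, not part of S4 or YARN: the helper names
are hypothetical, partitions are assumed to be numbered from 0, and the
yarn-site.xml content and its value of 256 are made up for the example. The
topology line is the one reported in the counter node log.

```python
import re
import xml.etree.ElementTree as ET


def missing_partitions(topology_line, expected_partitions):
    """Return partition ids with no node in a ClusterFromZK topology line.

    Assumes partitions are numbered 0..expected_partitions-1.
    """
    assigned = {int(p) for p in re.findall(r"partition=(\d+)", topology_line)}
    return [p for p in range(expected_partitions) if p not in assigned]


def yarn_property(yarn_site_xml, name):
    """Return the value of a property in yarn-site.xml content, or None."""
    root = ET.fromstring(yarn_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


# Topology line as logged by the counter node.
topology = (
    "nbNodes=1,name=counter,mode=unicast,type=,"
    "nodes=[{partition=0,port=12000,machineName=testing.machine1,taskId=Task-0}]"
)
# The status output lists 2 tasks for the counter cluster.
print(missing_partitions(topology, 2))

# Hypothetical yarn-site.xml content; the value is illustrative only.
sample = """<configuration>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>256</value>
  </property>
</configuration>"""
print(yarn_property(sample, "yarn.scheduler.minimum-allocation-mb"))
```

With these inputs the first call reports partition 1 as unassigned, which is
the partition named in the TCPEmitter errors.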


> Here is the status.
>
> App Status
> --------------------------------------------------------------------------
>   Name      Cluster   URI
> --------------------------------------------------------------------------
>   adapter   adapter   hdfs://testing.machine1:8020/twitter-adapter.s4r
>   counter   counter   hdfs://testing.machine1:8020/twitter-counter.s4r
> --------------------------------------------------------------------------
>
>
>
> Cluster Status
> --------------------------------------------------------------------------
>                               Active nodes
>   Name      App       Tasks   Number   Task id   Host               Port
> --------------------------------------------------------------------------
>   adapter   adapter   1       1        Task-0    testing.machine1   13000
>   counter   counter   2       1        Task-0    testing.machine1   12000
> --------------------------------------------------------------------------
>
>
>
> Stream Status
> --------------------------------------------------------------------------
>   Name        Producers          Consumers
> --------------------------------------------------------------------------
>   RawStatus   adapter(adapter)   counter(counter)
> --------------------------------------------------------------------------
>
>
> I then checked the log of the counter application. It contains errors such
> as "No ClusterNode exists for partitionId 1".
>
> 17:31:36.405 [main] INFO  org.apache.s4.core.Main - Initializing S4 node
> with :
> - comm module class [org.apache.s4.comm.DefaultCommModule]
> - comm configuration file [default.s4.comm.properties from classpath]
>
> - core module class [org.apache.s4.core.DefaultCoreModule]
> - core configuration file[default.s4.core.properties from classpath]
> - extra modules: [org.apache.s4.deploy.HdfsFetcherModule,
> org.apache.s4.deploy.HdfsFetcherModule]
>
> - inline parameters: []
> 17:31:36.419 [main] DEBUG org.apache.s4.core.Main - Adding named parameters
> for injection : [s4.cluster.zk_address=testing.machine1:2181]
> 17:31:36.843 [main] INFO  org.apache.s4.core.Main - Starting S4 node. This
> node will automatically download applications published for the cluster it
> belongs to
>
> 17:31:36.968 [main] INFO  o.a.s.comm.topology.AssignmentFromZK - New
> session:88640044914376712; state is : SyncConnected
> 17:31:37.044 [main] INFO  o.a.s.comm.topology.AssignmentFromZK -
> Successfully acquired task:Task-0 by testing.machine1
>
> 17:31:37.055 [main] INFO  org.apache.s4.deploy.HdfsS4RFetcher - Fetching S4R
> through hdfs from uri hdfs://testing.machine1:8020/twitter-counter.s4r
> 17:31:38.112 [main] INFO  org.apache.s4.core.Server - Loading application
> [counter] from file [/tmp/tmp4608799767876669855s4r]
>
> 17:31:38.113 [main] WARN  o.a.s4.base.util.S4RLoaderFactory - s4.tmp.dir not
> specified, using temporary directory [/tmp/1352539898113-0] for unpacking
> S4R. You may want to specify a parent non-temporary directory.
> 17:31:38.113 [main] INFO  o.a.s4.base.util.S4RLoaderFactory - Unzipping S4R
> archive in [/tmp/1352539898113-0]
>
> 17:31:38.216 [main] INFO  org.apache.s4.core.Server - App class name is:
> org.apache.s4.example.twitter.TwitterCounterApp
> 17:31:38.255 [main] INFO  o.a.s4.comm.topology.ClusterFromZK - Changing
> cluster topology to {
> nbNodes=1,name=counter,mode=unicast,type=,nodes=[{partition=0,port=12000,machineName=testing.machine1,taskId=Task-0}]}
> from null
>
> 17:31:38.287 [main] INFO  o.a.s4.comm.topology.ClusterFromZK - Adding
> topology change listener:org.apache.s4.comm.tcp.TCPEmitter@420a6d35
> 17:31:38.317 [main] INFO  o.a.s4.comm.topology.ClustersFromZK - New
> session:88640044914376714
>
> 17:31:38.326 [main] INFO  o.a.s4.comm.topology.ClustersFromZK - New
> session:88640044914376715
> 17:31:38.332 [main] INFO  o.a.s4.comm.topology.ClusterFromZK - Changing
> cluster topology to {
> nbNodes=1,name=counter,mode=unicast,type=,nodes=[{partition=0,port=12000,machineName=testing.machine1,taskId=Task-0}]}
> from null
>
> 17:31:38.332 [main] INFO  org.apache.s4.core.Server - Loaded application
> from file /tmp/tmp4608799767876669855s4r
> 17:31:38.332 [main] INFO  o.a.s.d.DistributedDeploymentManager -
> Successfully installed application counter
>
> 17:31:38.350 [main] DEBUG o.a.s.c.g.OverloadDispatcherGenerator - Dumping
> generated overload dispatcher class for PE of class [class
> org.apache.s4.example.twitter.TopNTopicPE]
> 17:31:38.370 [main] INFO  o.a.s4.example.twitter.TopNTopicPE - key: []
>
> 17:31:38.375 [main] DEBUG o.a.s.c.g.OverloadDispatcherGenerator - Dumping
> generated overload dispatcher class for PE of class [class
> org.apache.s4.example.twitter.TopicCountAndReportPE]
> 17:31:38.377 [main] DEBUG o.a.s.c.g.OverloadDispatcherGenerator - Dumping
> generated overload dispatcher class for PE of class [class
> org.apache.s4.example.twitter.TopicExtractorPE]
>
> 17:31:38.377 [main] DEBUG o.a.s4.comm.topology.ClustersFromZK - Adding input
> stream [RawStatus] for app [-1] in cluster [counter]
> 17:31:38.426 [main] INFO  o.a.s4.comm.topology.ClustersFromZK - Detected new
> stream [RawStatus]
>
> 17:31:38.433 [main] INFO  org.apache.s4.core.App - Init prototype
> [org.apache.s4.example.twitter.TopNTopicPE].
> 17:31:38.435 [main] DEBUG org.apache.s4.core.ProcessingElement - Started
> timer for PE prototype [org.apache.s4.example.twitter.TopNTopicPE], ID []
> with interval [10000].
>
> 17:31:38.436 [main] DEBUG org.apache.s4.core.ProcessingElement - Started
> checkpointing timer for PE prototype
> [org.apache.s4.example.twitter.TopNTopicPE], ID [] with interval [20]
> [SECONDS].
> 17:31:38.437 [main] INFO  org.apache.s4.core.App - Init prototype
> [org.apache.s4.example.twitter.TopicCountAndReportPE].
>
> 17:31:38.438 [main] DEBUG org.apache.s4.core.ProcessingElement - Started
> timer for PE prototype
> [org.apache.s4.example.twitter.TopicCountAndReportPE], ID [] with interval
> [10000].
> 17:31:38.439 [main] INFO  org.apache.s4.core.App - Init prototype
> [org.apache.s4.example.twitter.TopicExtractorPE].
>
> 17:32:50.279 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
> 17:32:50.395 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [ootd]
> 17:32:52.832 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [劣化コピー]
>
> 17:32:52.998 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
> 17:32:52.999 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [e-cigarette]
> 17:32:53.645 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [queseponecelosinelpelele]
>
> 17:32:54.002 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [sisterhood]
> 17:32:55.213 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
> 17:32:55.217 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
>
> 17:32:55.222 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [FF]
> 17:32:55.223 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
> 17:32:55.985 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [يارب]
>
> 17:32:55.999 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
> 17:32:56.217 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
> 17:32:56.225 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
>
> 17:32:56.448 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [fuckyeah]
> 17:32:57.991 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
> 17:32:58.222 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
>
> 17:32:58.237 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
> 17:32:59.230 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [Obama]
> 17:32:59.230 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [USA2012]
>
> 17:32:59.312 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
> 17:32:59.312 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
> 17:32:59.991 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
>
> 17:33:01.075 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [inadan_ranking]
> 17:33:01.976 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
> 17:33:01.989 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
>
> 17:33:03.242 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
> 17:33:03.652 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
> 17:33:03.652 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [ShowSomeLov]
>
> 17:33:05.243 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [New]
> 17:33:05.243 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [4]
> 17:33:05.435 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [fotoojosentrecerrados]
>
> 17:33:05.978 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [tcot]
> 17:33:06.001 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
> 17:33:06.220 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [KAAAAAAAAAAK]
>
> 17:33:07.078 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [love]
> 17:33:07.078 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [chri]
> 17:33:09.023 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
>
> 17:33:09.195 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [Savile's]
> 17:33:09.232 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [askreesh]
> 17:33:09.976 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
>
> 17:33:10.011 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [FF_Special]
> 17:33:10.030 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [cooltattos]
> 17:33:10.245 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [setujuGAK?]
>
> 17:33:10.261 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [tr]
> 17:33:10.988 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [YMCMB]
> 17:33:11.335 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
>
> 17:33:11.336 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [hatchiapp]
> 17:33:13.992 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
> 17:33:15.211 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
>
> 17:33:15.212 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [trivandrum.]
> 17:33:15.212 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [kerala]
> 17:33:15.261 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [EMAs!!]
>
> 17:33:15.261 [TopicSeen] INFO  o.a.s.e.t.TopicCountAndReportPE - Handling
> new topic [TEAMBIEBER]
> 17:33:17.256 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
> 17:33:17.986 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
>
> 17:33:18.216 [RawStatus] ERROR org.apache.s4.comm.tcp.TCPEmitter - No
> ClusterNode exists for partitionId 1
>
>
> Could anyone tell me why this happens and how I can solve it?
>
> Thanks!
> Yu
>
