openwhisk-issues mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [incubator-openwhisk] axiqia commented on issue #4303: Unable to connect to controller0 while running openwhisk.yml (local deployment), Connection refused!
Date Tue, 04 Jun 2019 13:44:02 GMT
axiqia commented on issue #4303: Unable to connect to controller0 while running openwhisk.yml (local deployment), Connection refused!
URL: https://github.com/apache/incubator-openwhisk/issues/4303#issuecomment-498677541
 
 
   Dear all,
   I ran into the issue again!
   
   When I ran openwhisk.yml, it failed with `Connection refused`:
   ```
   FAILED - RETRYING: wait until the Controller in this host is up and running (4 retries left).
   FAILED - RETRYING: wait until the Controller in this host is up and running (3 retries left).
   FAILED - RETRYING: wait until the Controller in this host is up and running (2 retries left).
   FAILED - RETRYING: wait until the Controller in this host is up and running (1 retries left).
   fatal: [controller0]: FAILED! => {"attempts": 12, "changed": false, "content": "", "msg": "Status code was -1 and not [200]: Request failed: <urlopen error [Errno 111] Connection refused>", "redirected": false, "status": -1, "url": "https://172.17.0.1:10001/ping"}
   
   Status code was -1 and not [200]: Request failed: <urlopen error [Errno 111] Connection refused>
   
   PLAY RECAP ****************************************************************************************************************************************************************************
   controller0                : ok=25   changed=7    unreachable=0    failed=1
   kafka0                     : ok=11   changed=5    unreachable=0    failed=0
   
   ```
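For context, the failing Ansible task is essentially a bounded retry of `GET https://172.17.0.1:10001/ping`. A minimal Python sketch of that loop, assuming nothing about OpenWhisk internals (the function name is hypothetical, and a stubbed check stands in for the real HTTP call):

```python
import time

def wait_until_up(check, attempts=12, delay=5.0):
    """Retry a health check a bounded number of times, mirroring the
    'wait until the Controller in this host is up and running' task.
    The play above used attempts=12; the delay here is an assumption."""
    for remaining in range(attempts - 1, -1, -1):
        if check():
            return True
        print(f"FAILED - RETRYING: ({remaining} retries left).")
        if remaining:
            time.sleep(delay)
    return False

# Stubbed check standing in for the /ping request; here the
# controller "comes up" on the third attempt.
calls = {"n": 0}
def fake_ping():
    calls["n"] += 1
    return calls["n"] >= 3

assert wait_until_up(fake_ping, attempts=5, delay=0)
```

The key point is that the task only fails after all attempts are exhausted, so a controller that is merely slow to start (as the later log suggests) produces exactly this `Connection refused` failure.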
   
   
   So I checked the log file immediately, and there was no error.
   ```
   .........
   [2019-06-04T13:16:21.424Z] [INFO] [#tid_sid_unknown] [Config] environment set value for limits.actions.sequence.maxLength
   [2019-06-04T13:16:21.424Z] [INFO] [#tid_sid_unknown] [Config] environment set value for limits.actions.invokes.concurrent
   [2019-06-04T13:16:21.425Z] [INFO] [#tid_sid_unknown] [Config] environment set value for controller.instances
   [2019-06-04T13:16:21.425Z] [INFO] [#tid_sid_unknown] [Config] environment set value for limits.actions.invokes.perMinute
   [2019-06-04T13:16:21.425Z] [INFO] [#tid_sid_unknown] [Config] environment set value for runtimes.manifest
   [2019-06-04T13:16:21.426Z] [INFO] [#tid_sid_unknown] [Config] environment set value for kafka.hosts
   [2019-06-04T13:16:21.426Z] [INFO] [#tid_sid_unknown] [Config] environment set value for port
   [2019-06-04T13:16:28.344Z] [INFO] [#tid_sid_unknown] [KafkaMessagingProvider] created topic completed0
   [2019-06-04T13:16:56.275Z] [INFO] [#tid_sid_unknown] [KafkaMessagingProvider] created topic health
   [2019-06-04T13:17:47.819Z] [INFO] [#tid_sid_unknown] [KafkaMessagingProvider] created topic cacheInvalidation
   
   ```
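Diffing the topic-creation timestamps above makes the slowdown visible: each topic takes roughly twice as long as the previous one to create. A quick sketch (timestamps copied from the log):

```python
from datetime import datetime

# Topic-creation timestamps copied from the log above; the growing
# gaps show Kafka topic creation slowing down before the 'events'
# creation finally timed out.
stamps = {
    "completed0":        "2019-06-04T13:16:28.344Z",
    "health":            "2019-06-04T13:16:56.275Z",
    "cacheInvalidation": "2019-06-04T13:17:47.819Z",
}

def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")

times = [parse(t) for t in stamps.values()]
gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
print(gaps)  # [27.931, 51.544]
```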
   
   
   But five minutes later, I checked the log file again, and it contained the additional entries below.
   
   ```
   [2019-06-04T13:19:47.953Z] [WARN] [#tid_sid_unknown] [KafkaMessagingProvider] topic events could not be created because of org.apache.kafka.common.errors.TimeoutException: The request timed out., retries left: 5
   [2019-06-04T13:20:31.962Z] [INFO] [#tid_sid_unknown] [KafkaMessagingProvider] topic events already existed
   [2019-06-04T13:20:32.200Z] [INFO] [#tid_sid_controller] [Controller] starting controller instance 0 [marker:controller_startup0_counter:250778]
   [2019-06-04T13:20:32.201Z] [INFO] Starting the Kamon(SystemMetrics) module
   [2019-06-04T13:20:32.968Z] [INFO] [#tid_sid_dispatcher] [MessageFeed] handler capacity = 128, pipeline fill at = 128, pipeline depth = 256
   [2019-06-04T13:20:33.113Z] [INFO] [#tid_sid_loadbalancer] [ShardingContainerPoolBalancerState] managedFraction = 0.9, blackboxFraction = 0.1
   [2019-06-04T13:20:33.141Z] [INFO] [#tid_sid_dispatcher] [MessageFeed] handler capacity = 128, pipeline fill at = 128, pipeline depth = 256
   [2019-06-04T13:20:33.407Z] [INFO] [#tid_sid_loadbalancer] [WhiskAction] [GET] serving from datastore: CacheKey(whisk.system/invokerHealthTestAction0) [marker:database_cacheMiss_counter:251985]
   [2019-06-04T13:20:33.443Z] [INFO] [#tid_sid_loadbalancer] [CouchDbRestStore] [GET] 'whisk_local_whisks' finding document: 'id: whisk.system/invokerHealthTestAction0' [marker:database_getDocument_start:252022]
   [2019-06-04T13:20:33.760Z] [INFO] [#tid_sid_loadbalancer] [CouchDbRestStore]  [marker:database_getDocument_finish:252339:315]
   [2019-06-04T13:20:33.763Z] [INFO] [#tid_sid_loadbalancer] [WhiskAction] invalidating CacheKey(whisk.system/invokerHealthTestAction0)
   [2019-06-04T13:20:33.771Z] [INFO] [#tid_sid_loadbalancer] [WhiskAction] write initiated on new cache entry
   [2019-06-04T13:20:33.776Z] [INFO] [#tid_sid_loadbalancer] [CouchDbRestStore] [PUT] 'whisk_local_whisks' saving document: 'id: whisk.system/invokerHealthTestAction0, rev: null' [marker:database_saveDocument_start:252355]
   [2019-06-04T13:20:33.798Z] [INFO] [#tid_sid_loadbalancer] [InvokerPool] test action for invoker health now exists
   [2019-06-04T13:20:33.913Z] [INFO] [#tid_sid_controller] [Controller] loadbalancer initialized: ShardingContainerPoolBalancer
   [2019-06-04T13:20:33.915Z] [INFO] [#tid_sid_dispatcher] [MessageFeed] handler capacity = 128, pipeline fill at = 128, pipeline depth = 256
   [2019-06-04T13:20:33.919Z] [INFO] [#tid_sid_loadbalancer] [CouchDbRestStore]  [marker:database_saveDocument_finish:252496:141]
   [2019-06-04T13:20:33.927Z] [INFO] [#tid_sid_loadbalancer] [WhiskAction] write all done, caching CacheKey(whisk.system/invokerHealthTestAction0) Cached
   [2019-06-04T13:20:33.995Z] [INFO] [#tid_sid_controller] [KindRestrictor] all kinds are allowed, the white-list is not specified
   [2019-06-04T13:20:34.394Z] [INFO] [#tid_sid_controller] [ActionsApi] actionSequenceLimit '99'
   [2019-06-04T13:23:24.224Z] [ERROR] [Consumer clientId=consumer-health, groupId=health0] Offset commit failed on partition health-0 at offset 0: The coordinator is not aware of this member.
   [2019-06-04T13:23:24.232Z] [ERROR] [#tid_sid_dispatcher] [MessageFeed] failed to commit ping consumer offset: org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
   ```
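The remedy named in that `CommitFailedException` (raise `max.poll.interval.ms`, or shrink the batches with `max.poll.records`) comes down to simple arithmetic: a consumer is evicted from the group when the gap between `poll()` calls exceeds `max.poll.interval.ms`, and that gap is roughly the batch size times the per-record processing time. A sketch of that trade-off, where the per-record cost is an assumption, not something measured here:

```python
def max_safe_poll_records(max_poll_interval_ms, per_record_ms, safety=0.5):
    """Largest max.poll.records that keeps the gap between poll()
    calls under max.poll.interval.ms, with a safety margin.
    Illustrative arithmetic only, not Kafka client code."""
    return int(max_poll_interval_ms * safety // per_record_ms)

# With Kafka's default max.poll.interval.ms = 300000 (5 minutes) and
# an assumed 200 ms to handle each record:
print(max_safe_poll_records(300_000, 200))  # 750
```

If the per-record cost can't be bounded, raising `max.poll.interval.ms` is the other lever the exception mentions.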
   
   This means the issue arises between 13:17:47 and 13:19:47. Is it caused by Kafka creating the `events` topic? And how can I handle it?
   
   Any response will be appreciated!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
