Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id A8304200BF1 for ; Tue, 3 Jan 2017 18:09:00 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id A6E21160B43; Tue, 3 Jan 2017 17:09:00 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id A3450160B20 for ; Tue, 3 Jan 2017 18:08:59 +0100 (CET) Received: (qmail 60728 invoked by uid 500); 3 Jan 2017 17:08:58 -0000 Mailing-List: contact dev-help@kafka.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@kafka.apache.org Delivered-To: mailing list dev@kafka.apache.org Received: (qmail 60699 invoked by uid 99); 3 Jan 2017 17:08:58 -0000 Received: from arcas.apache.org (HELO arcas) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 03 Jan 2017 17:08:58 +0000 Received: from arcas.apache.org (localhost [127.0.0.1]) by arcas (Postfix) with ESMTP id 704B02C0453 for ; Tue, 3 Jan 2017 17:08:58 +0000 (UTC) Date: Tue, 3 Jan 2017 17:08:58 +0000 (UTC) From: "Matthias J. Sax (JIRA)" To: dev@kafka.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Resolved] (KAFKA-4582) KStream job fails in multi-thread mode MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Tue, 03 Jan 2017 17:09:00 -0000 [ https://issues.apache.org/jira/browse/KAFKA-4582?page=3Dcom.atlassia= n.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-4582. ------------------------------------ Resolution: Duplicate > KStream job fails in multi-thread mode=20 > --------------------------------------- > > Key: KAFKA-4582 > URL: https://issues.apache.org/jira/browse/KAFKA-4582 > Project: Kafka > Issue Type: Bug > Components: streams > Affects Versions: 0.10.0.1 > Environment: t2.large EC2 AWS instance > Reporter: Manuel Romero > > We are experimenting an issue with KStream when using more than one threa= d. Currently, we are using 2 threads in a t2.large EC2 AWS instance. Accord= ing to the searches we made, this error should be solved in the version we = are currently using. It seems the job fails to rebalance due to an error on= the lock on the state directory. This is the full log. > 16/12/19 16:36:36 INFO internals.AbstractCoordinator: (Re-)joining group = dougal_text_pipeline_id =20 > 16/12/19 16:36:36 INFO assignment.TaskAssignor: stream-thread [StreamThre= ad-6] Assigning tasks to clients: {a2a7324e-43da-4b17-878b-53f62a0fd870=3D[= activeTasks: ([]) assignedTasks: ([]) prevActiveTasks: ([0_32, 0_2, 0_68, 0= _38, 0_8, 0_74, 0_44, 0_14, 0_80, 0_50, 0_20, 0_86, 0_56, 0_26, 0_92, 0_62]= ) prevAssignedTasks: ([0_32, 0_2, 0_68, 0_38, 0_8, 0_74, 0_44, 0_14, 0_80, = 0_50, 0_20, 0_86, 0_56, 0_26, 0_92, 0_62]) capacity: 1.0 cost: 0.0]}, prevA= ssignmentBalanced: false, prevClientsUnchanged: false, tasks: [0_0, 0_1, 0_= 2, 0_3, 0_4, 0_5, 0_6, 0_7, 0_8, 0_9, 0_10, 0_11, 0_12, 0_13, 0_14, 0_15, 0= _16, 0_17, 0_18, 0_19, 0_20, 0_21, 0_22, 0_23, 0_24, 0_25, 0_26, 0_27, 0_28= , 0_29, 0_30, 0_31, 0_32, 0_33, 0_34, 0_35, 0_36, 0_37, 0_38, 0_39, 0_40, 0= _41, 0_42, 0_43, 0_44, 0_45, 0_46, 0_47, 0_48, 0_49, 0_50, 0_51, 0_52, 0_53= , 0_54, 0_55, 0_56, 0_57, 0_58, 0_59, 0_60, 0_61, 0_62, 0_63, 0_64, 0_65, 0= _66, 0_67, 0_68, 0_69, 0_70, 0_71, 0_72, 0_73, 0_74, 0_75, 0_76, 0_77, 0_78= , 0_79, 0_80, 0_81, 0_82, 0_83, 0_84, 0_85, 0_86, 0_87, 0_88, 0_89, 0_90, 0= _91, 0_92, 0_93, 0_94, 0_95], replicas: 0 =20 > 16/12/19 16:36:36 INFO assignment.TaskAssignor: stream-thread [StreamThre= ad-6] Assigned with: {a2a7324e-43da-4b17-878b-53f62a0fd870=3D[activeTasks: = ([0_0, 0_1, 0_2, 0_3, 0_4, 0_5, 0_6, 0_7, 0_8, 0_9, 0_10, 0_11, 0_12, 0_13,= 0_14, 0_15, 0_16, 0_17, 0_18, 0_19, 0_20, 0_21, 0_22, 0_23, 0_24, 0_25, 0_= 26, 0_27, 0_28, 0_29, 0_30, 0_31, 0_32, 0_33, 0_34, 0_35, 0_36, 0_37, 0_38,= 0_39, 0_40, 0_41, 0_42, 0_43, 0_44, 0_45, 0_46, 0_47, 0_48, 0_49, 0_50, 0_= 51, 0_52, 0_53, 0_54, 0_55, 0_56, 0_57, 0_58, 0_59, 0_60, 0_61, 0_62, 0_63,= 0_64, 0_65, 0_66, 0_67, 0_68, 0_69, 0_70, 0_71, 0_72, 0_73, 0_74, 0_75, 0_= 76, 0_77, 0_78, 0_79, 0_80, 0_81, 0_82, 0_83, 0_84, 0_85, 0_86, 0_87, 0_88,= 0_89, 0_90, 0_91, 0_92, 0_93, 0_94, 0_95]) assignedTasks: ([0_0, 0_1, 0_2,= 0_3, 0_4, 0_5, 0_6, 0_7, 0_8, 0_9, 0_10, 0_11, 0_12, 0_13, 0_14, 0_15, 0_1= 6, 0_17, 0_18, 0_19, 0_20, 0_21, 0_22, 0_23, 0_24, 0_25, 0_26, 0_27, 0_28, = 0_29, 0_30, 0_31, 0_32, 0_33, 0_34, 0_35, 0_36, 0_37, 0_38, 0_39, 0_40, 0_4= 1, 0_42, 0_43, 0_44, 0_45, 0_46, 0_47, 0_48, 0_49, 0_50, 0_51, 0_52, 0_53, = 0_54, 0_55, 0_56, 0_57, 0_58, 0_59, 0_60, 0_61, 0_62, 0_63, 0_64, 0_65, 0_6= 6, 0_67, 0_68, 0_69, 0_70, 0_71, 0_72, 0_73, 0_74, 0_75, 0_76, 0_77, 0_78, = 0_79, 0_80, 0_81, 0_82, 0_83, 0_84, 0_85, 0_86, 0_87, 0_88, 0_89, 0_90, 0_9= 1, 0_92, 0_93, 0_94, 0_95]) prevActiveTasks: ([]) prevAssignedTasks: ([]) c= apacity: 1.0 cost: 41.600000000000016]} =20 > 16/12/19 16:36:36 INFO internals.AbstractCoordinator: Successfully joined= group dougal_text_pipeline_id with generation 21 =20 > 16/12/19 16:36:36 INFO internals.ConsumerCoordinator: Setting newly assig= ned partitions [crawler-fashion-content-83, crawler-fashion-content-17, cra= wler-fashion-content-50, crawler-fashion-content-91, crawler-fashion-conten= t-25, crawler-fashion-content-58, crawler-fashion-content-33, crawler-fashi= on-content-66, crawler-fashion-content-0, crawler-fashion-content-41, crawl= er-fashion-content-74, crawler-fashion-content-8, crawler-fashion-content-4= 9, crawler-fashion-content-82, crawler-fashion-content-16, crawler-fashion-= content-57, crawler-fashion-content-90, crawler-fashion-content-24, crawler= -fashion-content-65, crawler-fashion-content-32, crawler-fashion-content-73= , crawler-fashion-content-7, crawler-fashion-content-40, crawler-fashion-co= ntent-81, crawler-fashion-content-15, crawler-fashion-content-48, crawler-f= ashion-content-89, crawler-fashion-content-23, crawler-fashion-content-56, = crawler-fashion-content-31, crawler-fashion-content-64, crawler-fashion-con= tent-39, crawler-fashion-content-72, crawler-fashion-content-6, crawler-fas= hion-content-47, crawler-fashion-content-80, crawler-fashion-content-14, cr= awler-fashion-content-55, crawler-fashion-content-88, crawler-fashion-conte= nt-22, crawler-fashion-content-63, crawler-fashion-content-30, crawler-fash= ion-content-71, crawler-fashion-content-5, crawler-fashion-content-38, craw= ler-fashion-content-79, crawler-fashion-content-13, crawler-fashion-content= -46, crawler-fashion-content-87, crawler-fashion-content-21, crawler-fashio= n-content-54, crawler-fashion-content-4, crawler-fashion-content-95, crawle= r-fashion-content-29, crawler-fashion-content-62, crawler-fashion-content-1= 2, crawler-fashion-content-37, crawler-fashion-content-70, crawler-fashion-= content-20, crawler-fashion-content-45, crawler-fashion-content-78, crawler= -fashion-content-28, crawler-fashion-content-53, crawler-fashion-content-86= , crawler-fashion-content-3, crawler-fashion-content-36, crawler-fashion-co= ntent-61, crawler-fashion-content-94, crawler-fashion-content-11, crawler-f= ashion-content-44, crawler-fashion-content-69, crawler-fashion-content-19, = crawler-fashion-content-52, crawler-fashion-content-77, crawler-fashion-con= tent-27, crawler-fashion-content-60, crawler-fashion-content-85, crawler-fa= shion-content-35, crawler-fashion-content-68, crawler-fashion-content-2, cr= awler-fashion-content-93, crawler-fashion-content-43, crawler-fashion-conte= nt-76, crawler-fashion-content-10, crawler-fashion-content-51, crawler-fash= ion-content-84, crawler-fashion-content-18, crawler-fashion-content-59, cra= wler-fashion-content-92, crawler-fashion-content-26, crawler-fashion-conten= t-67, crawler-fashion-content-1, crawler-fashion-content-34, crawler-fashio= n-content-75, crawler-fashion-content-9, crawler-fashion-content-42] for gr= oup dougal_text_pipeline_id =20 > 16/12/19 16:36:36 INFO internals.StreamThread: stream-thread [StreamThrea= d-6] New partitions [[crawler-fashion-content-83, crawler-fashion-content-1= 7, crawler-fashion-content-50, crawler-fashion-content-91, crawler-fashion-= content-25, crawler-fashion-content-58, crawler-fashion-content-33, crawler= -fashion-content-66, crawler-fashion-content-0, crawler-fashion-content-41,= crawler-fashion-content-74, crawler-fashion-content-8, crawler-fashion-con= tent-49, crawler-fashion-content-82, crawler-fashion-content-16, crawler-fa= shion-content-57, crawler-fashion-content-90, crawler-fashion-content-24, c= rawler-fashion-content-65, crawler-fashion-content-32, crawler-fashion-cont= ent-73, crawler-fashion-content-7, crawler-fashion-content-40, crawler-fash= ion-content-81, crawler-fashion-content-15, crawler-fashion-content-48, cra= wler-fashion-content-89, crawler-fashion-content-23, crawler-fashion-conten= t-56, crawler-fashion-content-31, crawler-fashion-content-64, crawler-fashi= on-content-39, crawler-fashion-content-72, crawler-fashion-content-6, crawl= er-fashion-content-47, crawler-fashion-content-80, crawler-fashion-content-= 14, crawler-fashion-content-55, crawler-fashion-content-88, crawler-fashion= -content-22, crawler-fashion-content-63, crawler-fashion-content-30, crawle= r-fashion-content-71, crawler-fashion-content-5, crawler-fashion-content-38= , crawler-fashion-content-79, crawler-fashion-content-13, crawler-fashion-c= ontent-46, crawler-fashion-content-87, crawler-fashion-content-21, crawler-= fashion-content-54, crawler-fashion-content-4, crawler-fashion-content-95, = crawler-fashion-content-29, crawler-fashion-content-62, crawler-fashion-con= tent-12, crawler-fashion-content-37, crawler-fashion-content-70, crawler-fa= shion-content-20, crawler-fashion-content-45, crawler-fashion-content-78, c= rawler-fashion-content-28, crawler-fashion-content-53, crawler-fashion-cont= ent-86, crawler-fashion-content-3, crawler-fashion-content-36, crawler-fash= ion-content-61, crawler-fashion-content-94, crawler-fashion-content-11, cra= wler-fashion-content-44, crawler-fashion-content-69, crawler-fashion-conten= t-19, crawler-fashion-content-52, crawler-fashion-content-77, crawler-fashi= on-content-27, crawler-fashion-content-60, crawler-fashion-content-85, craw= ler-fashion-content-35, crawler-fashion-content-68, crawler-fashion-content= -2, crawler-fashion-content-93, crawler-fashion-content-43, crawler-fashion= -content-76, crawler-fashion-content-10, crawler-fashion-content-51, crawle= r-fashion-content-84, crawler-fashion-content-18, crawler-fashion-content-5= 9, crawler-fashion-content-92, crawler-fashion-content-26, crawler-fashion-= content-67, crawler-fashion-content-1, crawler-fashion-content-34, crawler-= fashion-content-75, crawler-fashion-content-9, crawler-fashion-content-42]]= assigned at the end of consumer rebalance. =20 > 16/12/19 16:36:36 INFO internals.StreamThread: stream-thread [StreamThrea= d-6] Creating active task 0_0 with assigned partitions [[crawler-fashion-co= ntent-0]] =20 > 16/12/19 16:36:36 INFO kafka.KafkaStreamsBackboneCoordinator: get() calle= d =20 > 16/12/19 16:36:36 INFO internals.StreamTask: task [0_0] Initializing stat= e stores =20 > 16/12/19 16:36:36 INFO internals.StreamTask: task [0_0] Initializing proc= essor nodes of the topology =20 > 16/12/19 16:36:36 INFO kafka.KafkaStreamsBackboneValueTransformer: Init c= alled with org.apache.kafka.streams.processor.internals.ProcessorContextImp= l@43350b2e 0_0 =20 > 16/12/19 16:36:36 INFO internals.StreamThread: stream-thread [StreamThrea= d-6] Creating active task 0_1 with assigned partitions [[crawler-fashion-co= ntent-1]] =20 > 16/12/19 16:36:36 INFO kafka.KafkaStreamsBackboneCoordinator: get() calle= d =20 > 16/12/19 16:36:36 INFO internals.StreamTask: task [0_1] Initializing stat= e stores =20 > 16/12/19 16:36:36 INFO internals.StreamTask: task [0_1] Initializing proc= essor nodes of the topology =20 > 16/12/19 16:36:36 INFO kafka.KafkaStreamsBackboneValueTransformer: Init c= alled with org.apache.kafka.streams.processor.internals.ProcessorContextImp= l@5d045471 0_1 =20 > 16/12/19 16:36:36 INFO internals.StreamThread: stream-thread [StreamThrea= d-6] Creating active task 0_2 with assigned partitions [[crawler-fashion-co= ntent-2]] =20 > 16/12/19 16:36:36 INFO kafka.KafkaStreamsBackboneCoordinator: get() calle= d =20 > 16/12/19 16:36:36 INFO internals.StreamTask: task [0_2] Initializing stat= e stores =20 > 16/12/19 16:36:36 INFO internals.StreamTask: task [0_2] Initializing proc= essor nodes of the topology =20 > 16/12/19 16:36:36 INFO kafka.KafkaStreamsBackboneValueTransformer: Init c= alled with org.apache.kafka.streams.processor.internals.ProcessorContextImp= l@215e155a 0_2 =20 > 16/12/19 16:36:36 INFO internals.StreamThread: stream-thread [StreamThrea= d-6] Creating active task 0_3 with assigned partitions [[crawler-fashion-co= ntent-3]] =20 > 16/12/19 16:36:36 INFO kafka.KafkaStreamsBackboneCoordinator: get() calle= d =20 > 16/12/19 16:36:36 INFO internals.StreamTask: task [0_3] Initializing stat= e stores =20 > 16/12/19 16:36:36 INFO internals.StreamTask: task [0_3] Initializing proc= essor nodes of the topology =20 > #011at org.apache.kafka.streams.processor.internals.StreamTask.(Str= eamTask.java:90) =20 > #011at org.apache.kafka.streams.processor.internals.StreamThread.createSt= reamTask(StreamThread.java:633) =20 > #011at org.apache.kafka.streams.processor.internals.StreamThread.addStrea= mTasks(StreamThread.java:660) =20 > #011at org.apache.kafka.streams.processor.internals.StreamThread.access$1= 00(StreamThread.java:69) =20 > #011at org.apache.kafka.streams.processor.internals.StreamThread$1.onPart= itionsAssigned(StreamThread.java:124) =20 > #011at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.on= JoinComplete(ConsumerCoordinator.java:228) =20 > #011at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.jo= inGroupIfNeeded(AbstractCoordinator.java:313) =20 > #011at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.en= sureActiveGroup(AbstractCoordinator.java:277) =20 > #011at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.po= ll(ConsumerCoordinator.java:259) =20 > #011at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaCons= umer.java:1013) =20 > #011at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer= .java:979) =20 > #011at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(= StreamThread.java:407) =20 > #011at org.apache.kafka.streams.processor.internals.StreamThread.run(Stre= amThread.java:242) =20 > Caused by: java.io.IOException: task [0_6] Failed to lock the state direc= tory: /tmp/kafka-streams/dougal_text_pipeline_id/0_6 =20 > #011at org.apache.kafka.streams.processor.internals.ProcessorStateManager= .(ProcessorStateManager.java:101) =20 > #011at org.apache.kafka.streams.processor.internals.AbstractTask.(A= bstractTask.java:69) =20 > #011... 13 more =20 > 16/12/19 16:36:37 ERROR internals.ConsumerCoordinator: User provided list= ener org.apache.kafka.streams.processor.internals.StreamThread$1 for group = dougal_text_pipeline_id failed on partition assignment =20 > org.apache.kafka.streams.errors.ProcessorStateException: task [0_6] Error= while creating the state manager =20 > #011at org.apache.kafka.streams.processor.internals.AbstractTask.(A= bstractTask.java:72) > #011at org.apache.kafka.streams.processor.internals.StreamTask.(Str= eamTask.java:90) =20 > #011at org.apache.kafka.streams.processor.internals.StreamThread.createSt= reamTask(StreamThread.java:633) =20 > #011at org.apache.kafka.streams.processor.internals.StreamThread.addStrea= mTasks(StreamThread.java:660) =20 > #011at org.apache.kafka.streams.processor.internals.StreamThread.access$1= 00(StreamThread.java:69) =20 > #011at org.apache.kafka.streams.processor.internals.StreamThread$1.onPart= itionsAssigned(StreamThread.java:124) =20 > #011at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.on= JoinComplete(ConsumerCoordinator.java:228) =20 > #011at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.jo= inGroupIfNeeded(AbstractCoordinator.java:313) =20 > #011at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.en= sureActiveGroup(AbstractCoordinator.java:277) =20 > #011at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.po= ll(ConsumerCoordinator.java:259) =20 > #011at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaCons= umer.java:1013) =20 > #011at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer= .java:979) =20 > #011at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(= StreamThread.java:407) =20 > #011at org.apache.kafka.streams.processor.internals.StreamThread.run(Stre= amThread.java:242) =20 > Caused by: java.io.IOException: task [0_6] Failed to lock the state direc= tory: /tmp/kafka-streams/dougal_text_pipeline_id/0_6 =20 > #011at org.apache.kafka.streams.processor.internals.ProcessorStateManager= .(ProcessorStateManager.java:101) =20 > #011at org.apache.kafka.streams.processor.internals.AbstractTask.(A= bstractTask.java:69) =20 > #011... 13 more =20 > 16/12/19 16:36:37 INFO internals.StreamThread: stream-thread [StreamThrea= d-6] Shutting down =20 -- This message was sent by Atlassian JIRA (v6.3.4#6332)