From issues-return-187542-archive-asf-public=cust-asf.ponee.io@spark.apache.org Wed Mar 21 09:39:07 2018 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by mx-eu-01.ponee.io (Postfix) with SMTP id 8A88B18076D for ; Wed, 21 Mar 2018 09:39:06 +0100 (CET) Received: (qmail 31642 invoked by uid 500); 21 Mar 2018 08:39:05 -0000 Mailing-List: contact issues-help@spark.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list issues@spark.apache.org Received: (qmail 31160 invoked by uid 99); 21 Mar 2018 08:39:04 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 21 Mar 2018 08:39:04 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id 657261A17CE for ; Wed, 21 Mar 2018 08:39:04 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -102.311 X-Spam-Level: X-Spam-Status: No, score=-102.311 tagged_above=-999 required=6.31 tests=[RCVD_IN_DNSWL_MED=-2.3, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01, USER_IN_WHITELIST=-100] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id o9Ct2Mr1tfY9 for ; Wed, 21 Mar 2018 08:39:02 +0000 (UTC) Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTP id 957C55FDAB for ; Wed, 21 Mar 2018 08:39:02 +0000 (UTC) Received: from jira-lw-us.apache.org (unknown [207.244.88.139]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id E0B6DE0D76 for ; Wed, 21 Mar 2018 08:39:01 +0000 (UTC) Received: from jira-lw-us.apache.org (localhost [127.0.0.1]) by jira-lw-us.apache.org (ASF Mail Server at jira-lw-us.apache.org) with ESMTP id B4FC8214CC for ; Wed, 21 Mar 2018 08:39:00 +0000 (UTC) Date: Wed, 21 Mar 2018 08:39:00 +0000 (UTC) From: "Florencio (JIRA)" To: issues@spark.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (SPARK-23739) Spark structured streaming long running problem MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/SPARK-23739?page=3Dcom.atlassia= n.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=3D164= 07611#comment-16407611 ]=20 Florencio commented on SPARK-23739: ----------------------------------- Thanks for your response. I am getting this error after some time, it could= be hours or days. The spark-submit command is: _export SPARK_MAJOR_VERSION=3D2_ _spark-submit --master yarn --deploy-mode client \_ _--executor-cores 6 --num-executors 4 --driver-memory 2G --executor-memory = 2G \_ _--jars /usr/hdp/current/hbase-client/lib/hbase-client-1.1.2.2.6.1.0-129.ja= r,/usr/hdp/current/hbase-client/lib/hbase-common-1.1.2.2.6.1.0-129.jar,/usr= /hdp/current/hbase-client/lib/hbase-server-1.1.2.2.6.1.0-129.jar,/usr/hdp/c= urrent/hbase-client/lib/hbase-protocol-1.1.2.2.6.1.0-129.jar \_ _--class=C2=A0classname jarname=C2=A0\_ _-topic=3Dtopicname\_ _-tablehbase=3D=C2=A0namehbasetable\_ _-checkpointLocation=3D/tmp/PassingStructuredStreamingEntrate \_ =C2=A0 Some useful information could be the configuration of the structured stream= : _val ds1 =3D spark.readStream_ _.format("kafka")_ _.option("kafka.bootstrap.servers", "serversnames")_ _.option("startingOffsets","latest")_ _.option("failOnDataLoss","false")_=C2=A0 _.option("fetchOffset.numRetries",5)_ _.option("subscribe", topic)_ _.load()_ =C2=A0 _val writer =3D new HbaseSink(tablehbase)_ _val query =3D results.writeStream.foreach(writer)_ _.outputMode("complete")_ _.option("checkpointLocation", checkpointLocation)_ _.start()_ =C2=A0 =C2=A0 Thanks. =C2=A0 =C2=A0 > Spark structured streaming long running problem > ----------------------------------------------- > > Key: SPARK-23739 > URL: https://issues.apache.org/jira/browse/SPARK-23739 > Project: Spark > Issue Type: Bug > Components: Spark Core > Affects Versions: 2.1.0 > Reporter: Florencio > Priority: Critical > Labels: spark, streaming, structured > > I had a problem with long running spark structured streaming in spark 2.1= . Caused by: java.lang.ClassNotFoundException: org.apache.kafka.common.requ= ests.LeaveGroupResponse. > The detailed error is the following: > 18/03/16 16:10:57 INFO StreamExecution: Committed offsets for batch 2110.= Metadata OffsetSeqMetadata(0,1521216656590) > 18/03/16 16:10:57 INFO KafkaSource: GetBatch called with start =3D Some(\= {"TopicName":{"2":5520197,"1":5521045,"3":5522054,"0":5527915}}), end =3D \= {"TopicName":{"2":5522730,"1":5523577,"3":5524586,"0":5530441}} > 18/03/16 16:10:57 INFO KafkaSource: Partitions added: Map() > 18/03/16 16:10:57 ERROR StreamExecution: Query [id =3D a233b9ff-cc39-44d3= -b953-a255986c04bf, runId =3D 8520e3c0-2455-4ac1-9021-8518fb58b3f8] termina= ted with error > java.util.zip.ZipException: invalid code lengths set > at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:164) > at java.io.FilterInputStream.read(FilterInputStream.java:133) > at java.io.FilterInputStream.read(FilterInputStream.java:107) > at org.apache.spark.util.Utils$$anonfun$copyStream$1.apply$mcJ$sp(Utils.= scala:354) > at org.apache.spark.util.Utils$$anonfun$copyStream$1.apply(Utils.scala:3= 22) > at org.apache.spark.util.Utils$$anonfun$copyStream$1.apply(Utils.scala:3= 22) > at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1303) > at org.apache.spark.util.Utils$.copyStream(Utils.scala:362) > at org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.s= cala:45) > at org.apache.spark.util.ClosureCleaner$.getInnerClosureClasses(ClosureC= leaner.scala:83) > at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCl= eaner$$clean(ClosureCleaner.scala:173) > at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108) > at org.apache.spark.SparkContext.clean(SparkContext.scala:2101) > at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:370) > at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:369) > at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.s= cala:151) > at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.s= cala:112) > at org.apache.spark.rdd.RDD.withScope(RDD.scala:362) > at org.apache.spark.rdd.RDD.map(RDD.scala:369) > at org.apache.spark.sql.kafka010.KafkaSource.getBatch(KafkaSource.scala:= 287) > at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org= $apache$spark$sql$execution$streaming$StreamExecution$$runBatch$2$$anonfun$= apply$6.apply(StreamExecution.scala:503) > at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org= $apache$spark$sql$execution$streaming$StreamExecution$$runBatch$2$$anonfun$= apply$6.apply(StreamExecution.scala:499) > at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(Traversable= Like.scala:241) > at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(Traversable= Like.scala:241) > at scala.collection.Iterator$class.foreach(Iterator.scala:893) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1336) > at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) > at org.apache.spark.sql.execution.streaming.StreamProgress.foreach(Strea= mProgress.scala:25) > at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:= 241) > 18/03/16 16:10:57 ERROR ClientUtils: Failed to close coordinator > java.lang.NoClassDefFoundError: org/apache/kafka/common/requests/LeaveGro= upResponse > at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendL= eaveGroupRequest(AbstractCoordinator.java:575) > at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.maybe= LeaveGroup(AbstractCoordinator.java:566) > at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.close= (AbstractCoordinator.java:555) > at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.close= (ConsumerCoordinator.java:377) > at org.apache.kafka.clients.ClientUtils.closeQuietly(ClientUtils.java:66= ) > at org.apache.kafka.clients.consumer.KafkaConsumer.close(KafkaConsumer.j= ava:1383) > at org.apache.kafka.clients.consumer.KafkaConsumer.close(KafkaConsumer.j= ava:1364) > at org.apache.spark.sql.kafka010.KafkaSource.stop(KafkaSource.scala:311) > at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$sto= pSources$1.apply(StreamExecution.scala:574) > at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$sto= pSources$1.apply(StreamExecution.scala:572) > at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.= scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) > at org.apache.spark.sql.execution.streaming.StreamExecution.stopSources(= StreamExecution.scala:572) > at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$s= park$sql$execution$streaming$StreamExecution$$runBatches(StreamExecution.sc= ala:325) > at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(= StreamExecution.scala:191) > Caused by: java.lang.ClassNotFoundException: org.apache.kafka.common.requ= ests.LeaveGroupResponse > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > ... 15 more > 18/03/16 16:10:57 WARN StreamExecution: Failed to stop streaming source: = KafkaSource[Subscribe[TPusciteStazMinuto]]. Resources may have leaked. > org.apache.kafka.common.KafkaException: Failed to close kafka consumer > =C2=A0 > =C2=A0 > =C2=A0 -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org For additional commands, e-mail: issues-help@spark.apache.org