From: Cody Koeninger
Date: Tue, 17 Apr 2018 17:34:25 -0500
Subject: Re: Structured streaming: Tried to fetch $offset but the returned record offset was ${record.offset}"
To: ARAVIND SETHURATHNAM
Cc: user@spark.apache.org

Is this possibly related to the recent post on
https://issues.apache.org/jira/browse/SPARK-18057 ?

On Mon, Apr 16, 2018 at 11:57 AM, ARAVIND SETHURATHNAM
<asethurathnam@homeaway.com.invalid> wrote:
> Hi,
>
> We have several structured streaming jobs (Spark version 2.2.0) consuming
> from Kafka and writing to S3. They had been running fine for a month, but
> since yesterday a few jobs have started failing, and I see the exception
> below in the failed jobs' logs:
>
> ```
> Tried to fetch 473151075 but the returned record offset was 473151072
>
> DAGScheduler: ResultStage 0 (start at SparkStreamingTask.java:222) failed
> in 77.546 s due to Job aborted due to stage failure: Task 86 in stage 0.0
> failed 4 times, most recent failure: Lost task 86.3 in stage 0.0 (TID 96,
> ip-10-120-12-52.ec2.internal, executor 11): java.lang.IllegalStateException:
> Tried to fetch 473151075 but the returned record offset was 473151072
>     at org.apache.spark.sql.kafka010.CachedKafkaConsumer.fetchData(CachedKafkaConsumer.scala:234)
>     at org.apache.spark.sql.kafka010.CachedKafkaConsumer.get(CachedKafkaConsumer.scala:106)
>     at org.apache.spark.sql.kafka010.KafkaSourceRDD$$anon$1.getNext(KafkaSourceRDD.scala:158)
>     at org.apache.spark.sql.kafka010.KafkaSourceRDD$$anon$1.getNext(KafkaSourceRDD.scala:149)
>     at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
>     at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>     at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>     at scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:30)
> ```
>
> Can someone provide some direction on what could be causing this all of a
> sudden when consuming from those topics?
>
> regards
>
> Aravind
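
As context for the setup described above (several Spark 2.2.0 structured streaming jobs consuming from Kafka and writing to S3), here is a minimal sketch of that kind of query. The broker list, topic, S3 paths, and trigger interval are placeholders, not details from the original jobs.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

// Minimal Kafka-to-S3 structured streaming job in the shape described in the
// thread above. All names and paths below are illustrative placeholders.
object KafkaToS3Job {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-to-s3-structured-streaming")
      .getOrCreate()

    // Source: subscribe to a Kafka topic and project the payload as strings.
    val records = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker-1:9092,broker-2:9092")
      .option("subscribe", "events")
      .option("startingOffsets", "latest")
      .load()
      .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")

    // Sink: write Parquet files to S3, with the checkpoint also on S3 so the
    // query can recover its Kafka offsets after a restart.
    val query = records.writeStream
      .format("parquet")
      .option("path", "s3a://example-bucket/output/events/")
      .option("checkpointLocation", "s3a://example-bucket/checkpoints/events/")
      .trigger(Trigger.ProcessingTime("1 minute"))
      .start()

    query.awaitTermination()
  }
}
```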
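
As for the exception itself: the Kafka source asks the cached consumer for records starting at an exact offset, and it treats any mismatch between the requested offset and the offset of the record it gets back as a fatal inconsistency. The snippet below only illustrates that invariant; it is a paraphrase for explanation, not the actual code in CachedKafkaConsumer.fetchData.

```scala
import org.apache.kafka.clients.consumer.ConsumerRecord

// Illustration of the invariant behind the IllegalStateException above
// (a paraphrase, not the Spark source): the record the consumer returns must
// sit exactly at the offset the source asked for, otherwise the source's
// offset bookkeeping can no longer be trusted and the task is failed.
object OffsetCheck {
  def verifyFetchedOffset(requested: Long,
                          record: ConsumerRecord[Array[Byte], Array[Byte]]): Unit = {
    if (record.offset != requested) {
      throw new IllegalStateException(
        s"Tried to fetch $requested but the returned record offset was ${record.offset}")
    }
  }
}
```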