beam-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Kevin Peterson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-3881) Failure reading backlog in KinesisIO
Date Wed, 21 Mar 2018 20:45:00 GMT

    [ https://issues.apache.org/jira/browse/BEAM-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16408557#comment-16408557
] 

Kevin Peterson commented on BEAM-3881:
--------------------------------------

1. It's happened on both deploys so far. I'll see if it happens when I push next and update.
2. It happens immediately upon deploy. It doesn't appear fatal, and the pipeline starts processing
messages. The error only happens at startup, and then go away - the pipeline appears to start
processing messages like normal, and autoscales as expected.
3. Not very many. I don't know the exact number, but Dataflow only uses a single worker to
process them.
4. A Joda Instant less than a day old.
5. Let me work on that.

Based on your questions, I'd guess that maybe something isn't being initialized correctly
leading to the error. Once it is initialized, things run fine. Maybe the Kinesis current timestamp
estimation starts at 0?

> Failure reading backlog in KinesisIO
> ------------------------------------
>
>                 Key: BEAM-3881
>                 URL: https://issues.apache.org/jira/browse/BEAM-3881
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-kinesis
>            Reporter: Kevin Peterson
>            Assignee: Alexey Romanenko
>            Priority: Major
>
> I'm gettingĀ an error when reading from Kinesis in my pipeline. Using Beam v2.3, running
on Google Cloud Dataflow.
> I'm constructing the source via:
> {code:java}
> KinesisIO.Read read = KinesisIO
>                 .read()
>                 .withAWSClientsProvider(
>                     configuration.getAwsAccessKeyId(),
>                     configuration.getAwsSecretAccessKey(),
>                     region)
>                 .withStreamName(configuration.getKinesisStream())
>                 .withUpToDateThreshold(Duration.standardMinutes(30))
>                 .withInitialTimestampInStream(configuration.getStartTime());
> {code}
> The exception is:
> {noformat}
> Mar 19, 2018 12:54:41 PM org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler
process
> SEVERE: 2018-03-19T19:54:53.010Z: (2896b8774de760ec): java.lang.RuntimeException: Unknown
kinesis failure, when trying to reach kinesis
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.wrapExceptions(SimplifiedKinesisClient.java:223)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:161)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:150)
> org.apache.beam.sdk.io.kinesis.KinesisReader.getTotalBacklogBytes(KinesisReader.java:200)
> com.google.cloud.dataflow.worker.StreamingModeExecutionContext.flushState(StreamingModeExecutionContext.java:398)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1199)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:137)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:940)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ArithmeticException: Value cannot fit in an int: 153748225435
> org.joda.time.field.FieldUtils.safeToInt(FieldUtils.java:206)
> org.joda.time.field.BaseDurationField.getDifference(BaseDurationField.java:141)
> org.joda.time.base.BaseSingleFieldPeriod.between(BaseSingleFieldPeriod.java:72)
> org.joda.time.Minutes.minutesBetween(Minutes.java:101)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.lambda$getBacklogBytes$3(SimplifiedKinesisClient.java:163)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.wrapExceptions(SimplifiedKinesisClient.java:205)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:161)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:150)
> org.apache.beam.sdk.io.kinesis.KinesisReader.getTotalBacklogBytes(KinesisReader.java:200)
> com.google.cloud.dataflow.worker.StreamingModeExecutionContext.flushState(StreamingModeExecutionContext.java:398)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1199)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:137)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:940)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> java.lang.Thread.run(Thread.java:745){noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message