beam-commits mailing list archives

From "Kenneth Knowles (JIRA)" <>
Subject [jira] [Commented] (BEAM-735) PAssertStreaming should make sure the assertion happened.
Date Sun, 09 Oct 2016 21:01:21 GMT


Kenneth Knowles commented on BEAM-735:

At a high level, this already describes the relationship between {{PAssert}} and {{TestPipeline}}
today, since I did a minor rewrite of {{PAssert}}. I'd love to see if we can consolidate these.

> PAssertStreaming should make sure the assertion happened.
> ---------------------------------------------------------
>                 Key: BEAM-735
>                 URL:
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>            Reporter: Amit Sela
>            Assignee: Amit Sela
> The Spark runner currently runs PAsserts via `PAssertStreaming`, which groups the values
> into a single key and asserts on them on the worker (part of the "lambda", in Spark lingo).
> This could be a problem because Spark won't run if there's nothing to process. If for some
> reason the input is missed, say when reading from Kafka's latest offset or from an empty
> topic, the assertion is skipped and the test never fails (we would like it to fail if
> there was no input when we expected some).
> This might change once Spark provides better support for the Beam model in streaming,
> but until then our tests should account for this case as well.
> I'll add an aggregator and increment it on each assertion. At the end I'll check that the
> aggregator is not 0, so that at least one assertion took place (if for some reason Spark
> keeps running for a couple more intervals, it might execute the same assertion more than
> once, if the input is repeated).
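
The aggregator idea in the description above could be sketched roughly as follows. This is plain Python and does not use the Beam or Spark APIs; `AssertionCounter`, `run_streaming_assertions`, and `verify_assertions_ran` are hypothetical names chosen for illustration, with the counter standing in for a runner aggregator and each list standing in for one streaming interval.

```python
import threading


class AssertionCounter:
    """Hypothetical stand-in for an aggregator: counts executed assertions."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def increment(self):
        with self._lock:
            self._count += 1

    @property
    def value(self):
        with self._lock:
            return self._count


def run_streaming_assertions(batches, expected, counter):
    """Assert on every non-empty batch and count each assertion that ran.

    Empty batches are skipped, mimicking the case where there is nothing
    to process and the assertion code is never invoked.
    """
    for batch in batches:
        if not batch:
            continue
        assert sorted(batch) == sorted(expected), f"unexpected values: {batch}"
        counter.increment()


def verify_assertions_ran(counter):
    """Fail the test if no assertion was executed at all."""
    if counter.value == 0:
        raise AssertionError("no assertion executed: expected input but saw none")
```

With repeated input the counter may end up greater than 1, which is why the final check is "not 0" rather than "exactly 1".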

This message was sent by Atlassian JIRA
