beam-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Etienne Chauchot (JIRA)" <>
Subject [jira] [Created] (BEAM-4825) Nexmark query3 is flaky on Direct runner
Date Thu, 19 Jul 2018 10:39:00 GMT
Etienne Chauchot created BEAM-4825:

             Summary: Nexmark query3 is flaky on Direct runner
                 Key: BEAM-4825
             Project: Beam
          Issue Type: Bug
          Components: runner-direct
            Reporter: Etienne Chauchot
            Assignee: Henning Rohde

Query3 exercises state and timers. It asks this question to Nexmark auction system:
Who is selling in particular US states?

And the sketch of its code is:
 * Apply global window to events with trigger repeatedly after at least nbEvents in pane =>
 results will be materialized each time nbEvents are received.
 * input1: collection of auctions events filtered by category and keyed by seller id
 * input2: collection of persons events filtered by US state codes and keyed by person id
 * CoGroupByKey to group auctions and persons by personId/sellerId + tags to distinguish persons
and auctions
 * ParDo to do the incremental join: auctions and person events can arrive out of order
 * person element stored in persistent state in order to match future auctions by that person.
Set a timer to clear the person state after a TTL
 * auction elements stored in persistent state until we have seen the corresponding person
record. Then, it can be output and cleared

 * output NameCityStateId(,, person.state, objects
*The output size should be constant and it is not.*
*The output size of this query is different between batch and streaming modes*
See query 3 dashboard in this graph:

This message was sent by Atlassian JIRA

View raw message