beam-commits mailing list archives

From "Aviem Zur (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (BEAM-2812) Dropped windows counters / log prints no longer working
Date Mon, 28 Aug 2017 07:54:00 GMT

     [ https://issues.apache.org/jira/browse/BEAM-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aviem Zur updated BEAM-2812:
----------------------------
    Description: 
Aggregators were removed from the Spark runner in https://github.com/apache/beam/pull/2838, which
caused a regression in the dropped-windows counters and log messages.

{{CounterCell}} instances are created ad hoc instead of using the {{Metrics}} class static
factory methods: [SparkGroupAlsoByWindowViaWindowSet.java#L213-L219|https://github.com/apache/beam/blob/v2.1.0/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L213-L219]
The context in which the metrics are reported isn't taken into account. In addition, these counters
are passed to a lazily evaluated iterator [SparkGroupAlsoByWindowViaWindowSet.java#L221-L223|https://github.com/apache/beam/blob/v2.1.0/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L221-L223],
but the subsequent code reads the counters immediately after initialization, before the iterator
has been consumed and the counters populated. The conditional statements therefore check counters
that have not been updated yet, so the log messages are never printed [SparkGroupAlsoByWindowViaWindowSet.java#L323-L333|https://github.com/apache/beam/blob/v2.1.0/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L323-L333].
What we want is for these counts to be exposed as metrics as well as in logs.
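
The lazy-evaluation problem described above can be reproduced outside of Beam. The following is only a minimal sketch under stated assumptions, not the runner's actual code: {{LazyCounterDemo}}, {{dropLazily}} and the use of {{AtomicLong}} as a stand-in for {{CounterCell}} are all hypothetical, and negative integers stand in for expired windows. It shows that a counter handed to a lazy iterator is still zero if it is read before the iterator is consumed.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicLong;

public class LazyCounterDemo {

  /** Lazily skips negative values (stand-ins for expired windows), counting each skip. */
  public static Iterator<Integer> dropLazily(Iterator<Integer> input, AtomicLong dropped) {
    return new Iterator<Integer>() {
      private Integer next;

      @Override
      public boolean hasNext() {
        // Work (and counting) only happens when a caller pulls from the iterator.
        while (next == null && input.hasNext()) {
          Integer candidate = input.next();
          if (candidate < 0) {
            dropped.incrementAndGet();
          } else {
            next = candidate;
          }
        }
        return next != null;
      }

      @Override
      public Integer next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        Integer result = next;
        next = null;
        return result;
      }
    };
  }

  /** Reads the counter right after wiring up the iterator, as in the reported bug. */
  public static long countBeforeConsumption(List<Integer> data) {
    AtomicLong dropped = new AtomicLong();
    dropLazily(data.iterator(), dropped); // created but never consumed
    return dropped.get();
  }

  /** Reads the counter only after the iterator has been fully consumed. */
  public static long countAfterConsumption(List<Integer> data) {
    AtomicLong dropped = new AtomicLong();
    Iterator<Integer> it = dropLazily(data.iterator(), dropped);
    while (it.hasNext()) {
      it.next();
    }
    return dropped.get();
  }

  public static void main(String[] args) {
    List<Integer> data = Arrays.asList(1, -2, 3, -4, -5);
    System.out.println("before consumption: " + countBeforeConsumption(data)); // prints 0
    System.out.println("after consumption:  " + countAfterConsumption(data)); // prints 3
  }
}
```

Any check of the form "if dropped > 0 then log" placed where {{countBeforeConsumption}} reads the counter can never fire, which matches the behaviour reported here.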

Additionally, {{org.apache.beam.runners.core.LateDataUtils#dropExpiredWindows}} now takes
a {{CounterCell}} as a parameter. {{CounterCell}} is part of the metrics implementation and should
generally not be used elsewhere (its Javadoc says so as well). We should look into changing
this method to use something else, and perhaps make {{CounterCell}} and similar classes package
private (and move the runner code which uses them into the same package).
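
One possible shape for "something else" is a small callback that the caller supplies, so the utility no longer depends on a metrics implementation class. This is only an illustrative sketch, not a proposal for the actual signature: {{DropCallbackSketch}}, {{DropListener}} and {{dropExpired}} are hypothetical names that do not exist in Beam, and windows are reduced to their end timestamps.

```java
import java.util.ArrayList;
import java.util.List;

public class DropCallbackSketch {

  /** Hypothetical callback, not part of Beam; invoked once per dropped window. */
  public interface DropListener {
    void onDropped();
  }

  /**
   * Returns the window end timestamps that are not before the cutoff,
   * notifying the listener for every timestamp that is dropped.
   */
  public static List<Long> dropExpired(
      List<Long> windowEnds, long cutoff, DropListener listener) {
    List<Long> kept = new ArrayList<>();
    for (long end : windowEnds) {
      if (end < cutoff) {
        listener.onDropped(); // the caller decides how to count or log this
      } else {
        kept.add(end);
      }
    }
    return kept;
  }
}
```

With this shape a runner could pass a lambda that increments a counter obtained from the {{Metrics}} factory methods and also feeds a log message, without the utility knowing about either.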

  was:
In https://github.com/apache/beam/pull/2838 aggregators were removed from Spark runner, this
caused regression around dropped windows counters and logs.

{{CounterCell}} instances are created ad hoc instead of using the {{Metrics}} class static
factory methods: [SparkGroupAlsoByWindowViaWindowSet.java#L213-L219|https://github.com/apache/beam/blob/v2.1.0/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L213-L219]
Context of where the metrics are reported isn't taken into account, and since these counters
are being passed to a lazily evaluated iterator [SparkGroupAlsoByWindowViaWindowSet.java#L221-L223|https://github.com/apache/beam/blob/v2.1.0/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L221-L223]
the subsequent code which looks at the counters is always looking at these counters immediately
after initialization, before they are populated, so these prints will never happen since the
conditional statements do not check on the right counters [SparkGroupAlsoByWindowViaWindowSet.java#L323-L333|https://github.com/apache/beam/blob/v2.1.0/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L323-L333].

Additionally, {{org.apache.beam.runners.core.LateDataUtils#dropExpiredWindows}} now takes
a {{CounterCell}} as a parameter, which is a class for metrics implementation and should generally
not be used elsewhere (this is also mentioned in its Javadoc), we should look into changing
this method to use something else and perhaps make {{CounterCell}} and similar classes package
private (And change runner code which uses these to be in the same package).


> Dropped windows counters / log prints no longer working
> -------------------------------------------------------
>
>                 Key: BEAM-2812
>                 URL: https://issues.apache.org/jira/browse/BEAM-2812
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>            Reporter: Aviem Zur
>            Assignee: Amit Sela



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
