beam-commits mailing list archives

From "Davor Bonaci (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-1126) Expose UnboundedSource split backlog in number of events
Date Sun, 11 Dec 2016 21:08:58 GMT

    [ https://issues.apache.org/jira/browse/BEAM-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15740357#comment-15740357 ]

Davor Bonaci commented on BEAM-1126:
------------------------------------

Interesting perspective, thanks [~aviemzur].

I think the primary design goal of the current API was to enable dynamic optimizations, as
opposed to monitoring scenarios. The general idea was that the source should provide an indication
of the amount of pending work, and it was probably thought that the size in bytes correlates better
with "work" than the number of elements does. Basically, it was intended that the
consumer of the data be the runner, not the user.

That said, monitoring scenarios are possibly even more important. I think the idea there was
that the source should publish monitoring metrics directly through Beam abstractions in a
runner-independent way. Then, all runners would get this benefit, with no particular work
required, in a metric that makes sense for that source. (However, I don't think a source can
do this today -- but this could be a different approach to the same problem.)

Anyways, I'm sure [~dhalperi@google.com] will comment more ;-)

> Expose UnboundedSource split backlog in number of events
> --------------------------------------------------------
>
>                 Key: BEAM-1126
>                 URL: https://issues.apache.org/jira/browse/BEAM-1126
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>            Reporter: Aviem Zur
>            Assignee: Daniel Halperin
>            Priority: Minor
>
> Today {{UnboundedSource}} exposes split backlog in bytes via {{getSplitBacklogBytes()}}.
> There is value in exposing backlog in number of events as well, since this number can
> be more human comprehensible than bytes. Something like {{getSplitBacklogEvents()}} or {{getSplitBacklogCount()}}.
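
To make the proposal concrete, here is a minimal, self-contained sketch of a split that reports its backlog both in bytes and in number of events. It does not use the Beam SDK: the `BacklogSketch` class, the `BacklogReporter` interface, and `InMemorySplit` are hypothetical names invented for illustration, and the `-1` "unknown" default for the event count is an assumption of this sketch, not a description of how Beam would implement {{getSplitBacklogEvents()}}.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: none of these types are part of the Beam SDK.
class BacklogSketch {
    // Sentinel meaning "backlog not known"; the value is an assumption of this sketch.
    static final long BACKLOG_UNKNOWN = -1L;

    /** Hypothetical reader-side view of the pending work for one split. */
    interface BacklogReporter {
        // Backlog in bytes, the measure the issue says is exposed today.
        long getSplitBacklogBytes();

        // Backlog in events, as proposed in the issue. Defaulting to "unknown"
        // lets existing implementations keep compiling unchanged.
        default long getSplitBacklogEvents() {
            return BACKLOG_UNKNOWN;
        }
    }

    /** In-memory split that tracks its pending records and their total size. */
    static final class InMemorySplit implements BacklogReporter {
        private final Deque<byte[]> pending = new ArrayDeque<>();
        private long pendingBytes = 0L;

        // Adds a record to the backlog, counting its UTF-8 encoded size.
        void enqueue(String record) {
            byte[] bytes = record.getBytes(StandardCharsets.UTF_8);
            pending.addLast(bytes);
            pendingBytes += bytes.length;
        }

        // Removes and returns the oldest record, or null if the backlog is empty.
        String poll() {
            byte[] bytes = pending.pollFirst();
            if (bytes == null) {
                return null;
            }
            pendingBytes -= bytes.length;
            return new String(bytes, StandardCharsets.UTF_8);
        }

        @Override
        public long getSplitBacklogBytes() {
            return pendingBytes;
        }

        @Override
        public long getSplitBacklogEvents() {
            return pending.size();
        }
    }

    public static void main(String[] args) {
        InMemorySplit split = new InMemorySplit();
        split.enqueue("small");
        split.enqueue("a considerably larger record");
        System.out.println("bytes=" + split.getSplitBacklogBytes()
            + " events=" + split.getSplitBacklogEvents());
    }
}
```

The two numbers answer different questions: the byte count is closer to the amount of work a runner still has to do, while the event count is the figure a person watching a dashboard is more likely to understand.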



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
