beam-commits mailing list archives

From "Frank Yellin (JIRA)" <j...@apache.org>
Subject [jira] [Created] (BEAM-536) Aggregator.py. More misleading documentation. More bad documentation
Date Sat, 06 Aug 2016 00:06:20 GMT
Frank Yellin created BEAM-536:
---------------------------------

             Summary: Aggregator.py.  More misleading documentation.  More bad documentation
                 Key: BEAM-536
                 URL: https://issues.apache.org/jira/browse/BEAM-536
             Project: Beam
          Issue Type: Bug
            Reporter: Frank Yellin
            Priority: Minor


The last paragraph of the documentation for Aggregator is:

You can also query the combined value(s) of an aggregator by calling
aggregated_value() or aggregated_values() on the result object returned after
running a pipeline.

There are multiple problems in this one sentence!

#1) There is no such method aggregated_value() that I can find anywhere.

#2) DirectRunner implements aggregated_values(), but DirectPipelineRunner does not.  The latter
is the far more interesting case.

#3) When I use a BlockingDirectPipelineRunner and ask for its aggregated_values(), I get an
error message indicating that this is not implemented in DirectPipelineRunner.  Very confusing
since I never asked for a DirectPipelineRunner.

It is clear that this is because BlockingDirectPipelineRunner is a method rather than a class.
Is this really the right thing?  Will there be other confusing error messages?
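
To illustrate the mechanism, here is a minimal sketch. The names SketchDirectRunner and SketchBlockingRunner are hypothetical stand-ins, not the SDK's classes, and the error text is made up. It only shows that when a "runner" is really a function returning an instance of some other class, any error raised later names that other class.

```python
# Hypothetical stand-ins for illustration only; these are NOT the SDK classes.
class SketchDirectRunner(object):
    def aggregated_values(self, aggregator_name):
        # The message is built from the concrete class of the object.
        raise NotImplementedError(
            "%s does not implement aggregated_values" % type(self).__name__)


def SketchBlockingRunner():
    # Named like a class, but it is a plain function. The object it
    # returns carries no trace of the name the caller used.
    return SketchDirectRunner()


runner = SketchBlockingRunner()
try:
    runner.aggregated_values("my_counter")
except NotImplementedError as exc:
    message = str(exc)

print(type(runner).__name__)
print(message)
```

Both printed lines mention only the class that was actually instantiated, never the name the caller typed.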

#4) The documentation for aggregated_values() says "returns a dict of step names to values
of the aggregator."  I have no idea what a "step" means in this context.  In practice, it
seems to be a single-element dictionary whose key is 'user--' prefixed onto the aggregator
name.  
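
For concreteness, the shape I observed can be sketched as below. The helper observed_result_shape is hypothetical and exists only to show the dictionary layout described above. It is not how the SDK builds the result, and the aggregator name and value are made-up examples.

```python
def observed_result_shape(aggregator_name, value):
    # Hypothetical helper: mimics only the layout seen in practice,
    # a single-element dict keyed by 'user--' + aggregator name.
    return {'user--' + aggregator_name: value}


result = observed_result_shape('total', 3)
print(result)
```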



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
