beam-commits mailing list archives

Subject [4/8] incubator-beam-site git commit: Add Design Principles (take from the original Beam technical vision document).
Date Wed, 19 Oct 2016 04:06:33 GMT
Add Design Principles (take from the original Beam technical vision document).


Branch: refs/heads/asf-site
Commit: 997834188ecf29b307e195c9c7e8d31fa60b34ff
Parents: 7f234a5
Author: Frances Perry <>
Authored: Mon Oct 3 19:00:03 2016 -0700
Committer: Frances Perry <>
Committed: Tue Oct 18 20:56:39 2016 -0700

 _includes/header.html           |  5 ++--
 contribute/ | 53 ++++++++++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+), 2 deletions(-)
diff --git a/_includes/header.html b/_includes/header.html
index 182b30a..67631a9 100644
--- a/_includes/header.html
+++ b/_includes/header.html
@@ -63,12 +63,13 @@
 			  <li role="separator" class="divider"></li>
 			  <li class="dropdown-header">Basics</li>
 			  <li><a href="{{ site.baseurl }}/contribute/contribution-guide/">Contribution
-			  <li><a href="{{ site.baseurl }}/contribute/testing/">Testing</a></li>
 			  <li><a href="{{ site.baseurl }}/use/mailing-lists/">Mailing Lists</a></li>
               <li><a href="{{ site.baseurl }}/contribute/source-repository/">Source
               <li><a href="{{ site.baseurl }}/use/issue-tracking/">Issue Tracking</a></li>
               <li role="separator" class="divider"></li>
-			  <li class="dropdown-header">Technical Resources</li>
+			  <li class="dropdown-header">Technical References</li>
+			  <li><a href="{{ site.baseurl }}/contribute/testing/">Testing</a></li>
+              <li><a href="{{ site.baseurl }}/contribute/design-principles/">Design
 			  <li><a href="">Technical Vision</a></li>
diff --git a/contribute/ b/contribute/
new file mode 100644
index 0000000..87ddd24
--- /dev/null
+++ b/contribute/
@@ -0,0 +1,53 @@
+layout: default
+title: 'Design Principles in Beam'
+permalink: /contribute/design-principles/
+# Design Principles in the Apache Beam Project
+Joshua Bloch’s [API Design Bumper Stickers](
are a great list of what makes for good API design. In addition, we have specific design principles
we follow in Beam.
+* TOC
+## Use cases
+### Unify the model
+Provide one model that works over both bounded (a.k.a. batch) and unbounded (a.k.a. streaming)
datasets. Pay special attention to windows / triggers / state / timers, which often trip up
folks used to a batch world.  Provide users with the right abstractions to adjust latency
and completeness guarantees to cover both traditional batch and streaming use cases. 
+### Separate data shapes and runtime requirements
+The model should focus on letting users describe their data and processing, without exposing
any details of a specific runtime system. For example, bounded and unbounded describe the
shape of data, but batch and streaming describe the behavior of specific runtime systems.
Good test cases are to imagine a mythical micro-batching runner that sits somewhere between
batch and streaming, or an engine that dynamically switches between streaming and batch depending
on the backlog.
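
As a rough illustration of separating data shape from runtime behavior, the sketch below applies the same windowing logic to a bounded list and to an unbounded generator. It uses plain Python rather than any Beam API; `windowed_sums` is a hypothetical helper, and it assumes events arrive in timestamp order, which a real streaming system cannot assume.

```python
from itertools import count, islice
from typing import Iterable, Iterator, Tuple


def windowed_sums(events: Iterable[Tuple[int, int]], size: int) -> Iterator[Tuple[int, int]]:
    """Yield (window_start, sum) per fixed window (hypothetical helper, not Beam).

    Assumes (timestamp, value) pairs arrive in timestamp order.
    """
    current_start = None
    total = 0
    for timestamp, value in events:
        start = timestamp - timestamp % size
        if current_start is not None and start != current_start:
            # A later window has begun, so the previous one can be emitted.
            yield current_start, total
            total = 0
        current_start = start
        total += value
    if current_start is not None:
        # Only reached when the input is bounded and has been exhausted.
        yield current_start, total


# Bounded input: a finite list.
bounded = [(0, 1), (1, 2), (5, 3), (11, 4)]
bounded_result = list(windowed_sums(bounded, size=5))

# Unbounded input: an endless generator; take only the first three windows.
unbounded = ((t, 1) for t in count())
unbounded_prefix = list(islice(windowed_sums(unbounded, size=5), 3))
```

The function never asks whether it is running in "batch" or "streaming" mode; it only sees the shape of the data it is handed.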
+### Make efficient things easy, rather than make easy things efficient
+Don’t prevent efficiency for ease of use. Design APIs that provide the information necessary
for efficiently executing at scale. Provide class hierarchies and wrappers to make the common
cases simpler.
+## Usability
+### Validate Early
+Validate constraints on graph shape, runner requirements, etc., as early as reasonably possible in the
compile time - construction time - submission time - execution time spectrum, in order to provide a
smoother user experience.
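
One way to picture validation at construction time is a graph builder that rejects a malformed graph as soon as a step is added, rather than when the job runs. This is a minimal sketch; `PipelineSketch` and its methods are hypothetical and are not part of any Beam SDK.

```python
class PipelineSketch:
    """Hypothetical toy graph builder; not the Beam API."""

    def __init__(self):
        # Maps each step name to the name of its input step (or None).
        self._steps = {}

    def add_step(self, name, input_name=None):
        # Check graph shape while the graph is being built, not at execution.
        if name in self._steps:
            raise ValueError(f"duplicate step name: {name!r}")
        if input_name is not None and input_name not in self._steps:
            raise ValueError(
                f"step {name!r} reads from unknown step {input_name!r}"
            )
        self._steps[name] = input_name
        return self

    def step_names(self):
        return list(self._steps)


pipeline = PipelineSketch().add_step("read").add_step("parse", input_name="read")

try:
    # The misspelled input is caught here, before anything is submitted or run.
    pipeline.add_step("write", input_name="pasre")
    construction_error = None
except ValueError as err:
    construction_error = str(err)
```

The failed call leaves the graph unchanged, so the user can correct the typo and continue building.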
+### Public APIs, like diamonds, are forever (at least until the next major version)
+Backwards incompatible changes can only be made in the next major version. Because of the
burden major versions place on users (code has to be modified, conflicting dependency nightmares,
etc), we aim to do this infrequently. Clearly mark APIs that are considered experimental (may
change at any point) and deprecated (will be removed in the next major version). Consider
which APIs are more amenable to future changes (abstract classes vs. interfaces, etc.).
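
A sketch of what "clearly mark" could look like in a Python codebase follows. The `deprecated` and `experimental` decorators and the parsing functions are hypothetical names chosen for illustration; they are not taken from a Beam SDK.

```python
import functools
import warnings


def deprecated(replacement):
    """Mark a function as deprecated and point callers at its replacement."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated; use {replacement}",
                DeprecationWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)
        return wrapper
    return decorate


def experimental(func):
    """Tag a function as experimental so tools and docs can surface it."""
    func.__experimental__ = True
    return func


def parse_records(text):
    return [line for line in text.splitlines() if line]


@deprecated(replacement="parse_records")
def parse_lines(text):
    # Kept so existing callers work until the next major version.
    return parse_records(text)


@experimental
def parse_records_lazily(text):
    return (line for line in text.splitlines() if line)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_result = parse_lines("a\n\nb")
```

The old entry point keeps working but announces its removal, which gives users a full major-version cycle to migrate.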
+### Examples should be pedagogical
+Canonical examples help people ingrain the principles. Design examples that teach complex
concepts in modular chunks. If you can’t explain the concept easily, then the API isn’t
right. Examples should withstand random copy-pasting. 
+## Extensibility
+### Use PTransforms for modularity
+Composite transformations (transformations formed by a subgraph of other transformations)
are treated as first class objects. They can be named and applied directly in any pipeline
to nicely encapsulate concepts. This removes the artificial separation between those built
into PCollection and those provided by users. In addition, PTransforms can be used as a clear
concept in graphical monitoring and provide a way to scope metadata like aggregators, logging,
and resources. Use these when building pipelines.
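
The idea of a composite as a first-class, named transform can be sketched with a toy model in plain Python. The classes below (`Transform`, `Map`, `Filter`, `Composite`) are hypothetical stand-ins operating on lists, not the Beam `PTransform` API; the nested names hint at how a monitoring UI might scope metadata.

```python
class Transform:
    """Hypothetical base class: a named step that maps a list to a list."""

    def __init__(self, name):
        self.name = name

    def apply(self, items):
        raise NotImplementedError

    def describe(self, prefix=""):
        return [prefix + self.name]


class Map(Transform):
    def __init__(self, name, fn):
        super().__init__(name)
        self.fn = fn

    def apply(self, items):
        return [self.fn(item) for item in items]


class Filter(Transform):
    def __init__(self, name, predicate):
        super().__init__(name)
        self.predicate = predicate

    def apply(self, items):
        return [item for item in items if self.predicate(item)]


class Composite(Transform):
    """A transform built from a chain of other transforms."""

    def __init__(self, name, *parts):
        super().__init__(name)
        self.parts = parts

    def apply(self, items):
        for part in self.parts:
            items = part.apply(items)
        return items

    def describe(self, prefix=""):
        # Nested names give each inner step a scope under the composite.
        path = prefix + self.name
        lines = [path]
        for part in self.parts:
            lines.extend(part.describe(path + "/"))
        return lines


normalize = Composite(
    "NormalizeWords",
    Map("Strip", str.strip),
    Map("Lower", str.lower),
    Filter("DropEmpty", bool),
)
result = normalize.apply([" Beam ", "", "RUNNER"])
names = normalize.describe()
```

Because `Composite` is itself a `Transform`, a user-defined composite is applied exactly like a primitive one and can be nested inside further composites.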
+### Keep Beam SDKs consistent
+Beam SDKs should expose the complete set of concepts in the programming model. They should
all use the same set of abstractions and be able to share conceptual documentation.
+### When in ~~Rome~~ Python, do as the ~~Romans~~ Pythonians do
+Each SDK must feel right to those who live and breathe that language. Adapt the general Beam
concepts into language-dependent styles when the benefits clearly outweigh the drawbacks.
+### Encourage DSLs  
+Many use cases or user communities can be served by providing ‘wrapper’ SDKs that provide
a simpler or domain-specific set of abstractions that then build on a Beam SDK and take advantage
of Beam Runners.
+### Design for the model, not specific runners
+The Beam APIs should serve all runners. Behind every runner-specific hook, there is a general
principle in the model. Design APIs that generalize across multiple runners.
