beam-commits mailing list archives

From "Stephen Sisk (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-1025) User guide - "How to create Beam IO Transforms"
Date Sun, 26 Feb 2017 19:12:46 GMT

    [ https://issues.apache.org/jira/browse/BEAM-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884871#comment-15884871 ]

Stephen Sisk commented on BEAM-1025:
------------------------------------

This has sat around a while and I'd like to get a first version up on the site. 

My plan right now is to combine the two docs I've got:
https://docs.google.com/document/d/1nGGP2sLb5fLamB_dnkHVHC8BVjDD_SE46mQPIPkK5cQ/edit#
https://docs.google.com/document/d/153J9jPQhMCNi_eBzJfhAg-NprQ7vbf1jNVRgdqeEE8I/edit#

I'll put the combined contents into a separate guide called "Authoring & Testing IO Guide"
that will live alongside the Testing Guide and the PTransform Style Guide.

The reason for keeping this out of the IO programming guide and the Testing Guide is that
we don't expect typical users to write IO transforms, and much of our advice is about
making IO transforms reusable, so it belongs more in the "Contribute" section.

I'm working on a PR for this now and hope to send it out by mid-week.

> User guide - "How to create Beam IO Transforms"
> -----------------------------------------------
>
>                 Key: BEAM-1025
>                 URL: https://issues.apache.org/jira/browse/BEAM-1025
>             Project: Beam
>          Issue Type: Task
>          Components: website
>            Reporter: Stephen Sisk
>            Assignee: Stephen Sisk
>
> Beam has javadocs for how to create a read or write transform, but no friendly user guide
> on how to get started using BoundedSource/BoundedReader.
> This should cover:
> * background on Beam's source/sink API design
> * design patterns
> * evaluating different data sources (e.g., what are the properties of a pub/sub system
>   that affect how you should write your UnboundedSource? What is the best design for reading
>   from a NoSQL-style source?)
> * testing - how to write unit and integration tests (and, once we have them, performance tests)
> * public API recommendations
> This is related to, but does not strictly overlap with:
> https://issues.apache.org/jira/browse/BEAM-193
> - the Dataflow SDK documentation for "Custom Sources and Sinks" contains some info about
>   writing Sources/Sinks, but it is somewhat out of date, and doesn't reflect the things we've
>   learned recently.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
