flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chen Qin ...@uber.com>
Subject Re: expose side output stream
Date Sat, 13 Aug 2016 18:40:14 GMT
Stephan & Aljoscha,

What I did was a API hack without much thoughts into architect at beginning
:-)
patch attached, I think it would be "tricky" to achieve backward
compatibility with current architect.
We might need to encapsulate "Collectors" to ProcessContext as starting
point.

I would try to work on a FLIP next week.

Best Regards,
Chen


On Fri, Aug 12, 2016 at 2:13 AM, Aljoscha Krettek <aljoscha@apache.org>
wrote:

> Hi Chen,
> could you maybe share the code that you have so far?
>
> If you wan't you can start a google doc and then we can work together on
> fleshing out an API/implementation that we can present to the Flink
> community as a FLIP.
>
> Cheers,
> Aljoscha
>
> On Thu, 11 Aug 2016 at 14:40 Stephan Ewen <sewen@apache.org> wrote:
>
> > Hi!
> >
> > This is a very big change, both on the semantics, the runtime classes.
> > These changes are tricky to get in, and usually work best if you document
> > the changes and all implications well.
> >
> > Something like a deep design doc, or a FLIP would be great for this.
> >
> > https://cwiki.apache.org/confluence/display/FLINK/
> Flink+Improvement+Proposals
> >
> > Greetings,
> > Stephan
> >
> >
> > On Thu, Aug 11, 2016 at 12:41 AM, Chen Qin <qinnchen@gmail.com> wrote:
> >
> > > Hi there,
> > >
> > > I am thinking of implement sideOutput into Flink which seems missing
> > > support.
> > > https://cloud.google.com/dataflow/model/par-do#side-outputs
> > >
> > > It is useful because it will help pipeline author redirect corrputed
> > input/
> > > code bug to a side stream or write to a table and reconsile afterwards.
> > >
> > > After some hack prototyping, I were able to get it works for simple
> > tests.
> > > Basically, It allows env to register a side output typeInfo which will
> be
> > > passed to configurations during graph building; Adding a new transform
> > > which similar to selection transform but holding different input type;
> > > StreamEdge will has a boolean to see if that is side output edge, if
> so,
> > > create output writer loads side output type serializer and emit record
> > only
> > > when sideOutput is called.
> > >
> > > I have some problem passing side output type as template to each data
> > > stream. It means it will have to expose any output stream with two type
> > > parameters. As you can imagine, the API interface change will be
> sizable.
> > >
> > > Any suggestion?
> > >
> > > Chen
> > >
> >
>



-- 
-Chen Qin

Mime
  • Unnamed multipart/mixed (inline, None, 0 bytes)
View raw message