apex-dev mailing list archives

From Atri Sharma <atri.j...@gmail.com>
Subject Re: Lineage support on apex
Date Mon, 25 Apr 2016 14:34:13 GMT
+1

I like this feature

On Mon, Apr 25, 2016 at 7:52 PM, Amol Kekre <amol@datatorrent.com> wrote:

> This is very valuable. I have heard the following feature sets from
> customers.
>
> - Ability to spool to hdfs (or any DFS interface)
> - Ability to pick and choose the tuple, i.e. not every tuple may need to be
> tracked
> - Minimal performance hit
> - The current api remains as is
> - Ability to get the content based on tuple-id
>
> Apex should enable this with minimal or no coding from users
>
> Thks,
> Amol
>
>
> On Mon, Apr 25, 2016 at 12:00 AM, Ashwin Chandra Putta <ashwinchandrap@gmail.com> wrote:
>
> > Hi All,
> >
> > I have heard of a few use cases where lineage support is asked for. On
> > apex, it seems to be an ask for the ability to uniquely track each tuple as
> > it flows through the DAG. It further boils down to being able to track
> > every tuple going into each operator and the corresponding tuple going out
> > of the operator. Here is a quick list I put together to describe some
> > requirements for lineage support on apex. Please feel free to improve or
> > add to it. Also, please respond with ideas on how we can solve this on the
> > apex platform.
> >
> > When lineage is enabled,
> > 1. We should be able to track each tuple as it enters and exits an
> > operator, e.g. enrichment.
> > 2. We should be able to track all the tuples that contributed to a tuple
> > that is emitted, e.g. dimensions computation.
> > 3. We should be able to track all the tuples that contributed to all the
> > tuples emitted by the operator, e.g. join?
> >
> > --
> >
> > Regards,
> > Ashwin.
> >
>
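
To make the requirements quoted above concrete, here is a minimal sketch of what per-tuple lineage bookkeeping could look like. It is plain Java and does not use any Apex API: `LineageStore`, `record`, `contentOf`, `directParents` and `ancestors` are hypothetical names invented for illustration, and the store is in-memory only (the spool-to-DFS and selective-tracking asks are not covered). Each emitted tuple gets an id and remembers the ids of the tuples that contributed to it, which covers the one-to-one case (enrichment) and the many-to-one case (dimensions computation). A join would call `record` once per emitted tuple.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory lineage store; an illustration only, not part of Apex.
final class LineageStore {
    private final Map<Long, List<Long>> parents = new HashMap<>();
    private final Map<Long, Object> contents = new HashMap<>();
    private long nextId = 1;

    // Registers a tuple together with the ids of the tuples that contributed to it.
    long record(Object content, long... parentIds) {
        long id = nextId++;
        List<Long> list = new ArrayList<>();
        for (long p : parentIds) {
            list.add(p);
        }
        contents.put(id, content);
        parents.put(id, list);
        return id;
    }

    // Looks up tuple content by tuple id; returns null for an unknown id.
    Object contentOf(long id) {
        return contents.get(id);
    }

    // Ids of the tuples that directly produced the given tuple.
    List<Long> directParents(long id) {
        return Collections.unmodifiableList(
                parents.getOrDefault(id, Collections.emptyList()));
    }

    // Ids of every upstream tuple, found by walking parent links depth-first.
    Set<Long> ancestors(long id) {
        Set<Long> seen = new LinkedHashSet<>();
        collect(id, seen);
        return seen;
    }

    private void collect(long id, Set<Long> seen) {
        for (Long p : parents.getOrDefault(id, Collections.emptyList())) {
            if (seen.add(p)) {
                collect(p, seen);
            }
        }
    }
}

final class LineageDemo {
    public static void main(String[] args) {
        LineageStore store = new LineageStore();
        // Two raw input tuples with no parents.
        long raw1 = store.record("click user=7");
        long raw2 = store.record("click user=7");
        // Enrichment: one tuple in, one tuple out.
        long enriched1 = store.record("click user=7 country=IN", raw1);
        long enriched2 = store.record("click user=7 country=IN", raw2);
        // Aggregation: several tuples contribute to one emitted tuple.
        long total = store.record("clicks(user=7)=2", enriched1, enriched2);
        System.out.println("parents of total:   " + store.directParents(total));
        System.out.println("ancestors of total: " + store.ancestors(total));
        System.out.println("content of raw1:    " + store.contentOf(raw1));
    }
}
```

Storing only parent ids per tuple keeps the write path cheap. The full upstream set is computed on demand by `ancestors`, which fits the "minimal performance hit" ask better than materializing complete lineage on every emit.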



-- 
Regards,

Atri
*l'apprenant*
