apex-dev mailing list archives

From "York, Brennon" <Brennon.Y...@capitalone.com>
Subject RE: Simple Operators within Malhar (MLHR-1914)
Date Thu, 10 Dec 2015 06:44:59 GMT
I see the goals as twofold.

First, to abstract away what an app developer needs to write to be successful (i.e. input and
output ports) and to provide a common interface for accessing them (i.e. all input and output
ports are titled "input" and "output" respectively).

Second, to use these sets of operator processing primitives (i.e. one to one, one to many, many
to one, and many to many) to build a suite of functional operators such as 'map', 'reduce',
'groupBy', etc. to, again, abstract away what is necessary for the developer to write a solid
Apex application.
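
To make the second goal concrete, here is a minimal sketch of 'map' and 'filter' layered on a one-to-one primitive. Everything in it is hypothetical: the class names (`OneToOneSketch`, `MapSketch`, `FilterSketch`) are invented for illustration, plain `Consumer` objects stand in for Apex ports, and treating a null result as "emit nothing" is an assumption rather than anything the proposal specifies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical one-to-one primitive: one port named "input", one named "output".
// Plain Java stand-in; it does not use the Apex port classes.
abstract class OneToOneSketch<I, O> {
  // Downstream sink standing in for the "output" port.
  private Consumer<O> output = t -> { };

  void setOutput(Consumer<O> sink) {
    this.output = sink;
  }

  // The only method a developer would implement.
  protected abstract O process(I tuple);

  // Stand-in for the "input" port: run process() and emit the result.
  // Assumption: a null result means "emit nothing".
  final void input(I tuple) {
    O result = process(tuple);
    if (result != null) {
      output.accept(result);
    }
  }
}

// 'map' expressed on top of the one-to-one primitive.
class MapSketch<I, O> extends OneToOneSketch<I, O> {
  private final Function<I, O> fn;

  MapSketch(Function<I, O> fn) {
    this.fn = fn;
  }

  @Override
  protected O process(I tuple) {
    return fn.apply(tuple);
  }
}

// 'filter' expressed on the same primitive, using the null convention above.
class FilterSketch<T> extends OneToOneSketch<T, T> {
  private final Predicate<T> keep;

  FilterSketch(Predicate<T> keep) {
    this.keep = keep;
  }

  @Override
  protected T process(T tuple) {
    return keep.test(tuple) ? tuple : null;
  }
}

class FunctionalOperatorDemo {
  // Wires doubler -> filter -> results and pushes the given tuples through.
  static List<Integer> run(List<Integer> tuples) {
    List<Integer> results = new ArrayList<>();
    MapSketch<Integer, Integer> doubler = new MapSketch<>(x -> x * 2);
    FilterSketch<Integer> big = new FilterSketch<>(x -> x > 4);
    doubler.setOutput(big::input);
    big.setOutput(results::add);
    tuples.forEach(doubler::input);
    return results;
  }

  public static void main(String[] args) {
    System.out.println(run(Arrays.asList(1, 2, 3, 4))); // prints [6, 8]
  }
}
```

The point of the sketch is only that the developer-facing surface shrinks to a function or predicate once the ports are standardized.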

I see this benefiting the community as a whole in that it allows Apex to build higher level
tools and operators that ease the burden on the application developer. It is great that someone
can define their own input and output ports, but is it necessary in 90% of the cases? I know
personally from applications we've built that having these design patterns makes things easier;
we internally developed versions of SingleInputOutput and SingleInputMultiOutput exactly
for that reason.

Does that answer the question and/or clear things up? Happy to discuss further :)



-----Original Message-----
From: Thomas Weise [thomas@datatorrent.com]
Sent: Wednesday, December 09, 2015 07:30 PM Eastern Standard Time
To: dev@apex.incubator.apache.org
Subject: Re: Simple Operators within Malhar (MLHR-1914)


Hi Brennon,

What is the goal here? Make it easier for someone to build an application
or make it easier to write an operator or both? Is this for custom operator
development?

Siyuan was also looking at the higher level API, from an application
developer's perspective.

For the application developer, it should not matter how the operators were
written, and how they are connected should be hidden by the API. That will be
important since we already have many operators that we want to reuse, such
as join or the adapters.

Thomas


On Wed, Dec 9, 2015 at 1:42 PM, York, Brennon <Brennon.York@capitalone.com>
wrote:

> All, I’ve been working on the JIRA ticket MLHR-1914 (at
> https://malhar.atlassian.net/projects/MLHR/issues/MLHR-1914?filter=allopenissues)
> and I wanted to shoot this out to describe what I’ve been doing and get
> feedback now that it’s in a state that we can discuss ;)
>
> Before going into depth here is the code on my local repo:
>
> https://github.com/brennonyork/incubator-apex-malhar/tree/MLHR-1914/library/src/main/java/com/datatorrent/lib/complex
>
> https://github.com/brennonyork/incubator-apex-malhar/tree/MLHR-1914/library/src/main/java/com/datatorrent/lib/simple
> The tests are in the same respective test directory.
>
> So, the biggest impetus for this JIRA is that there should be a set of
> operators that 1. standardize the input and output ports and 2. make it
> very simple for a developer to merely implement a process method and forget
> the rest. Given all of this I found that there were two sets of operators
> based on the complexity of ports and how they mapped to each other. I gave
> them the package names ‘simple’ and ‘complex’ for lack of a better idea at
> the time. Feel free to propose something better :)
>
> Under ‘simple’ are three operators:
>
>  *   SingleInputOutput: This abstracts the input and output port (defined
> as ‘input’ and ‘output’) and merely allows a user to implement a process
> method.
>  *   SingleInputMultiOutput: Like above, but the return value from the
> ‘process’ method is emitted to N output ports where N defaults to 2.
>  *   MultiInputSingleOutput: N inputs are mapped into a single ‘process’
> method with a single output port with N defaulting to 2.
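
The three 'simple' shapes described above could look roughly like the following. This is a sketch, not the code in the linked branch: the `...Sketch` classes are hypothetical stand-ins, ports are modelled as plain `Consumer` objects instead of Apex ports, and the `process` signatures (in particular that the multi-input variant does not see the port index) are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Stand-in for SingleInputOutput: one "input", one "output", user writes process().
abstract class SingleInputOutputSketch<I, O> {
  Consumer<O> output = t -> { };

  protected abstract O process(I tuple);

  final void input(I tuple) {
    output.accept(process(tuple));
  }
}

// Stand-in for SingleInputMultiOutput: the same return value goes to all N outputs.
abstract class SingleInputMultiOutputSketch<I, O> {
  final List<Consumer<O>> outputs = new ArrayList<>();

  SingleInputMultiOutputSketch(int n) {
    for (int i = 0; i < n; i++) {
      outputs.add(t -> { });
    }
  }

  SingleInputMultiOutputSketch() {
    this(2); // N defaults to 2
  }

  protected abstract O process(I tuple);

  final void input(I tuple) {
    O result = process(tuple);
    for (Consumer<O> out : outputs) {
      out.accept(result);
    }
  }
}

// Stand-in for MultiInputSingleOutput: N inputs funnel into one process(), one output.
abstract class MultiInputSingleOutputSketch<I, O> {
  final int inputCount;
  Consumer<O> output = t -> { };

  MultiInputSingleOutputSketch(int n) {
    this.inputCount = n;
  }

  MultiInputSingleOutputSketch() {
    this(2); // N defaults to 2
  }

  protected abstract O process(I tuple);

  // Assumption: process() is not told which input port the tuple arrived on.
  final void input(int port, I tuple) {
    if (port < 0 || port >= inputCount) {
      throw new IndexOutOfBoundsException("no input port " + port);
    }
    output.accept(process(tuple));
  }
}

class SimpleOperatorDemo {
  // Chains merge -> plusOne -> fanOut and records what each output port receives.
  static List<String> run() {
    List<String> log = new ArrayList<>();
    MultiInputSingleOutputSketch<Integer, Integer> merge =
        new MultiInputSingleOutputSketch<Integer, Integer>() {
          @Override protected Integer process(Integer t) { return t * 10; }
        };
    SingleInputOutputSketch<Integer, Integer> plusOne =
        new SingleInputOutputSketch<Integer, Integer>() {
          @Override protected Integer process(Integer t) { return t + 1; }
        };
    SingleInputMultiOutputSketch<Integer, String> fanOut =
        new SingleInputMultiOutputSketch<Integer, String>() {
          @Override protected String process(Integer t) { return "v" + t; }
        };
    merge.output = plusOne::input;
    plusOne.output = fanOut::input;
    fanOut.outputs.set(0, s -> log.add("out0:" + s));
    fanOut.outputs.set(1, s -> log.add("out1:" + s));
    merge.input(0, 1);
    merge.input(1, 2);
    return log;
  }

  public static void main(String[] args) {
    System.out.println(run()); // prints [out0:v11, out1:v11, out0:v21, out1:v21]
  }
}
```

In each case the subclass supplies only `process`; port creation and emission live in the base class.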
>
> Under ‘complex’ are four operators:
>
>  *   SingleInputListOutput: a single input port and ‘process’ method where
> the return value of the ‘process’ method is a list of values, with each
> value in the list matching one of the N output ports and N defaulting to 2.
>  *   DirectMultiInputOutput: This maps N inputs to N outputs processed
> under a single ‘process’ method with N defaulting to 2.
>  *   AllWayMultiInputOutput: maps N inputs to M outputs such that, for
> each input, the ‘process’ method is called and its return value is sent to
> each of the M output ports, with M and N defaulting to 2.
>  *   AllWayMultiInputListOutput: like above except that, instead of having
> the ‘process’ method return value emit to each of the M output ports, the
> return value from ‘process’ is a list with each element in the list
> emitting to a different output port. Concretely, v[0] => O[0], v[1] =>
> O[1], etc. where v[] is the array of values from the ‘process’ method and
> O[] is the array of output ports.
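
The list-output routing (v[0] => O[0], v[1] => O[1], ...) can be sketched as below. `ListOutputSketch` is a hypothetical stand-in that uses `Consumer` objects in place of Apex ports. Throwing when the list length differs from the number of output ports is only one possible policy, chosen here for illustration; the message does not say how that case is handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

// Stand-in for the list-output operators: process() returns a list v,
// and v[i] is emitted on output port O[i].
abstract class ListOutputSketch<I, O> {
  final List<Consumer<O>> outputs = new ArrayList<>();

  ListOutputSketch(int m) {
    for (int i = 0; i < m; i++) {
      outputs.add(t -> { });
    }
  }

  protected abstract List<O> process(I tuple);

  // A multi-input variant would route each of its N input ports to this method.
  final void input(I tuple) {
    List<O> v = process(tuple);
    // Assumed policy: reject a result whose length differs from the port count.
    if (v.size() != outputs.size()) {
      throw new IllegalStateException(
          "process returned " + v.size() + " values for " + outputs.size() + " output ports");
    }
    for (int i = 0; i < v.size(); i++) {
      outputs.get(i).accept(v.get(i));
    }
  }
}

class ListOutputDemo {
  // Splits a number into tens and units, one per output port.
  static List<String> run(int tuple) {
    List<String> log = new ArrayList<>();
    ListOutputSketch<Integer, Integer> split = new ListOutputSketch<Integer, Integer>(2) {
      @Override protected List<Integer> process(Integer t) {
        return Arrays.asList(t / 10, t % 10);
      }
    };
    split.outputs.set(0, x -> log.add("O[0]=" + x));
    split.outputs.set(1, x -> log.add("O[1]=" + x));
    split.input(tuple);
    return log;
  }

  // Returns true when a one-element result is rejected by a two-port operator.
  static boolean mismatchRejected() {
    ListOutputSketch<Integer, Integer> bad = new ListOutputSketch<Integer, Integer>(2) {
      @Override protected List<Integer> process(Integer t) {
        return Collections.singletonList(t);
      }
    };
    try {
      bad.input(1);
      return false;
    } catch (IllegalStateException expected) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(run(42));            // prints [O[0]=4, O[1]=2]
    System.out.println(mismatchRejected()); // prints true
  }
}
```

Other policies, such as emitting only the first min(v.len, O.len) values or padding with nulls, are equally expressible in `input`; failing fast just makes the mismatch visible during testing.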
>
> Like I said, I’m still working through the tests and error cases (say where
> v[].len != O[].len), although I’d love to get feedback on everything thus
> far! Also, I forgot to mention above that this work is closely related to,
> and will be the base of, MLHR-1915, whereby we can build higher level operators
> such as ‘map’, ‘filter’, ‘reduce’, ‘join’, etc. Thoughts?
> ________________________________________________________
>
> The information contained in this e-mail is confidential and/or
> proprietary to Capital One and/or its affiliates and may only be used
> solely in performance of work or services for Capital One. The information
> transmitted herewith is intended only for use by the individual or entity
> to which it is addressed. If the reader of this message is not the intended
> recipient, you are hereby notified that any review, retransmission,
> dissemination, distribution, copying or other use of, or taking of any
> action in reliance upon this information is strictly prohibited. If you
> have received this communication in error, please contact the sender and
> delete the material from your computer.
>