incubator-s4-dev mailing list archives

From "Matthieu Morel (Commented) (JIRA)" <>
Subject [jira] [Commented] (S4-22) Adaptor
Date Tue, 06 Dec 2011 15:59:39 GMT


Matthieu Morel commented on S4-22:

I don't understand the intended limitation on the symmetry between clusters.
My alternative view is that we have symmetry of apps within a cluster but not between clusters.

* Example:
To fuel the discussion, here is an example of a typical setup involving streams, logical clusters,
apps and physical nodes:

- Within the same logical cluster, orange and red apps communicate through the gray stream

PEs from the red app send data on the gray stream according to a partitioning function that
takes into account the number of nodes in the target cluster (in this case, the same cluster).

- Between clusters, green app communicates with orange app through the red stream. There is
no symmetry between those apps, because they belong to logical clusters of different sizes,
but we use the same mechanism for partitioning data (that info needs to be derivable from
the stream configuration).
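
As a rough illustration of the partitioning mechanism described above, here is a
minimal sketch in Java. It assumes the stream configuration exposes the number of
nodes in the target logical cluster, and it assumes a simple hash-modulo scheme.
StreamConfig, KeyPartitioner and partitionFor are hypothetical names made up for
this example; they are not S4 classes.

```java
/**
 * Hypothetical view of a stream's configuration. The field names are
 * illustrative and are not taken from the S4 code base.
 */
final class StreamConfig {
    final String name;
    final int targetPartitions; // node count of the target logical cluster

    StreamConfig(String name, int targetPartitions) {
        if (targetPartitions <= 0) {
            throw new IllegalArgumentException("targetPartitions must be positive");
        }
        this.name = name;
        this.targetPartitions = targetPartitions;
    }
}

/** Assumed partitioning function: hash of the event key modulo the target size. */
final class KeyPartitioner {
    private KeyPartitioner() {}

    /** Returns a partition index in [0, stream.targetPartitions). */
    static int partitionFor(String key, StreamConfig stream) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), stream.targetPartitions);
    }
}
```

The sender only needs the stream configuration: the same call works whether the
target is the sender's own cluster or a cluster of a different size, which is why
no symmetry between clusters is required in this sketch.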

* How does the adaptor fit there?

1. If the adaptor is an S4 app, simply configure the adaptor on a specific logical cluster.
It can consist of a single node, as in our Twitter example. But it may
also consist of several nodes, when possible, in order to distribute load and processing.
This is the same approach as in S4 pre-0.5 (app cluster and adapter cluster).

2. Send S4 events on a stream. Events will be distributed to the apps registered to that stream
on the target cluster.
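
The two steps above could look roughly like the following sketch. It is not the S4
API: Event, InMemoryStream and SketchAdaptor are hypothetical stand-ins, the
stream is modeled as in-memory lists (one per node of the target cluster), and
hash-modulo partitioning is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical keyed event; not the S4 Event class. */
final class Event {
    final String key;
    final String payload;

    Event(String key, String payload) {
        this.key = key;
        this.payload = payload;
    }
}

/** Stand-in for a stream bound to a target cluster: one list per target node. */
final class InMemoryStream {
    private final List<List<Event>> partitions = new ArrayList<>();

    InMemoryStream(int targetPartitions) {
        for (int i = 0; i < targetPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    /** Routes the event to a partition chosen from its key and the target size. */
    void put(Event event) {
        int index = Math.floorMod(event.key.hashCode(), partitions.size());
        partitions.get(index).add(event);
    }

    List<Event> partition(int index) {
        return partitions.get(index);
    }
}

/** Adaptor sketch: converts raw external input into events and sends them on a stream. */
final class SketchAdaptor {
    private final Function<String, Event> converter; // pluggable input format
    private final InMemoryStream output;

    SketchAdaptor(Function<String, Event> converter, InMemoryStream output) {
        this.converter = converter;
        this.output = output;
    }

    void accept(String rawInput) {
        output.put(converter.apply(rawInput));
    }
}
```

For example, an adaptor built with a converter that splits "key:payload" lines
and an InMemoryStream of three partitions would place each incoming line on the
partition derived from its key. Running the adaptor on several nodes would not
change the routing, since the routing depends only on the target stream.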

* Bottom line: I don't think we need symmetry between clusters, but we must be able to retrieve
or rebuild a partitioning scheme from the stream configuration.

> Adaptor
> -------
>                 Key: S4-22
>                 URL:
>             Project: Apache S4
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Leo Neumeyer
>            Assignee: Bruce Robbins
>             Fix For: 0.5
>         Attachments: s4-subclusters.pdf
> Need an adaptor for v0.5
> Idea I posted earlier:
> What do you think of this idea for a simple adaptor:
> - Adaptor extends App
> - Adaptor can send events but not receive (for now)
> - Adaptor is deployed as a regular App to the S4 cluster and as an
> Adaptor type  in a host (separate from the S4 cluster).
> - Adaptor, unlike regular apps, can accept event data (in any format)
> directly, not via comm layer.
> - Input data is transformed into S4 events using a modular approach
> and by providing standard modules such as JSON.
> - Output events are exposed using EventSource and consumed by other
> apps without even knowing that they are Adaptors (only the App type is
> exposed in the cluster).
> - S4 events can be processed locally using PEs and Streams as usual.
> (We kind of need to get a local Sender for the local PEs and a
> standard cluster Sender for the EventSource object.)
> So why this approach?
> The GOOD:
> - Seems to be the least disruptive way to inject external events
> - Apps can easily consume the events in a modular way without any
> dependencies. Getting events from an adaptor or from another app is
> identical.
> - The adaptor would be packaged and deployed to the cluster as if it
> was an App (no incremental cost)
> - The adaptor can do preprocessing using the same programming model
> and can reuse PEs.
> The BAD:
> -  We need to also deploy the Adaptor in a separate host. On the other
> hand, this is inevitable. At least we use the same approach instead of
> creating a different system.
> -  The Adaptor will need to be integrated with ZK to get the physical addresses.
> -  We need to deal with two senders.
> for later: two-way communication and adapter clusters.
> thoughts?

This message is automatically generated by JIRA.