incubator-s4-dev mailing list archives

From "Leo Neumeyer (Commented) (JIRA)" <>
Subject [jira] [Commented] (S4-22) Adaptor
Date Tue, 29 Nov 2011 21:13:40 GMT


Leo Neumeyer commented on S4-22:

Subcluster definition: a group of nodes over which we distribute an S4 application and for which
we establish dependencies between applications across clusters. (If there were no dependencies,
the subclusters would simply be separate clusters.)

The key here is how to establish inter-app communication across clusters. For example, an
EventSource (ES) API is made available in a Twitter preprocessing cluster. When configured, the
app that hosts the ES needs to know how to talk to, for example, the SmatApp app hosted on a
different subcluster. If nodes are symmetric, this can be done very easily without introducing
any changes to the current API or to the apps. In fact, the app developer shouldn't have to know
what the final configuration will be (single node, one cluster, two subclusters, etc.).

If nodes were not symmetric across clusters, we would need a different approach. That's why
Bruce and I concluded that a fully symmetric cluster was the better approach. There is very
little downside and it's simpler: all nodes are still identical. In the future we could relax
this requirement, but it doesn't seem worth complicating things now.

The main challenge now is to support multiple senders, one for each subcluster. Events would
be handed to each sender, and each sender would decide whether the event must be transmitted and to what
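
A rough sketch of the multiple-sender idea, assuming each sender is bound to one subcluster, knows which streams that subcluster consumes, and picks a partition by hashing the event key. All class and method names below (Event, SubclusterSender, Dispatcher, offer, dispatch) are hypothetical and are not taken from the S4 code base; the partitioning rule is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// All names here are hypothetical; none come from the S4 code base.
class SubclusterSenderDemo {
    public static void main(String[] args) {
        SubclusterSender pre = new SubclusterSender("preprocessing", Set.of("rawTweets"), 4);
        SubclusterSender apps = new SubclusterSender("apps", Set.of("cleanTweets"), 2);
        Dispatcher dispatcher = new Dispatcher(List.of(pre, apps));

        // The event is handed to every sender; only "apps" consumes this stream.
        int accepted = dispatcher.dispatch(new Event("cleanTweets", "user42"));
        System.out.println("senders that transmitted: " + accepted);
        System.out.println("apps subcluster log: " + apps.sent());
    }
}

/** Minimal event: a stream name plus a key used for partitioning. */
final class Event {
    final String stream;
    final String key;

    Event(String stream, String key) {
        this.stream = stream;
        this.key = key;
    }
}

/** One sender per subcluster; it decides on its own whether to transmit. */
final class SubclusterSender {
    private final String subcluster;
    private final Set<String> streams;   // streams consumed in this subcluster (assumed)
    private final int partitions;        // number of partitions in this subcluster (assumed)
    private final List<String> sent = new ArrayList<>();

    SubclusterSender(String subcluster, Set<String> streams, int partitions) {
        this.subcluster = subcluster;
        this.streams = streams;
        this.partitions = partitions;
    }

    /** Returns true if the event was "transmitted" (here: recorded in a log). */
    boolean offer(Event event) {
        if (!streams.contains(event.stream)) {
            return false; // nothing in this subcluster wants the event
        }
        int partition = Math.floorMod(event.key.hashCode(), partitions);
        sent.add(subcluster + "#" + partition + " <- " + event.stream + "/" + event.key);
        return true;
    }

    List<String> sent() {
        return sent;
    }
}

/** Hands every event to every sender and counts how many transmitted it. */
final class Dispatcher {
    private final List<SubclusterSender> senders;

    Dispatcher(List<SubclusterSender> senders) {
        this.senders = senders;
    }

    int dispatch(Event event) {
        int accepted = 0;
        for (SubclusterSender sender : senders) {
            if (sender.offer(event)) {
                accepted++;
            }
        }
        return accepted;
    }
}
```

Because the decision lives in the sender, the app code that emits the event stays the same whether there is one cluster or several subclusters, which is the property argued for above.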

> Adaptor
> -------
>                 Key: S4-22
>                 URL:
>             Project: Apache S4
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Leo Neumeyer
>            Assignee: Bruce Robbins
>             Fix For: 0.5
>         Attachments: s4-subclusters.pdf
> Need an adaptor for v0.5
> Idea I posted earlier:
> What do you think of this idea for a simple adaptor:
> - Adaptor extends App
> - Adaptor can send events but not receive (for now)
> - Adaptor is deployed as a regular App to the S4 cluster and as an
> Adaptor type in a host (separate from the S4 cluster).
> - Adaptor, unlike regular apps, can accept event data (in any format)
> directly, not via comm layer.
> - Input data is transformed into S4 events using a modular approach
> and by providing standard modules such as JSON.
> - Output events are exposed using EventSource and consumed by other
> apps without even knowing that they are Adaptors (only the App type is
> exposed in the cluster).
> - S4 events can be processed locally using PEs and Streams as usual.
> (We kind of need to get a local Sender for the local PEs and a
> standard cluster Sender for the EventSource object.)
> So why this approach?
> The GOOD:
> - Seems to be the least disruptive way to inject external events
> - Apps can easily consume the events in a modular way without any
> dependencies. Getting events from an adaptor or from another app is
> identical.
> - The adaptor would be packaged and deployed to the cluster as if it
> was an App (no incremental cost)
> - The adaptor can do preprocessing using the same programming model
> and can reuse PEs.
> The BAD:
> - We also need to deploy the Adaptor in a separate host. On the other
> hand, this is inevitable. At least we use the same approach instead of
> creating a different system.
> - The Adaptor will need to be integrated with ZK to get the physical addresses.
> - We need to deal with two senders.
> For later: two-way communication and adaptor clusters.
> Thoughts?
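
The adaptor idea quoted above could look roughly like the following. This is only an illustration under stated assumptions: InputParser, KeyValueParser, EventSourceSketch and AdaptorSketch are hypothetical names, not S4 classes, and a trivial "key=value;key=value" parser stands in for a real module such as JSON so that the example has no external dependencies.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// All names here are hypothetical; none come from the S4 code base.
class AdaptorSketchDemo {
    public static void main(String[] args) {
        EventSourceSketch source = new EventSourceSketch();
        // A consuming app subscribes without knowing the producer is an adaptor.
        source.subscribe(event -> System.out.println("consumer got " + event));

        AdaptorSketch adaptor = new AdaptorSketch(new KeyValueParser(), source);
        adaptor.accept("user=alice;text=hello"); // raw input, not via the comm layer
    }
}

/** Pluggable module that turns raw input into event fields. */
interface InputParser {
    Map<String, String> parse(String raw);
}

/** Stand-in for a standard module such as JSON: parses "k=v;k=v". */
final class KeyValueParser implements InputParser {
    @Override
    public Map<String, String> parse(String raw) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : raw.split(";")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2) {
                fields.put(kv[0].trim(), kv[1].trim());
            }
        }
        return fields;
    }
}

/** Output side: events are exposed to any subscriber. */
final class EventSourceSketch {
    private final List<Consumer<Map<String, String>>> subscribers = new ArrayList<>();

    void subscribe(Consumer<Map<String, String>> subscriber) {
        subscribers.add(subscriber);
    }

    void publish(Map<String, String> event) {
        for (Consumer<Map<String, String>> subscriber : subscribers) {
            subscriber.accept(event);
        }
    }
}

/** Accepts external data directly, converts it, and publishes the result. */
final class AdaptorSketch {
    private final InputParser parser;
    private final EventSourceSketch source;

    AdaptorSketch(InputParser parser, EventSourceSketch source) {
        this.parser = parser;
        this.source = source;
    }

    void accept(String raw) {
        source.publish(parser.parse(raw));
    }
}
```

Swapping KeyValueParser for another InputParser implementation changes the accepted input format without touching the consumers, which is the modularity the proposal asks for.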

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.

