incubator-s4-dev mailing list archives

From Leo Neumeyer <leoneume...@gmail.com>
Subject Re: Setting App id
Date Sat, 29 Oct 2011 20:10:03 GMT
Good points. See below.

On Sat, Oct 29, 2011 at 10:44 AM, Matthieu Morel <mmorel@apache.org> wrote:
> On Thu Oct 27 22:09:16 2011, Leo Neumeyer wrote:
>>
>> We need to decide how to assign unique app ids to loaded apps. App Id is
>> an int.
>
> Is int a requirement or an optimization?
>

Yes. The appid is written in every event so we should not use more
than 4 bytes. Here are some thoughts after reading your comments.

- Let's change the name to something like runtimeId. This id is set by
the deployment process and is unique for each cluster. The runtimeId
is reclaimed when the app is unloaded.
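
To make the allocate/reclaim idea concrete, here is a minimal in-memory sketch. The class name RuntimeIdAllocator and its methods are hypothetical and not part of S4; a real version would presumably keep this state in ZK rather than in a local object.

```java
import java.util.TreeSet;

/** Hypothetical in-memory allocator; a real deployment would keep this state in ZooKeeper. */
final class RuntimeIdAllocator {
    private final TreeSet<Integer> reclaimed = new TreeSet<>();
    private int next = 0;

    /** Returns the smallest reclaimed id if there is one, otherwise a fresh id. */
    synchronized int allocate() {
        Integer reused = reclaimed.pollFirst();
        if (reused != null) {
            return reused;
        }
        if (next == Integer.MAX_VALUE) {
            throw new IllegalStateException("runtimeId space exhausted");
        }
        return next++;
    }

    /** Called when an app is unloaded so that its id can be handed out again. */
    synchronized void release(int runtimeId) {
        // Reject ids that were never handed out or were already released.
        if (runtimeId < 0 || runtimeId >= next || !reclaimed.add(runtimeId)) {
            throw new IllegalArgumentException("runtimeId not currently allocated: " + runtimeId);
        }
    }
}
```

Reusing reclaimed ids keeps the values small and stays within the 4-byte budget even when apps are loaded and unloaded many times.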

- The RuntimeTable saves a record that includes the following properties:
  . Description - helpful when listing live apps.
  . Fully qualified class name of the App class.
  . userId (in a multi-tenant system)
  . Inter-app dependencies.
  . (We can keep adding more attributes over time.)

- To create inter-app dependencies, S4 needs to get the runtimeId from the
RuntimeTable (using ZK, of course). Let's say I deploy an app of class
AntispamApp and need to consume web clicks produced by an app of class
WebApp. Can we specify the dependency before deployment? I can provide
some information along with my AntispamApp.s4r file such as the class
name of my source: WebApp. If there is only one instance of WebApp and
I am authorized to consume, then there is no ambiguity and we can
deploy by looking up the runtimeId for the only instance of WebApp. If
there are multiple instances I can specify the userId, and other
properties I expect WebApp instance to have. If after checking all the
properties, we still have ambiguity, the deployment fails. This seems
like a pretty flexible scheme. In most cases the app class name will
probably suffice to uniquely identify the live app. We can easily add
more properties to the RuntimeTable as needed.
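
The lookup described above could be sketched roughly as follows. RuntimeRecord, RuntimeTable and resolve are hypothetical names, the table is a plain in-memory list standing in for the ZK-backed one, and the authorization check is left out to keep it short.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical RuntimeTable record; only the fields needed for lookup are shown. */
final class RuntimeRecord {
    final int runtimeId;
    final String appClassName;
    final String userId;

    RuntimeRecord(int runtimeId, String appClassName, String userId) {
        this.runtimeId = runtimeId;
        this.appClassName = appClassName;
        this.userId = userId;
    }
}

/** Hypothetical in-memory stand-in for the ZooKeeper-backed RuntimeTable. */
final class RuntimeTable {
    private final List<RuntimeRecord> records = new ArrayList<>();

    void register(RuntimeRecord record) {
        records.add(record);
    }

    /**
     * Resolves a dependency to exactly one live app. A null userId means
     * "no constraint on the owner". Throws if zero or several records match,
     * which corresponds to the deployment failing.
     */
    int resolve(String appClassName, String userId) {
        List<RuntimeRecord> matches = new ArrayList<>();
        for (RuntimeRecord r : records) {
            boolean classOk = r.appClassName.equals(appClassName);
            boolean userOk = userId == null || userId.equals(r.userId);
            if (classOk && userOk) {
                matches.add(r);
            }
        }
        if (matches.size() != 1) {
            throw new IllegalStateException(
                "deployment fails: " + matches.size() + " live apps match " + appClassName);
        }
        return matches.get(0).runtimeId;
    }
}
```

With a single WebApp registered, resolve("WebApp", null) returns its runtimeId; once a second WebApp is live, the same call throws until a userId (or, in a fuller version, other properties) narrows the match to one record.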

- The s4 archive should probably be generic, that is, it should not
include deployment-specific properties such as userId and dependent
apps (as Matthieu argues below). On the other hand, how do we
deploy the app? We could add a descriptor file (not nice to have 2
files though) or add attributes to the s4r before deployment (this adds
a step and transforms the file, which is confusing and error prone).
Any other options?

>>
>> Here are some thoughts.
>>
>> Not needed in local mode.
>>
>> Can we deploy the same s4r file for more than one owner or with diff
>> inputs?
>
> Why not? But this also adds the concept of tenancy or ownership to S4
> applications right?

Yes, let's assume that userId and some other property are used to
identify across tenants.

>>
>> If yes maybe some manifest properties should be set at deploy time. I
>> think this is a requirement because S4r may have different input streams but
>> otherwise apps can be identical.
>>
>> Should the id be set by deployer tool by setting a manifest property?
>
> Not sure the best way is to set a "manifest" property though, since this is
> part of the s4r package. Rather, the platform could compute an id for the
> app, and this id could stay in zookeeper, along with other runtime
> parameters for the application.
> We'll probably need a way to get the app id from the owner and app
> characteristics (name...).
>>
>>
>> How do we configure event source? By deployer tool or post deployment. We
>> should have all the info before deployment so it might make sense to do it
>> before and wiring is done during init. (if dependencies are not available,
>> the app will fail to start).
>
> Can you give more information about what you define as event source?
>

org.apache.s4.core.EventSource is an API that implements Streamable.
It is designed to allow apps to subscribe their streams to an event
source at runtime.

An app that wants to publish a stream must create an EventSource
object. Apps that want to consume the events must subscribe streams to
the EventSource.

In the previous example:

. WebApp must create an EventSource object.
. WebApp must call eventSource.put(Event e) to publish an event.
. AntispamApp needs to get events from an app of class WebApp.
. Because there is only one instance of WebApp, we can easily find the
runtimeId for WebApp via ZK in the RuntimeTable.
. The AntispamApp object asks the Server singleton to provide the reference
to WebApp (lookup by runtimeId, not implemented yet).
. AntispamApp gets the eventSource reference from the WebApp instance and
calls eventSource.subscribeStream(clickStream).
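
As an illustration of the publish/subscribe part of these steps, here is a self-contained sketch. The types below (SketchStreamable, SketchEventSource, RecordingStream) are stand-ins written for this example and are not the real org.apache.s4.core classes; only the method names put and subscribeStream are taken from the steps above, and the generic signatures are an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Stand-in for the Streamable contract; not the real S4 interface. */
interface SketchStreamable<T> {
    void put(T event);
}

/** Stand-in event source: forwards each published event to every subscribed stream. */
final class SketchEventSource<T> implements SketchStreamable<T> {
    // Copy-on-write so streams can subscribe while events are being published.
    private final List<SketchStreamable<T>> subscribers = new CopyOnWriteArrayList<>();

    void subscribeStream(SketchStreamable<T> stream) {
        subscribers.add(stream);
    }

    @Override
    public void put(T event) {
        for (SketchStreamable<T> stream : subscribers) {
            stream.put(event);
        }
    }
}

/** Stand-in consumer stream that simply records what it receives. */
final class RecordingStream<T> implements SketchStreamable<T> {
    final List<T> received = new ArrayList<>();

    @Override
    public void put(T event) {
        received.add(event);
    }
}
```

In terms of the example, WebApp would own a SketchEventSource and call put on it, while AntispamApp would pass its clickStream to subscribeStream once it has obtained the reference.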

To deal with an App that has multiple EventSource objects, the app
could implement getEventSource(Class eventClass) to return the
EventSource object that produces events of a specific type (the
downside is that this prevents having multiple sources using the same
eventType). The advantage of this approach is that AntispamApp
doesn't need to know WebApp at compile time.
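
A rough sketch of the getEventSource(Class eventClass) idea, assuming the app keeps one source per event class in a map. TypedSource and EventSourceRegistry are hypothetical names, and TypedSource is only a placeholder for a real event source.

```java
import java.util.HashMap;
import java.util.Map;

/** Placeholder for an event source; a real one would deliver events to subscribers. */
final class TypedSource<T> {
    final Class<T> eventClass;

    TypedSource(Class<T> eventClass) {
        this.eventClass = eventClass;
    }
}

/** Hypothetical app-side registry backing getEventSource(Class). */
final class EventSourceRegistry {
    private final Map<Class<?>, TypedSource<?>> byEventClass = new HashMap<>();

    /** Rejects a second source for the same event class (the downside of this scheme). */
    <T> void register(TypedSource<T> source) {
        if (byEventClass.putIfAbsent(source.eventClass, source) != null) {
            throw new IllegalStateException(
                "a source for " + source.eventClass.getName() + " already exists");
        }
    }

    /** Returns the source producing events of the given class, or null if there is none. */
    @SuppressWarnings("unchecked")
    <T> TypedSource<T> getEventSource(Class<T> eventClass) {
        // Safe cast: register() only stores a source under its own event class.
        return (TypedSource<T>) byEventClass.get(eventClass);
    }
}
```

The consumer only needs the event class to find the source, so it never has to reference the producing app's class directly.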

>
> Thanks!
>
> Matthieu
>



-- 

Leo Neumeyer (@leoneu)
