incubator-s4-dev mailing list archives

From Karthik Kambatla <kkamb...@cs.purdue.edu>
Subject Re: Field - Key - in Event?
Date Sat, 15 Oct 2011 05:38:56 GMT
Thanks for the clear explanation, Leo. It all makes perfect sense now.

I have another question now (yes, please bear with me). Given Streams form
the connection between two PEs, wouldn't it be nice to expose the Stream to
the underlying communication layer? That way, we might be able to support
higher throughput through batched sends whenever possible. All events on a
waiting Stream can be batched while the NIC is busy with other
Streams.
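
To make the batching idea concrete, here is a minimal sketch, assuming a per-stream buffer that the communication layer drains in one call. StreamBatcher, put and nextBatch are hypothetical names for illustration only; they are not part of S4 or s4-piper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper, not an S4 class: buffers the events of one stream
// so that a sender can pick them up in batches.
class StreamBatcher<T> {
    private final BlockingQueue<T> pending = new LinkedBlockingQueue<>();
    private final int maxBatch;

    StreamBatcher(int maxBatch) {
        this.maxBatch = maxBatch;
    }

    // Called by the PE side: enqueue an event without blocking.
    void put(T event) {
        pending.add(event);
    }

    // Called by the (assumed) communication layer when the link is free:
    // returns up to maxBatch queued events, in arrival order.
    List<T> nextBatch() {
        List<T> batch = new ArrayList<>();
        pending.drainTo(batch, maxBatch);
        return batch;
    }
}
```

Events that pile up while the sender is busy with another stream would go out together on the next nextBatch() call, which is where the throughput gain would come from.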

Thanks
Karthik

On Fri, Oct 14, 2011 at 6:16 PM, Leo Neumeyer <leoneumeyer@gmail.com> wrote:

> This is for S4 piper (future S4 v0.5).
>
> From the example:
>
> https://github.com/leoneu/s4-piper/blob/master/subprojects/s4-example/src/main/java/io/s4/example/counter/MyApp.java
>
> The basic elements of an app are:
>
> Stream
> ProcessingElement
> KeyFinder
> Event
>
> + Create a stream:
>
> Stream<CountEvent> userCountStream = createStream("User Count Stream",
>                new CountKeyFinder(), printPE);
>
> where:
>
> * CountEvent and its subtypes are the only types of objects that can
> be sent in this stream.
> * CountKeyFinder is a function that returns the value of the key in a
> CountEvent.
> * "User Count Stream" is the name of the stream. (very useful to
> identify the threads).
> * printPE is the target PE.
>
> + Create a PE:
>
> CounterPE userCountPE = createPE(CounterPE.class);
> userCountPE.setTrigger(Event.class, interval, 10L, TimeUnit.SECONDS);
> userCountPE.setCountStream(userCountStream);
>
> * To create a PE instance we provide the PE class as an argument.
> * We set some properties using setters.
> * In this case the userCountPE will put an event into userCountStream.
>
> We decided to put the Key in the stream because it is the connection
> between the source and the target. As you can see the KeyFinder is set
> when the app graph is created. Once the app starts processing data, we
> simply put an event into a stream. If we did it in the event, we would
> need to have the PE create a KeyFinder object every time it needs to
> send an event. Doing it in the stream requires doing it only once.
> Events are lightweight immutable objects created at a very high rate;
> streams, on the other hand, are created only once. That's why it made
> more sense to put it in the stream.
>
> Note that this can be extended in many ways:
>
> - Use a generic MapEvent where you can have any attributes.
> - Use a generic MapKeyFinder to set the key.
>
> This generic/dynamic approach would be less efficient and doesn't have
> the advantages of static typing but may be useful for prototyping.
>
> - We can add a simple query language expressed as a string to extract
> the key from an event using reflection, as we do in v0.3.
>
> - We could use inner classes to define the key function. Might be
> worth looking at the latest Guava library to see if we can use it in
> some way or follow similar patterns. I just used the CacheBuilder in
> the ProcessingElement class.
>
> More ideas welcomed!
>
> -leo
>
>
>
>
> On Fri, Oct 14, 2011 at 1:57 PM, kishore g <g.kishore@gmail.com> wrote:
> > Yes, an event might be dispatched on multiple keys or it may be keyless. If
> > we put the key in the event, then the s4 framework somehow needs to extract
> > those keys from the event, which means we have to enforce that the event class
> > always implements a base class provided by s4. This may not be desirable in
> > all cases. For example, the events may be generated by a system outside s4
> > and have their own format/class.
> >
> > The other approach is to have an EventWrapper which has streamName, key(s) and
> > event. This will avoid introspection on the event.
> >
> > Also, having the key in the event ties us to the keys being known on the sending
> > side, whereas in some cases we need them to be created on the receiving side.
> >
> > Having said that, having the key in the event definitely helps avoid Key and
> > KeyFinder. But the hope is that we can have a generic KeyFinder which can
> > extract a key from a POJO, which is what the event class is most of the time.
> >
> > thanks,
> > Kishore G
> >
> > On Fri, Oct 14, 2011 at 12:40 PM, Leo Neumeyer <leoneumeyer@gmail.com> wrote:
> >
> >> Remember that an event may be dispatched with several keys. That's why we
> >> tie the key to the stream which delivers to a specific PE prototype. Let me
> >> think a bit more.
> >>
> >> -leo
> >>
> >>
> >> On Oct 14, 2011, at 12:03, Karthik Kambatla <kkambatl@cs.purdue.edu> wrote:
> >>
> >> > Also, if we make Key a first-class member of Event, do we really need other
> >> > classes - Key and KeyFinder - to determine the value of the key of a particular
> >> > event?
> >> >
> >> > Thanks
> >> > Karthik
> >> >
> >> > On Fri, Oct 14, 2011 at 1:01 PM, Karthik Kambatla <kkambatl@cs.purdue.edu> wrote:
> >> >
> >> >> Hello
> >> >>
> >> >> Given that every event has an associated key (e.g. CountEvent.key),
> >> >> wouldn't it make sense to add it to the Event class itself?
> >> >>
> >> >> The key in every Event can be used directly for routing decisions. Also, I
> >> >> believe it will prove to be very handy if people consider each PE instance
> >> >> serving multiple keys (for scalability) or want to write their own
> >> >> extensions for persisting the events.
> >> >>
> >> >> Implementation: To facilitate different types of keys (Strings, Ints etc.),
> >> >> we might want to make Event generic - Event<Key> with certain properties on
> >> >> key.
> >> >>
> >> >> Thanks
> >> >> Karthik
> >> >>
> >>
> >
>
>
>
> --
>
> -leo
>
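
The design argued for in the quoted thread (the key function lives on the stream, not on the event) can be sketched roughly as below. This is an illustration under assumptions, not the s4-piper code: KeyFinder, CountEvent and SketchStream here are simplified hypothetical stand-ins, and the "routing" just records a string instead of dispatching to a PE.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in: a function from an event to its key.
interface KeyFinder<T> {
    String getKey(T event);
}

// Hypothetical lightweight immutable event; it carries no key of its own.
final class CountEvent {
    final String user;
    final long count;

    CountEvent(String user, long count) {
        this.user = user;
        this.count = count;
    }
}

// Hypothetical stream: the KeyFinder is supplied once, when the app graph
// is built, and reused for every event put into the stream.
class SketchStream<T> {
    private final String name;
    private final KeyFinder<T> keyFinder;
    private final List<String> routed = new ArrayList<>();

    SketchStream(String name, KeyFinder<T> keyFinder) {
        this.name = name;
        this.keyFinder = keyFinder;
    }

    // The sender only hands over the event; the stream derives the key.
    void put(T event) {
        String key = keyFinder.getKey(event);
        routed.add(name + ":" + key); // stand-in for dispatching by key
    }

    List<String> routed() {
        return routed;
    }
}
```

Because the key function belongs to the stream, the same CountEvent instance can be put into two streams built with different KeyFinders (say, one keyed on user and one on count) without the event class knowing about either key.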
