Mailing-List: contact esme-dev-help@incubator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: esme-dev@incubator.apache.org
MIME-Version: 1.0
In-Reply-To: <2bca8c350912150606r7433a3afq9893ea4d6bd46ce@mail.gmail.com>
References: <68f4a0e80912131529m688e682cg8bc14b69b1e10d55@mail.gmail.com>
	 <fa2d9f450912142039q2c247fc5reb4c48fc81c4bed@mail.gmail.com>
	 <68f4a0e80912150534n4cc26e1fvd616f6e047e109c5@mail.gmail.com>
	 <fa2d9f450912150539j25aafd0dkc64170dc74b1d901@mail.gmail.com>
	 <68f4a0e80912150548ob8c3e09o25f871932047d2ec@mail.gmail.com>
	 <2bca8c350912150606r7433a3afq9893ea4d6bd46ce@mail.gmail.com>
Date: Tue, 15 Dec 2009 09:49:19 -0500
Message-ID: <68f4a0e80912150649s653f6969kfda0412ec2bbc613@mail.gmail.com>
Subject: Re: Streaming design for the api2 endpoint - request for comment
From: Ethan Jewett <esjewett@gmail.com>
To: esme-dev@incubator.apache.org
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Hmmm, that is very true. My idea with the tag streams was to return
all messages matching the tag, not just the ones from the user's
timeline. Kind of like a search.

Sounds like we need to talk about what exactly we want to stream a bit
more :-)

With regard to the filtering on the client side, I agree with you. In
general, clients will probably want to filter based on the main
timeline stream.

I wonder if there is any demand to be able to push arbitrary filters
back to the server side. This would be good because matching on the
server side is relatively easy and could potentially save sending a
lot of unwanted messages that the client would just discard. What do
you think?
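
As a rough illustration of what matching on the server side could look
like, here is a Python sketch (not ESME's Lift code). The names Message,
StreamFilter and select are hypothetical, and the three criteria are
taken from the example quoted below (tag, a group of accounts, a time
window). It assumes an empty criterion means "do not filter on this
field".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Message:
    """Hypothetical message shape, for illustration only."""
    author: str
    text: str
    tags: FrozenSet[str]
    posted_at: datetime


@dataclass(frozen=True)
class StreamFilter:
    """Hypothetical filter a client would push to the server.

    An empty set or None means "do not filter on this field".
    """
    tags: FrozenSet[str] = frozenset()
    authors: FrozenSet[str] = frozenset()
    max_age: Optional[timedelta] = None

    def matches(self, msg: Message, now: datetime) -> bool:
        # Every criterion that is set must hold for the message to pass.
        if self.tags and not (self.tags & msg.tags):
            return False
        if self.authors and msg.author not in self.authors:
            return False
        if self.max_age is not None and now - msg.posted_at > self.max_age:
            return False
        return True


def select(messages: Iterable[Message], flt: StreamFilter,
           now: datetime) -> List[Message]:
    """Return only the messages the client asked for, in original order."""
    return [m for m in messages if flt.matches(m, now)]


if __name__ == "__main__":
    now = datetime(2009, 12, 15, 14, 49, 19, tzinfo=timezone.utc)
    recent = Message("ethan", "streaming", frozenset({"esme"}),
                     now - timedelta(seconds=10))
    flt = StreamFilter(tags=frozenset({"esme"}),
                       max_age=timedelta(seconds=30))
    print(select([recent], flt, now))
```

With something like this, only the messages that pass the filter would
ever be sent, so the client never receives messages it would just
discard.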

Ethan

On Tue, Dec 15, 2009 at 9:06 AM, Daniel Koller <dakoller@googlemail.com> wrote:
> I believe that the most important use case for the API will be that
> somebody looks not for one criterion to match but for several
> (something like: all messages w/ tag XYZ by a group of defined accounts
> in the last 30 seconds). Based on that, I am not sure whether we need
> additional methods, e.g. by tag.
>
> Because listing by tag covers just one criterion, a client would have
> to check the other criteria manually anyway.
>
>
> On Tue, Dec 15, 2009 at 2:48 PM, Ethan Jewett <esjewett@gmail.com> wrote:
>
>> That's what I'm trying to avoid. If we want to stream tags we need to
>> maintain a separate stream for each session/tag combination because
>> otherwise the system won't be able to keep track of which messages
>> have been sent to which sessions.
>>
>> My proposal is to only create the tag stream for a session when that
>> tag stream is requested under the session. So if a client doesn't ask
>> for tag "someTagNoOneWantsToRead", then no stream/actor will be set
>> up.
>>
>> Ethan
>>
>> On Tue, Dec 15, 2009 at 8:39 AM, Richard Hirsch <hirsch.dick@gmail.com>
>> wrote:
>> > Then I misinterpreted the original mail. If every person had a stream
>> > for every tag, wouldn't that mean an explosion of streams?
>> >
>> > On Tue, Dec 15, 2009 at 2:34 PM, Ethan Jewett <esjewett@gmail.com>
>> wrote:
>> >> Actually, having separate streams for each tag is what I'm suggesting,
>> >> I'm just trying to determine when best to create them. If a client
>> >> requests all the streams, they will all be created. Should we talk
>> >> about not streaming individual tags at all? Maybe put a limit on the
>> >> number of streams a client can have open? I'm not sure what the
>> >> performance impact will look like.
>> >>
>> >> Also, I need to amend the original email. I think that because we are
>> >> using Lift Sessions, we will be killing off the session and the
>> >> streams attached to it after a period of time. So I think option 2 and
>> >> option 4 are the same or very similar.
>> >>
>> >> Ethan
>> >>
>> >> On Mon, Dec 14, 2009 at 11:39 PM, Richard Hirsch <hirsch.dick@gmail.com>
>> wrote:
>> >>> Yes it sounds reasonable. I don't think it makes much sense to have
>> >>> separate streams for each tag, etc...
>> >>>
>> >>> I agree option 2 is the best choice.
>> >>>
>> >>> D.
>> >>>
>> >>> On Mon, Dec 14, 2009 at 12:29 AM, Ethan Jewett <esjewett@gmail.com>
>> wrote:
>> >>>> All,
>> >>>>
>> >>>> In the cwiki, we've documented 5 parts of the API we would like to
>> >>>> stream. They are briefly: user timeline, tags, tracks, conversations,
>> >>>> and pools (and possibly the public timeline)
>> >>>>
>> >>>> Of these, one has been implemented: user timeline
>> >>>>
>> >>>> Today I've been able to take some time to start digging into what
>> >>>> needs to be done to implement the rest of the streaming interfaces.
>> >>>> The way the user timeline streaming interface is implemented in the
>> >>>> old and new APIs is the same (because I just copied and slightly
>> >>>> modified the code). The basic idea is that when a session is created,
>> >>>> the streaming API starts "listening" for new messages. When the user
>> >>>> makes a request to the streaming interface for new messages, all the
>> >>>> messages that have built up are delivered.
>> >>>>
>> >>>> This approach poses some significant problems for other types of
>> >>>> streams. For example, if we were going to stream tags in this manner,
>> >>>> we would end up creating a listener for every single active tag in the
>> >>>> system at the time the user initiates a session. We would also have
>> >>>> the dilemma of creating listeners for new tags as the tags are created
>> >>>> in the middle of a session.
>> >>>>
>> >>>> As such, I'm thinking of implementing the other streaming interfaces
>> >>>> differently. Instead of creating listeners when the session is
>> >>>> initiated, I'll create them when the first streaming request for a
>> >>>> tag, pool, track, or conversation comes in. These listeners would then
>> >>>> live on for the rest of the session. This is, I think, the best of
>> >>>> several options.
>> >>>>
>> >>>> To summarize the options available:
>> >>>>
>> >>>> 1. Create listeners for everything at the beginning of the session -
>> >>>> not efficient, suffers from difficulties with new tags, pools, etc.
>> >>>> created during the session
>> >>>>
>> >>>> 2. Create listeners for streams as the user requests them and have
>> >>>> these listeners live on for the rest of the session
>> >>>>
>> >>>> 3. Create disposable listeners for each streaming/long-polling request
>> >>>> that are destroyed once the request is answered - this is problematic
>> >>>> because messages that occur between requests will be missed
>> >>>>
>> >>>> 4. Variation of options 2 and 3: Create listeners for streams as the
>> >>>> user requests them and have these listeners live on for the rest of
>> >>>> the session or a specific period of time, whichever comes first (so
>> >>>> the user would have to make occasional requests to ensure the
>> >>>> continuity of the message stream) - I think this is over-complicated
>> >>>> and potentially confusing to developers, but could be a good option if
>> >>>> we run into performance problems with option 2
>> >>>>
>> >>>>
>> >>>> What we'll be left with is that the user timeline will use option 1
>> >>>> and the other streams will use option 2. The user timeline might
>> >>>> switch to option 2 at some point in the future.
>> >>>>
>> >>>> And that was all a very long way of saying, does that sound reasonable
>> >>>> to everyone?
>> >>>>
>> >>>> Ethan
>> >>>>
>> >>>
>> >>
>> >
>>
>
>
>
> --
> ---
> Daniel Koller
> Jahnstrasse 20
> 80469 München * dakoller@googlemail.com
>
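
For reference, option 2 from the original proposal quoted above (create
a listener on the first request for a stream and keep it for the
session), together with the idle timeout from option 4, can be sketched
roughly as follows. This is a Python illustration under stated
assumptions, not the ESME implementation: SessionStreams, Listener,
poll, publish and expire_idle are hypothetical names, messages are plain
strings, and the clock is injectable so the timeout can be exercised
without waiting.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

# Hypothetical stream identifier: (kind, name), e.g. ("tag", "esme").
StreamKey = Tuple[str, str]


class Listener:
    """Buffers messages for one stream until the client polls."""

    def __init__(self, now: float) -> None:
        self.buffer: Deque[str] = deque()
        self.last_polled = now


class SessionStreams:
    """Per-session registry of lazily created stream listeners."""

    def __init__(self, idle_timeout: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._listeners: Dict[StreamKey, Listener] = {}
        self._idle_timeout = idle_timeout
        self._clock = clock

    def publish(self, key: StreamKey, message: str) -> None:
        # Only streams this session has asked for accumulate messages.
        listener = self._listeners.get(key)
        if listener is not None:
            listener.buffer.append(message)

    def poll(self, key: StreamKey) -> List[str]:
        now = self._clock()
        listener = self._listeners.get(key)
        if listener is None:
            # Option 2: the first request for a stream creates its listener.
            listener = self._listeners[key] = Listener(now)
        listener.last_polled = now
        pending = list(listener.buffer)
        listener.buffer.clear()
        return pending

    def expire_idle(self) -> List[StreamKey]:
        # Option 4: drop listeners the client has not polled recently.
        now = self._clock()
        stale = [key for key, listener in self._listeners.items()
                 if now - listener.last_polled > self._idle_timeout]
        for key in stale:
            del self._listeners[key]
        return stale
```

Because the listener survives between polls, messages that arrive
between two requests are buffered rather than missed, which is the
problem noted for option 3. Nothing is created for a tag no client asks
about.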