streams-dev mailing list archives

From Craig McClanahan <craig...@gmail.com>
Subject Re: Streams Subscriptions
Date Fri, 01 Feb 2013 17:17:37 GMT
A couple of thoughts.

* On "outputs" you list "username" and "password" as possible fields.
  I presume that specifying these would imply using HTTP Basic auth?
  We might want to consider different options as well.

* From my (possibly myopic :-) viewpoint, the filtering and delivery
  decisions are different object types.  I'd like to be able to register
  my set of filters and get a unique identifier for them, and then
  separately be able to say "send the results of subscription 123
  to this webhook URL every 60 minutes".

* Regarding query syntax, simple patterns alone are probably not going
  to be sufficient for some use cases.  Maybe we should offer those as
  simple defaults, but also support falling back to some sort of
  SQL-like syntax (e.g. what JIRA does in its advanced search).
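
The split between registering filters and registering delivery
instructions could be sketched roughly as follows.  This is only an
illustration under stated assumptions: SubscriptionRegistry,
registerFilters, addDelivery, and the delivery_frequency_minutes field
are hypothetical names, not part of any existing Streams API, and
everything is kept in memory.

```javascript
// Hypothetical in-memory registry; names are illustrative, not a Streams API.
class SubscriptionRegistry {
  constructor() {
    this.nextId = 1;
    this.filters = new Map();    // subscription id -> array of filter objects
    this.deliveries = new Map(); // subscription id -> array of delivery instructions
  }

  // Register a set of filters and return a unique identifier for them.
  registerFilters(filters) {
    const id = String(this.nextId++);
    this.filters.set(id, filters.slice());
    this.deliveries.set(id, []);
    return id;
  }

  // Separately attach a delivery instruction to an existing subscription.
  addDelivery(subscriptionId, delivery) {
    if (!this.filters.has(subscriptionId)) {
      throw new Error('unknown subscription: ' + subscriptionId);
    }
    this.deliveries.get(subscriptionId).push(Object.assign({}, delivery));
  }

  // Look up the delivery instructions attached to a subscription.
  deliveriesFor(subscriptionId) {
    return this.deliveries.get(subscriptionId) || [];
  }
}

const registry = new SubscriptionRegistry();
const subscriptionId = registry.registerFilters([
  { field: 'verb', comparison_operator: '=', value_set: ['post'] }
]);
registry.addDelivery(subscriptionId, {
  output_type: 'http',
  method: 'post',
  url: 'http://example.com/webhook', // placeholder URL
  delivery_frequency_minutes: 60     // assumed unit, for illustration only
});
```

Keeping the two maps separate means the same filter set can later be
given a second delivery target without being redefined.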

Craig

On Fri, Feb 1, 2013 at 8:55 AM, Jason Letourneau <jletourneau80@gmail.com> wrote:

> Based on Steve and Craig's feedback, I've come up with something that
> I think can work.  The config below specifies that:
> 1) you can set up more than one subscription at a time
> 2) each subscription can have many outputs
> 3) each subscription can have many filters
>
> The details of the config would do things like determine the behavior
> of the stream delivery (is it posted back or is the subscriber polling
> for instance).  Also, all subscriptions created in this way would be
> accessed through a single URL.
>
> {
>     "auth_token": "token",
>     "subscriptions": [
>         {
>             "outputs": [
>                 {
>                     "output_type": "http",
>                     "method": "post",
>                     "url": "http.example.com:8888",
>                     "delivery_frequency": "60",
>                     "max_size": "10485760",
>                     "auth_type": "none",
>                     "username": "username",
>                     "password": "password"
>                 }
>             ]
>         },
>         {
>             "filters": [
>                 {
>                     "field": "fieldname",
>                     "comparison_operator": "operator",
>                     "value_set": [
>                         "val1",
>                         "val2"
>                     ]
>                 }
>             ]
>         }
>     ]
> }
>
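
A minimal sketch of how a server might evaluate the "filters" entries
above against a single activity.  The operator names, the dotted field
paths, the rule that any value in value_set may match (with "!="
requiring that none match), and the rule that all filters of a
subscription must match are assumptions for illustration; the thread
has not settled on any of them.

```javascript
// Hypothetical comparison operators; the thread has not fixed a set.
const OPERATORS = {
  '=':  (a, b) => a === b,
  '!=': (a, b) => a !== b,
  '>':  (a, b) => a > b,
  '>=': (a, b) => a >= b,
  '<':  (a, b) => a < b,
  '<=': (a, b) => a <= b
};

// Resolve a dotted path such as "provider.id" against an activity object.
function getField(activity, path) {
  return path.split('.').reduce(
    (value, key) => (value == null ? undefined : value[key]),
    activity
  );
}

// One filter matches when any value in value_set satisfies the operator
// (an IN-style clause); "!=" instead requires that no value is equal.
function matchesFilter(activity, filter) {
  const compare = OPERATORS[filter.comparison_operator];
  if (!compare) {
    throw new Error('unsupported operator: ' + filter.comparison_operator);
  }
  const actual = getField(activity, filter.field);
  if (actual === undefined) return false;
  if (filter.comparison_operator === '!=') {
    return filter.value_set.every((expected) => compare(actual, expected));
  }
  return filter.value_set.some((expected) => compare(actual, expected));
}

// Assumption: all filters of a subscription must match (logical AND).
function matchesSubscription(activity, filters) {
  return filters.every((filter) => matchesFilter(activity, filter));
}

const activity = { verb: 'post', provider: { id: 'xxx' }, object: { type: 'blogpost' } };
const filters = [
  { field: 'provider.id', comparison_operator: '=', value_set: ['xxx', 'zzz'] },
  { field: 'object.type', comparison_operator: '!=', value_set: ['comment'] }
];
console.log(matchesSubscription(activity, filters)); // true
```
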
> Thoughts?
>
> Jason
>
> On Thu, Jan 31, 2013 at 7:53 PM, Craig McClanahan <craigmcc@gmail.com>
> wrote:
> > Welcome Steve!
> >
> > DataSift's UI to set these things up is indeed pretty cool.  I think what
> > we're talking about here is more what the internal REST APIs between the
> > UI and the back end might look like.
> >
> > I also think we should deliberately separate the filter definition of a
> > "subscription" from the instructions on how the data gets delivered.  I
> > could see use cases for any or all of:
> > * Polling with a filter on oldest date of interest
> > * Webhook that gets updated at some specified interval
> > * URL to which the Streams server would periodically POST
> >   new activities (in case I don't have webhooks set up)
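
The first option in the list above, polling with a filter on the oldest
date of interest, might look something like this on the server side.
This is a sketch under assumptions: activitiesSince is a hypothetical
helper, and it assumes each activity carries an ISO 8601 "published"
timestamp.

```javascript
// Hypothetical helper: return activities published at or after a cutoff.
function activitiesSince(activities, oldestDateOfInterest) {
  const cutoff = Date.parse(oldestDateOfInterest);
  if (Number.isNaN(cutoff)) {
    throw new Error('invalid date: ' + oldestDateOfInterest);
  }
  return activities.filter((a) => Date.parse(a.published) >= cutoff);
}

// Sample data for illustration only.
const sample = [
  { id: 'a', published: '2013-01-30T12:00:00Z' },
  { id: 'b', published: '2013-02-01T08:00:00Z' }
];
const recent = activitiesSince(sample, '2013-01-31T00:00:00Z');
console.log(recent.map((a) => a.id)); // [ 'b' ]
```
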
> >
> > Separately, looking at DataSift is a reminder we will want to be able to
> > filter on words inside an activity stream value like "subject" or
> > "content", not just on the entire value.
> >
> > Craig
> >
> > On Thu, Jan 31, 2013 at 4:29 PM, Jason Letourneau
> > <jletourneau80@gmail.com> wrote:
> >
> >> Hi Steve - thanks for the input and congrats on your first post - I
> >> think what you are describing is where Craig and I are circling around
> >> (or something similar anyways) - the details on that POST request are
> >> really helpful in particular.  I'll try and put something together
> >> tomorrow that would be a start for the "setup" request (and subsequent
> >> additional configuration after the subscription is initialized) and
> >> post back to the group.
> >>
> >> Jason
> >>
> >> On Thu, Jan 31, 2013 at 7:00 PM, Steve Blackmon [W2O Digital]
> >> <sblackmon@w2odigital.com> wrote:
> >> > First post from me (btw I am Steve, stoked about this project and
> >> > meeting everyone eventually.)
> >> >
> >> > Sorry if I missed the point of the thread, but I think this is related
> >> > and might be educational for some in the group.
> >> >
> >> > I like the way DataSift's API lets you establish streams - you POST a
> >> > definition, it returns a hash, and thereafter their service follows the
> >> > instructions you gave it as new messages meet the filter you defined.  In
> >> > addition, once a stream exists, then you can set up listeners on that
> >> > specific hash via web sockets with the hash.
> >> >
> >> > For example, here is how you instruct DataSift to push new messages
> >> > meeting your criteria to a WebHooks end-point.
> >> >
> >> > curl -X POST 'https://api.datasift.com/push/create' \
> >> > -d 'name=connectorhttp' \
> >> > -d 'hash=dce320ce31a8919784e6e85aecbd040e' \
> >> > -d 'output_type=http' \
> >> > -d 'output_params.method=post' \
> >> > -d 'output_params.url=http.example.com:8888' \
> >> > -d 'output_params.use_gzip' \
> >> > -d 'output_params.delivery_frequency=60' \
> >> > -d 'output_params.max_size=10485760' \
> >> > -d 'output_params.verify_ssl=false' \
> >> > -d 'output_params.auth.type=none' \
> >> > -d 'output_params.auth.username=YourHTTPServerUsername' \
> >> > -d 'output_params.auth.password=YourHTTPServerPassword' \
> >> > -H 'Auth: datasift-user:your-datasift-api-key'
> >> >
> >> >
> >> > Now new messages get pushed to me every 60 seconds, and I can get the
> >> > feed in real-time like this:
> >> >
> >> > var websocketsUser = 'datasift-user';
> >> > var websocketsHost = 'websocket.datasift.com';
> >> > var streamHash = 'dce320ce31a8919784e6e85aecbd040e';
> >> > var apiKey = 'your-datasift-api-key';
> >> >
> >> >
> >> > var ws = new WebSocket('ws://'+websocketsHost+'/'+streamHash+'?username='+websocketsUser
> >> > +'&api_key='+apiKey);
> >> >
> >> > ws.onopen = function(evt) {
> >> >     // connection opened; open events carry no data payload
> >> >     $("#stream").append('open<br/>');
> >> > };
> >> >
> >> > ws.onmessage = function(evt) {
> >> >     // received message; evt.data holds the payload
> >> >     $("#stream").append('message: '+evt.data+'<br/>');
> >> > };
> >> >
> >> > ws.onclose = function(evt) {
> >> >     // close events expose a status code and reason instead of data
> >> >     $("#stream").append('close: '+evt.code+' '+evt.reason+'<br/>');
> >> > };
> >> >
> >> > // The error event carries no details about the failure, so no useful
> >> > // debugging can be done here
> >> > ws.onerror = function(evt) {
> >> >     // Some error occurred
> >> >     $("#stream").append('error<br/>');
> >> > };
> >> >
> >> >
> >> > At W2OGroup we have built utility libraries for receiving and processing
> >> > JSON object streams from DataSift in Storm/Kafka that I'm interested in
> >> > extending to work with Streams, and can probably commit to the project if
> >> > the community would find them useful.
> >> >
> >> >
> >> > Steve Blackmon
> >> > Director, Data Sciences
> >> >
> >> > 101 W. 6th Street
> >> > Austin, Texas 78701
> >> > cell 512.965.0451 | work 512.402.6366
> >> > twitter @steveblackmon
> >> >
> >> >
> >> > On 1/31/13 5:45 PM, "Craig McClanahan" <craigmcc@gmail.com> wrote:
> >> >
> >> >>We'll probably want some way to do the equivalent of ">", ">=", "<", "<=",
> >> >>and "!=" in addition to the implicit "equal" that I assume you mean in
> >> >>this example.
> >> >>
> >> >>Craig
> >> >>
> >> >>On Thu, Jan 31, 2013 at 3:39 PM, Jason Letourneau
> >> >><jletourneau80@gmail.com> wrote:
> >> >>
> >> >>> I really like this - this is somewhat what I was getting at with the
> >> >>> JSON object i.e. POST:
> >> >>> {
> >> >>> "subscriptions":
> >> >>> [{"activityField":"value"},
> >> >>> {"activityField":"value",
> >> >>>  "anotherActivityField":"value" }
> >> >>> ]
> >> >>> }
> >> >>>
> >> >>> On Thu, Jan 31, 2013 at 4:32 PM, Craig McClanahan <craigmcc@gmail.com>
> >> >>> wrote:
> >> >>> > On Thu, Jan 31, 2013 at 12:00 PM, Jason Letourneau
> >> >>> > <jletourneau80@gmail.com> wrote:
> >> >>> >
> >> >>> >> I am curious about the group's thinking on subscriptions to activity
> >> >>> >> streams.  As I am stubbing out the end-to-end heartbeat on my proposed
> >> >>> >> architecture, I've just been working with URL sources as the
> >> >>> >> subscription mode.  Obviously this is a major oversimplification.
> >> >>> >>
> >> >>> >> I know for Shindig the social graph can be used, but we don't
> >> >>> >> necessarily have that.  Considering the mechanism for establishing a
> >> >>> >> new subscription stream (defined as aggregated individual activities
> >> >>> >> pulled from a varying array of sources) is POSTing to the Activity
> >> >>> >> Streams server to establish the channel (currently just
> >> >>> >> subscriptions=url1,url2,url3 is the oversimplified mechanism), what
> >> >>> >> would people see as a reasonable way to establish subscriptions?  A
> >> >>> >> list of userIds? Subjects?  How should these be represented?  I was
> >> >>> >> thinking of a JSON object, but does anyone have other thoughts?
> >> >>> >>
> >> >>> >> Jason
> >> >>> >>
> >> >>> >
> >> >>> > One idea would be to take some inspiration from how JIRA lets you (in
> >> >>> > effect) create a WHERE clause that looks at any fields (in all the
> >> >>> > activities flowing through the server) that you want.
> >> >>> >
> >> >>> > Example filter criteria
> >> >>> > * provider.id = 'xxx' // Filter on a particular provider
> >> >>> > * verb = 'yyy'
> >> >>> > * object.type = 'blogpost'
> >> >>> > and you'd want to accept more than one value (effectively creating OR
> >> >>> > or IN type clauses).
> >> >>> >
> >> >>> > For completeness, I'd want to be able to specify more than one filter
> >> >>> > expression in the same subscription.
> >> >>> >
> >> >>> > Craig
> >> >>>
> >> >
> >>
>
