streams-dev mailing list archives

From Danny Sullivan <>
Subject Process Subscribers? or Activities?
Date Fri, 22 Nov 2013 19:58:10 GMT
Hey all,
I was thinking recently about the most efficient way to deliver activities to subscribers.
Let's say that each subscriber has a list of strings that represent filters. A subscriber with
the filter "myactor" would be delivered an activity such as {actor:"myactor", verb:"blah",
target:"target"}. We can ensure that this activity is delivered to this subscriber in one of two ways.

One (which is the current implementation) is that we iterate through each subscriber, access
their filters, find all activities that match these filters, and add them to this subscriber's
stream in the subscriber warehouse.
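
A minimal sketch of this first approach, assuming plain in-memory lists stand in for the
activity store and the subscriber warehouse. The names Subscriber, matches and
process_subscribers are hypothetical and are not taken from the current code:

```python
from dataclasses import dataclass, field


@dataclass
class Subscriber:
    name: str
    filters: set
    last_updated: int = 0  # timestamp of the last scan (assumed integer clock)
    stream: list = field(default_factory=list)  # stand-in for the warehouse entry


def matches(activity, filters):
    # An activity matches when its actor, verb, or target is one of the filters.
    return any(activity.get(key) in filters for key in ("actor", "verb", "target"))


def process_subscribers(subscribers, activities, now):
    # Iterate over subscribers; for each one, scan activities published since
    # that subscriber's last update and append the matches to its stream.
    for sub in subscribers:
        for act in activities:
            if act["published"] > sub.last_updated and matches(act, sub.filters):
                sub.stream.append(act)
        sub.last_updated = now


activities = [
    {"actor": "myactor", "verb": "blah", "target": "target", "published": 1},
    {"actor": "other", "verb": "post", "target": "thing", "published": 2},
]
subs = [Subscriber("subscriber1", {"myactor"}), Subscriber("subscriber2", {"nomatch"})]
process_subscribers(subs, activities, now=2)
```

Every pass touches every subscriber and rescans the recent activities for each of them,
which is where the cost of this approach comes from.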

The second implementation would involve iterating through each activity as it comes in and
checking a table with a filter as the key and a list of subscribers as the value. Every
subscriber in that list would have this filter as one of their filters. So if there were two
subscribers, subscriber1: {filters: ["filter1", "filter2"]} and subscriber2: {filters: ["filter2", "filter3"]},
the filter table would look something like: filter1: ["subscriber1"], filter2: ["subscriber1",
"subscriber2"], filter3: ["subscriber2"]. You would then distribute the activity to each of
the matching subscribers in the subscriber warehouse.
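
The filter table could be sketched roughly as follows, assuming a plain dictionary is enough
to stand in for it. build_filter_table and deliver are hypothetical names, and the streams
dictionary is only a placeholder for the subscriber warehouse:

```python
from collections import defaultdict


def build_filter_table(subscribers):
    # Invert subscriber -> filters into filter -> list of subscribers.
    table = defaultdict(list)
    for name, filters in subscribers.items():
        for f in filters:
            table[f].append(name)
    return table


def deliver(activity, table, streams):
    # Look up the activity's actor, verb, and target in the filter table.
    # A set keeps a subscriber from receiving the same activity twice when
    # more than one of its filters matches.
    recipients = set()
    for key in ("actor", "verb", "target"):
        recipients.update(table.get(activity.get(key), []))
    for name in recipients:
        streams[name].append(activity)
    return recipients


subscribers = {
    "subscriber1": ["filter1", "filter2"],
    "subscriber2": ["filter2", "filter3"],
}
table = build_filter_table(subscribers)
streams = defaultdict(list)  # placeholder for the subscriber warehouse
delivered = deliver({"actor": "filter2", "verb": "blah", "target": "target"}, table, streams)
```

With the two example subscribers above, an activity whose actor is "filter2" ends up in both
streams after three dictionary lookups, no matter how many subscribers exist.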

The benefit of the first implementation is that there will be fewer subscribers than activities,
so less iteration. The con is that the query that you use for each subscriber is fairly expensive
(select * from activities where published > lastUpdated and (actor in filters or verb in
filters or target in filters)). The benefits of the second implementation are that you process
the activity as it comes in and then never really have to worry about it again. Also, the
query seems much less expensive. The downside is that you miss out on history, so a newly
created subscriber will have no activity until a new matching activity is published.

I'm still in favor of the first implementation, but does anyone have any other thoughts? Perhaps
there can be a mixture of both?
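
One possible mixture, purely as a sketch and not something that exists today: run the
expensive history scan once when a subscriber is created, then rely on the filter table for
everything published afterwards. HybridDispatcher and all of its members are hypothetical
names, and the in-memory containers are placeholders for the real stores:

```python
from collections import defaultdict


class HybridDispatcher:
    def __init__(self):
        self.history = []                      # stand-in for the activity store
        self.filter_table = defaultdict(set)   # filter -> subscriber names
        self.streams = defaultdict(list)       # stand-in for the subscriber warehouse

    @staticmethod
    def _keys(activity):
        # The values a filter can match against.
        return {activity.get(k) for k in ("actor", "verb", "target")}

    def subscribe(self, name, filters):
        filters = set(filters)
        # One-time backfill so a new subscriber starts with matching history.
        for act in self.history:
            if filters & self._keys(act):
                self.streams[name].append(act)
        # Register the filters so future activities are pushed on arrival.
        for f in filters:
            self.filter_table[f].add(name)

    def publish(self, activity):
        self.history.append(activity)
        recipients = set()
        for key in self._keys(activity):
            recipients |= self.filter_table.get(key, set())
        for name in recipients:
            self.streams[name].append(activity)


dispatcher = HybridDispatcher()
dispatcher.publish({"actor": "myactor", "verb": "blah", "target": "target"})
dispatcher.subscribe("late_subscriber", ["myactor"])   # picks up the earlier activity
dispatcher.publish({"actor": "other", "verb": "post", "target": "myactor"})
```

The scan then costs one pass per new subscriber instead of one pass per subscriber per
update, at the price of keeping the history around.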
