incubator-blur-dev mailing list archives

From Tim Williams <>
Subject Re: new queue capability
Date Fri, 28 Feb 2014 11:59:50 GMT
On Thu, Feb 27, 2014 at 11:10 AM, Aaron McCurry <> wrote:
> What if we provide an implementation of the QueueReader concept that does
> what you are discussing.  That way in more extreme cases when the user is
> forced into implementing the lower level api (perhaps for performance) they
> can still do it, but for the normal case the partitioning (and other
> difficult issues) are handled by the controllers.

That may be good longer term - I'd be supportive of pulling it from
the shards for now and focusing some time on fully baking the simpler
idea in the controller.  Or at least disclaiming that there be dragons :)

> I could see adding an enqueueMutate call to the controllers that pushes the
> mutates to the correct buckets for the user.  At the same time we could
> allow each of the controllers to pull from an external and push the mutates
> to the correct buckets for the shards.  I could see a couple of different
> ways of handling this.

Not sure what you mean by enqueueMutate - I was thinking just taking
the existing QueueReader and plugging it into the Controller (with
some leader election) - obviously calling mutateRow instead of the
current behavior.
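
A rough sketch of what I have in mind, in case it helps. Everything
here is hypothetical: RowMutator, LeaderSlot and ControllerPump are
made-up names, the in-memory LeaderSlot only stands in for whatever
real leader election we would use, and the queue stands in for the
user's message source.

```java
import java.util.Queue;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the controller's way of applying a mutation.
interface RowMutator {
    void mutateRow(String rowId, String payload);
}

// Hypothetical in-memory leader election: the first controller to claim
// the slot holds it until it releases it.
final class LeaderSlot {
    private final AtomicReference<String> leader = new AtomicReference<>();

    boolean tryAcquire(String controllerId) {
        return leader.compareAndSet(null, controllerId)
                || controllerId.equals(leader.get());
    }

    void release(String controllerId) {
        leader.compareAndSet(controllerId, null);
    }
}

// One pump per controller; only the current leader drains the source.
final class ControllerPump {
    private final String controllerId;
    private final LeaderSlot slot;
    private final Queue<String[]> source; // each message: {rowId, payload}
    private final RowMutator mutator;

    ControllerPump(String controllerId, LeaderSlot slot,
                   Queue<String[]> source, RowMutator mutator) {
        this.controllerId = controllerId;
        this.slot = slot;
        this.source = source;
        this.mutator = mutator;
    }

    // Drains whatever is queued if this controller leads; returns the
    // number of messages applied (0 when it is not the leader).
    int pumpOnce() {
        if (!slot.tryAcquire(controllerId)) {
            return 0;
        }
        int applied = 0;
        String[] message;
        while ((message = source.poll()) != null) {
            mutator.mutateRow(message[0], message[1]);
            applied++;
        }
        return applied;
    }
}
```

With two pumps sharing one slot, only the first to call pumpOnce()
applies anything until it releases the lead.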

Any more than one controller and we have to either expose the
partitioning or protect against dupes, right?
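
For the dupe-protection option, a minimal sketch, assuming every
message carries a unique id. DedupingApplier is a hypothetical name,
and the in-memory set is only a stand-in: across real controllers the
"seen" state would have to live somewhere shared.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical guard that applies each message id at most once.
final class DedupingApplier {
    private final Set<String> seen = ConcurrentHashMap.newKeySet();
    private final AtomicInteger applied = new AtomicInteger();

    // Runs the mutation only the first time messageId is offered.
    // Returns true if it ran, false if the id was a duplicate.
    boolean applyOnce(String messageId, Runnable mutate) {
        if (!seen.add(messageId)) {
            return false;
        }
        mutate.run();
        applied.incrementAndGet();
        return true;
    }

    int appliedCount() {
        return applied.get();
    }
}
```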

> However I do agree that right now there is too much burden on the user for
> the 95% case.  We should make this simpler.

Yeah, from a user perspective, I think I'd ask these questions of the Blur API:
o) What "streams" are available? (e.g. twitter, jms, kafka)
o) Create an instance of a stream (e.g.
client.createStreamTable(type:twitter, name:apache))
   o) Add stream-specific arguments to the table (e.g. twitter search criteria)
o) Add one or more Filters to a stream table, either to drop a column
(e.g. the default twitter stream might index 'mentions' but the user
might add a Filter to drop that column) or to drop whole messages.
o) Start/Stop the stream.
o) Get metrics on the stream's activity.
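
To make that list a bit more concrete, here is a sketch of the shape I
am picturing. None of this exists in Blur: StreamTable, StreamFilter
and every method on them are hypothetical, and offer() just stands in
for the stream delivering a message.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical: a message is a column -> value map. A filter may edit
// the message, or return Optional.empty() to drop it entirely.
interface StreamFilter {
    Optional<Map<String, String>> apply(Map<String, String> message);
}

// Hypothetical stream table covering args, filters, start/stop, metrics.
final class StreamTable {
    final String type;
    final String name;
    private final Map<String, String> args = new HashMap<>();
    private final List<StreamFilter> filters = new ArrayList<>();
    private final List<Map<String, String>> indexed = new ArrayList<>();
    private boolean running;
    private long received;
    private long dropped;

    StreamTable(String type, String name) {
        this.type = type;
        this.name = name;
    }

    StreamTable arg(String key, String value) { args.put(key, value); return this; }
    StreamTable addFilter(StreamFilter filter) { filters.add(filter); return this; }
    void start() { running = true; }
    void stop() { running = false; }

    // Runs a copy of the message through the filters in order; messages
    // arriving while the stream is stopped are ignored.
    void offer(Map<String, String> message) {
        if (!running) {
            return;
        }
        received++;
        Map<String, String> current = new HashMap<>(message);
        for (StreamFilter filter : filters) {
            Optional<Map<String, String>> next = filter.apply(current);
            if (!next.isPresent()) {
                dropped++;
                return;
            }
            current = next.get();
        }
        indexed.add(current);
    }

    Map<String, String> args() { return args; }
    List<Map<String, String>> indexed() { return indexed; }
    long received() { return received; }
    long dropped() { return dropped; }
}
```

Usage would then read roughly like
new StreamTable("twitter", "apache").arg(...).addFilter(...).start().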


> On Thu, Feb 27, 2014 at 10:07 AM, Tim Williams <> wrote:
>> I've been playing around with the new QueueReader stuff and I'm
>> starting to believe it's at the wrong level of abstraction - in the
>> shard context - for a user.
>> Between having to know about the BlurPartioner and handling all the
>> failure nuances, I'm thinking a much friendlier approach would be to
>> have the client implement a single message pump that Blur takes from
>> and handles.
>> Maybe on startup the Controllers compete for the lead QueueReader
>> position, create it from the TableContext and run with it?  The user
>> would still need to deal with Controller failures but that seems
>> easier to reason about than shard failures.
>> The way it's crafted right now, the user seems burdened with a lot of
>> the hard problems that Blur otherwise solves.  Obviously, it trades
>> off a high burden for one of the controllers.
>> Thoughts?
>> --tim
