accumulo-user mailing list archives

From Thai Ngo <>
Subject Re: Trigger for Accumulo table
Date Mon, 07 Dec 2015 03:45:50 GMT
Hi Josh,

On Thu, Dec 3, 2015 at 10:50 AM, Josh Elser <> wrote:

> Hi Thai,
> There is no out-of-the-box feature provided with Accumulo that does what
> you're asking for. Accumulo doesn't provide any functionality to push
> notifications to other systems. You could potentially maintain other
> tables/columns in which you maintain the last time a row was updated, but
> the onus is on your "other services" to read the table to find out when a
> change occurred (which is probably not scalable at "real time").

You're absolutely right here. Reading the table to find out when and where
a change occurred is not a good way to go. Furthermore, introducing new
states into our current system (which is stateless at this moment) and
maintaining them is not a good idea either.
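
To illustrate the polling approach discussed above, here is a minimal sketch in plain Python. An in-memory dict stands in for the table; this is not the Accumulo client API, and the names `Table`, `Poller`, and `watermark` are hypothetical. It shows the two costs mentioned: every poll scans all rows, and the consumer has to keep a watermark, which is exactly the extra state a stateless system would want to avoid.

```python
from typing import Dict, List, Tuple


class Table:
    """In-memory stand-in for a table (hypothetical, not Accumulo)."""

    def __init__(self) -> None:
        # row id -> (value, logical time of the last update)
        self.rows: Dict[str, Tuple[str, int]] = {}
        self.clock = 0  # logical clock, incremented on every write

    def put(self, row_id: str, value: str) -> None:
        self.clock += 1
        self.rows[row_id] = (value, self.clock)


class Poller:
    """Consumer that finds changes by rescanning the table."""

    def __init__(self, table: Table) -> None:
        self.table = table
        self.watermark = 0  # state the consumer must keep between polls

    def poll(self) -> List[str]:
        # Full scan: cost grows with table size, not with the number of changes.
        changed = sorted(
            row_id
            for row_id, (_, updated_at) in self.table.rows.items()
            if updated_at > self.watermark
        )
        self.watermark = self.table.clock
        return changed
```

A consumer would call `poll()` on a timer; the shorter the interval, the closer to real time, but the more often the whole table is scanned.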

> There are other systems you could likely leverage to solve this, depending
> on the durability and scalability that your application needs.
> For a system "close" to Accumulo, you could take a look at Fluo [1] which
> is an implementation of Google's "Percolator" system. This is a system
> based on throughput rather than low-latency, so it may not be a good fit
> for your needs. There are probably other systems in the Apache ecosystem
> (Kafka, Storm, Flink or Spark Streaming maybe?) that could be helpful for
> your problem. I'm not enough of an expert on these to recommend one (nor do
> I think I understand your entire architecture well enough).

Good to hear about Fluo. I will look at it and see how it differs from
(HBase) Coprocessors.

I do use Kafka in the current system, but I do not think I need it for
this purpose because i) it is probably overkill and ii) I do not want to
move (changed) data back and forth.

BTW, my current approach uses ZooKeeper, and I am having fun with it.

Thanks for your time.


> Thai Ngo wrote:
>> Hi list,
>> I have a use case where existing rows in a table will be updated by an
>> internal service. Data in a row of this table is composed of 2 parts:
>> the 1st part is immutable and the 2nd will be updated (filled in) a
>> little later.
>> Currently, I need to know when and which rows are updated in the table
>> so that other services can start consuming the data at the right time.
>> This matters most when I need to consume the data in near real time. So
>> developing a notification function, or more simply a trigger, is what I
>> really want to do now.
>> I am curious to know whether someone has done a similar job or whether
>> there are features, APIs or best practices available for Accumulo so
>> far. I'm thinking of letting the internal service which updates the data
>> notify us whenever it updates the data.
>> What do you think?
>> Thanks,
>> Thai
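
The idea at the end of the original question (the updating service itself notifies consumers) can be sketched as follows. This is only an in-process illustration in plain Python under stated assumptions: `NotifyingUpdater`, `subscribe`, `insert`, and `fill_in` are hypothetical names, a dict stands in for the table, and a real deployment would need some transport between services that the thread does not specify.

```python
from typing import Callable, Dict, List


class NotifyingUpdater:
    """Hypothetical internal service that writes rows and announces updates."""

    def __init__(self) -> None:
        # row id -> parts of the row; a dict stands in for the real table
        self.rows: Dict[str, Dict[str, str]] = {}
        self.listeners: List[Callable[[str], None]] = []

    def subscribe(self, listener: Callable[[str], None]) -> None:
        # A consumer registers a callback that receives the updated row id.
        self.listeners.append(listener)

    def insert(self, row_id: str, immutable_part: str) -> None:
        # First write: only the immutable part exists, so nobody is notified.
        self.rows[row_id] = {"immutable": immutable_part}

    def fill_in(self, row_id: str, late_part: str) -> None:
        # Second write: store the late part, then tell every consumer which row changed.
        self.rows[row_id]["late"] = late_part
        for listener in self.listeners:
            listener(row_id)


# Usage: the consumer keeps no watermark and never scans the table.
updater = NotifyingUpdater()
ready_rows: List[str] = []
updater.subscribe(ready_rows.append)
updater.insert("row1", "fixed fields")
updater.fill_in("row1", "late fields")
```

Because the writer already knows which row it touched, consumers receive only row ids and can read the row when they need it, so no changed data is copied back and forth.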
