accumulo-user mailing list archives

From Christopher <ctubb...@apache.org>
Subject Re: Trigger for Accumulo table
Date Tue, 08 Dec 2015 21:18:50 GMT
Look at org.apache.accumulo.core.constraints.Constraint for a description
of the interface and
org.apache.accumulo.core.constraints.DefaultKeySizeConstraint for an
example implementation.

In short, Mutations which are live-ingested into a tablet server are
validated against constraints you specify on the table. That means that all
Mutations written to a table go through this bit of user-provided code at
least once. You could use that fact to your advantage. However, this would
be highly experimental and might have some caveats to consider.

You can configure a constraint on a table with
connector.tableOperations().addConstraint(...)
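
One thing to keep in mind if you try this: the constraint runs inside the tablet server's write path, so its check method should return quickly and leave the actual notification to something else. Below is a minimal sketch of a hand-off buffer for that purpose. RowUpdateQueue is a hypothetical helper, not part of Accumulo, and it assumes row IDs are UTF-8 text. The comment at the top describes how a Constraint implementation might call it; that wiring is an assumption, not a tested recipe.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical helper (not an Accumulo class).
// Intended use, as an assumption: a Constraint's check(env, mutation) calls
// offer(mutation.getRow()) and then reports no violations, so the write
// itself is never rejected or delayed by the notification.
final class RowUpdateQueue {
    private final BlockingQueue<String> pending;

    RowUpdateQueue(int capacity) {
        this.pending = new ArrayBlockingQueue<>(capacity);
    }

    // Never blocks. Returns false (and drops the row ID) when the buffer is full.
    boolean offer(byte[] row) {
        return pending.offer(new String(row, StandardCharsets.UTF_8));
    }

    // Called from a separate notifier thread: removes and returns everything
    // queued so far, in arrival order.
    List<String> drain() {
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch);
        return batch;
    }
}
```

Dropping row IDs when the buffer is full trades completeness for ingest speed, so consumers would still need a fallback (such as an occasional scan) if they cannot miss updates.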

On Sun, Dec 6, 2015 at 10:49 PM Thai Ngo <baothaingo@gmail.com> wrote:

> Christopher,
>
> This is interesting! Could you please give me more details about this?
>
> Thanks,
> Thai
>
> On Thu, Dec 3, 2015 at 12:17 PM, Christopher <ctubbsii@apache.org> wrote:
>
>> You could also implement a constraint to notify an external system when a
>> row is updated.
>>
>> On Wed, Dec 2, 2015, 22:54 Josh Elser <josh.elser@gmail.com> wrote:
>>
>>> oops :)
>>>
>>> [1] http://fluo.io/
>>>
>>> Josh Elser wrote:
>>> > Hi Thai,
>>> >
>>> > There is no out-of-the-box feature provided with Accumulo that does what
>>> > you're asking for. Accumulo doesn't provide any functionality to push
>>> > notifications to other systems. You could potentially maintain other
>>> > tables/columns in which you maintain the last time a row was updated,
>>> > but the onus is on your "other services" to read the table to find out
>>> > when a change occurred (which is probably not scalable at "real time").
>>> > There are other systems you could likely leverage to solve this,
>>> > depending on the durability and scalability that your application needs.
>>> >
>>> > For a system "close" to Accumulo, you could take a look at Fluo [1],
>>> > which is an implementation of Google's "Percolator" system. This is a
>>> > system based on throughput rather than low latency, so it may not be a
>>> > good fit for your needs. There are probably other systems in the Apache
>>> > ecosystem (Kafka, Storm, Flink or Spark Streaming maybe?) that would be
>>> > helpful for your problem. I'm not enough of an expert on these to
>>> > recommend one (nor do I think I understand your entire architecture
>>> > well enough).
>>> >
>>> > Thai Ngo wrote:
>>> >> Hi list,
>>> >>
>>> >> I have a use case where existing rows in a table will be updated by an
>>> >> internal service. The data in each row of this table has two parts:
>>> >> the first is immutable and the second will be updated (filled in) a
>>> >> little later.
>>> >>
>>> >> Currently, I need to know when and which rows are updated in the
>>> >> table so that other services can start consuming the data at the right
>>> >> time. This matters most when I need to consume the data in near
>>> >> realtime. So developing a notification function or, more simply, a
>>> >> trigger is what I really want to do now.
>>> >>
>>> >> I am curious to know whether someone has done something similar, or
>>> >> whether there are features, APIs or best practices available for
>>> >> Accumulo so far. I'm thinking of letting the internal service which
>>> >> updates the data notify us whenever it updates the data.
>>> >>
>>> >> What do you think?
>>> >>
>>> >> Thanks,
>>> >> Thai
>>>
>>
>
