accumulo-dev mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Fwd: Re: Trigger for Accumulo table
Date Wed, 09 Dec 2015 01:38:36 GMT
(moving future discussion on listener hooks to dev@a.a.o)

We should take a look at HBase. They have an Observer API that runs 
server side which might serve as a good starting point. IIRC, it's 
designed for implementing this kind of functionality.

-------- Original Message --------
Subject: 	Re: Trigger for Accumulo table
Date: 	Tue, 8 Dec 2015 18:28:17 -0500
From: 	Adam Fuchs <afuchs@apache.org>
Reply-To: 	user@accumulo.apache.org
To: 	user@accumulo.apache.org



I totally agree, Christopher. I have also run into a few situations
where it would have been nice to have something like a mutation listener
hook. Particularly in generating indexing and stats records.

Adam


On Tue, Dec 8, 2015 at 5:59 PM, Christopher <ctubbsii@apache.org
<mailto:ctubbsii@apache.org>> wrote:

     In the future, it might be useful to provide a supported API hook
     here. It certainly would've made implementing replication easier,
     but could also be useful as a notification system.

     On Tue, Dec 8, 2015 at 4:51 PM Keith Turner <keith@deenlo.com
     <mailto:keith@deenlo.com>> wrote:

         Constraints are checked before data is written.  In the case of
         failures a constraint may see data that's never successfully
         written.

         On Tue, Dec 8, 2015 at 4:18 PM, Christopher <ctubbsii@apache.org
         <mailto:ctubbsii@apache.org>> wrote:

             Look at org.apache.accumulo.core.constraints.Constraint for
             a description and
             org.apache.accumulo.core.constraints.DefaultKeySizeConstraint
             as an example.

             In short, Mutations which are live-ingested into a tablet
             server are validated against constraints you specify on the
             table. That means that all Mutations written to a table go
             through this bit of user-provided code at least once. You
             could use that fact to your advantage. However, this would
             be highly experimental and might have some caveats to consider.

             You can configure a constraint on a table with
             connector.tableOperations().addConstraint(...)
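
A minimal sketch of the idea, assuming nothing about Accumulo's internals: the classes below (Check, NotifyingCheck, ToyTable) are hypothetical stand-ins written against the Java standard library only, not the real org.apache.accumulo.core.constraints.Constraint interface or a real tablet server. It models a check that runs before the write and notifies a listener as a side effect, and it shows why such a check can report a row whose write later fails.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class ConstraintHookSketch {

    /** Hypothetical stand-in for a table constraint: returns violation codes, empty if valid. */
    interface Check {
        List<Short> check(String row);
    }

    /** Constraint-style check that never rejects anything but reports every row it sees. */
    static class NotifyingCheck implements Check {
        private final Consumer<String> sink;

        NotifyingCheck(Consumer<String> sink) {
            this.sink = sink;
        }

        @Override
        public List<Short> check(String row) {
            sink.accept(row);          // side effect: notify before the write happens
            return new ArrayList<>();  // no violations, so the write is allowed to proceed
        }
    }

    /** Toy table: runs the check first, then attempts the write, which may fail. */
    static class ToyTable {
        final List<String> written = new ArrayList<>();
        private final Check check;

        ToyTable(Check check) {
            this.check = check;
        }

        boolean apply(String row, boolean simulateWriteFailure) {
            if (!check.check(row).isEmpty()) {
                return false;          // rejected by the check
            }
            if (simulateWriteFailure) {
                return false;          // check passed, but the write itself failed
            }
            written.add(row);
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> notified = new ArrayList<>();
        ToyTable table = new ToyTable(new NotifyingCheck(notified::add));

        table.apply("row1", false);    // normal write
        table.apply("row2", true);     // write fails after the check has run

        // prints: notified=[row1, row2] written=[row1]
        System.out.println("notified=" + notified + " written=" + table.written);
    }
}
```

In this model the listener hears about row2 even though it was never written, so a consumer of such notifications would need to treat them as hints and confirm against the table.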


             On Sun, Dec 6, 2015 at 10:49 PM Thai Ngo
             <baothaingo@gmail.com <mailto:baothaingo@gmail.com>> wrote:

                 Christopher,

                 This is interesting! Could you please give me more
                 details about this?

                 Thanks,
                 Thai

                 On Thu, Dec 3, 2015 at 12:17 PM, Christopher
                 <ctubbsii@apache.org <mailto:ctubbsii@apache.org>> wrote:

                     You could also implement a constraint to notify an
                     external system when a row is updated.


                     On Wed, Dec 2, 2015, 22:54 Josh Elser
                     <josh.elser@gmail.com <mailto:josh.elser@gmail.com>>
                     wrote:

                         oops :)

                         [1] http://fluo.io/

                         Josh Elser wrote:
                          > Hi Thai,
                          >
                          > There is no out-of-the-box feature provided with Accumulo that does what
                          > you're asking for. Accumulo doesn't provide any functionality to push
                          > notifications to other systems. You could potentially maintain other
                          > tables/columns in which you maintain the last time a row was updated,
                          > but the onus is on your "other services" to read the table to find out
                          > when a change occurred (which is probably not scalable at "real time").
                          >
                          > There are other systems you could likely leverage to solve this,
                          > depending on the durability and scalability that your application needs.
                          >
                          > For a system "close" to Accumulo, you could take a look at Fluo [1]
                          > which is an implementation of Google's "Percolator" system. This is a
                          > system based on throughput rather than low-latency, so it may not be a
                          > good fit for your needs. There are probably other systems in the Apache
                          > ecosystem (Kafka, Storm, Flink or Spark Streaming maybe?) that would be
                          > helpful to your problem. I'm not enough of an expert on these to
                          > recommend one (nor do I think I understand your entire architecture
                          > well enough).
                          >
                          > Thai Ngo wrote:
                          >> Hi list,
                          >>
                          >> I have a use case where existing rows in a table will be updated by an
                          >> internal service. Data in a row of this table is composed of 2 parts:
                          >> 1st part - immutable and the 2nd one - will be updated (filled in) a
                          >> little later.
                          >>
                          >> Currently, I have a need of knowing when and which rows will be updated
                          >> in the table so that other services can wisely start consuming the
                          >> data. It will make more sense when I need to consume the data in near
                          >> realtime. So developing a notification function or simpler - a trigger
                          >> is what I really want to do now.
                          >>
                          >> I am curious to know if someone has done a similar job or whether there
                          >> are features or APIs or best practices available for Accumulo so far. I'm
                          >> thinking of letting the internal service which updates the data notify
                          >> us whenever it updates the data.
                          >>
                          >> What do you think?
                          >>
                          >> Thanks,
                          >> Thai




