Date: Thu, 24 Mar 2016 22:17:25 +0000 (UTC)
From: "Paul Wilkinson (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-3434) ability to increment a counter without reading original value from storage
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm

    [ https://issues.apache.org/jira/browse/HBASE-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15211053#comment-15211053 ]

Paul Wilkinson commented on HBASE-3434:
---------------------------------------

Hey folks, happy to take this on. The current prototype code (based on co-processors) is at https://github.com/paulmw/hbase-aggregation/tree/master/src/main/java/aggregation/coprocessor

It's a work in progress for sure, but most of the ideas are in there.
It aggregates data both during flushes and compactions, as well as during gets and scans. So counters are implemented simply by adding the co-processor and performing puts. It's very much not limited to summation though, as you can plug in a custom value aggregation function (by implementing https://github.com/paulmw/hbase-aggregation/blob/master/src/main/java/aggregation/coprocessor/ValueAccumulator.java).

The decision on which cells to aggregate is also pluggable - the default is versions of the same cell (https://github.com/paulmw/hbase-aggregation/blob/master/src/main/java/aggregation/coprocessor/DefaultCellAccumulator.java, which implements CellAccumulator) but it's easy to imagine the kind of multi-level rollup you often get in time series - keeping 1 minute granularity for today, 10 minute granularity for the previous 6 days, hourly beyond that, etc. So long as those values are all consecutive in KV terms, that's still possible in a stateless fashion.

What's missing as yet is a design for how aggregation functions are registered - happy to take direction there. It's also possible it could become more supported in HBase itself, rather than in client land. Again, happy to take direction from folks here. It's certain though that there's a need to retain the custom aggregation part of this, rather than just doing a better version of counters.

> ability to increment a counter without reading original value from storage
> --------------------------------------------------------------------------
>
>                 Key: HBASE-3434
>                 URL: https://issues.apache.org/jira/browse/HBASE-3434
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client, regionserver
>            Reporter: dhruba borthakur
>            Assignee: stack
>              Labels: gsoc2016, mentor
>
> There are a bunch of applications that do read-modify-write operations on HBase constructs, e.g. a counter; the counter value has to be read in from hdfs before it can be incremented.
> We have an application where the number of increments on a counter far outnumbers the number of times the counter is used or read. For these types of applications, it will be very beneficial to not have to read in the counter from disk before it can be incremented.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
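
To make the write-only counter idea from the comment concrete, here is a minimal plain-Java sketch. It is not the code from the linked repository and uses no HBase APIs; the class DeltaCounterSketch and its methods are hypothetical names chosen for illustration. Each put appends a delta as a new version without reading the old value, a read folds the versions through a pluggable function, and compact() stands in for what a flush or compaction would do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalLong;
import java.util.function.LongBinaryOperator;

// Hypothetical illustration only: models the versions of a single cell in memory.
public class DeltaCounterSketch {
    // Each element is one stored version (a delta, or a previously folded value).
    private final List<Long> versions = new ArrayList<>();
    // Pluggable aggregation function (assumed to be associative).
    private final LongBinaryOperator aggregator;

    public DeltaCounterSketch(LongBinaryOperator aggregator) {
        this.aggregator = aggregator;
    }

    // Write path: append a new version; the existing value is never read.
    public void put(long delta) {
        versions.add(delta);
    }

    // Read path: fold all stored versions; empty if nothing was ever written.
    public OptionalLong get() {
        return versions.stream().mapToLong(Long::longValue).reduce(aggregator);
    }

    // Stand-in for flush/compaction: replace all versions with one folded value.
    public void compact() {
        OptionalLong folded = get();
        versions.clear();
        if (folded.isPresent()) {
            versions.add(folded.getAsLong());
        }
    }

    public int storedVersions() {
        return versions.size();
    }

    public static void main(String[] args) {
        DeltaCounterSketch counter = new DeltaCounterSketch(Long::sum);
        counter.put(1);
        counter.put(5);
        counter.put(-2);
        System.out.println("value before compaction: " + counter.get().getAsLong());
        counter.compact();
        System.out.println("versions after compaction: " + counter.storedVersions());
    }
}
```

Passing Math::max instead of Long::sum would track a running maximum, which mirrors the point that the aggregation need not be summation. Folding partial results at compaction time and again at read time only gives the same answer if the function is associative, so the sketch assumes that.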