hbase-user mailing list archives

From Ted Dunning <tdunn...@maprtech.com>
Subject Re: on the impact of incremental counters
Date Mon, 20 Jun 2011 15:50:31 GMT
Lazy increment on read makes the read expensive.  That might be a win
if the workload has lots of data that is never read.

This could be a good idea on average, because my impression is that increment
is usually used for metrics-style data, which is often only read in detail
in diagnostic post-mortem use cases.
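
To make that trade-off concrete, here is a minimal, self-contained Java sketch
(plain collections, not HBase code; the class and method names are made up for
illustration) of the "increment record" idea Joey describes below: writes become
blind appends of deltas, while reads, or a compaction pass, pay to merge them,
which is why it can be a win for write-mostly metric data.

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class LazyCounterSketch {

        // Eager: each increment is a read-modify-write of the current value.
        static final class EagerCounter {
            private final Map<String, Long> cells = new HashMap<>();

            void increment(String row, long delta) {
                // Must read the old value first; costly if it is not cached in RAM.
                cells.merge(row, delta, Long::sum);
            }

            long get(String row) {
                return cells.getOrDefault(row, 0L); // cheap read
            }
        }

        // Lazy: increments are blind appends; the read pays to merge the deltas.
        static final class LazyCounter {
            private final Map<String, List<Long>> deltas = new HashMap<>();

            void increment(String row, long delta) {
                // No read needed: just record the delta (cheap write).
                deltas.computeIfAbsent(row, r -> new ArrayList<>()).add(delta);
            }

            long get(String row) {
                // Expensive read: sum every outstanding increment record.
                return deltas.getOrDefault(row, List.of())
                             .stream().mapToLong(Long::longValue).sum();
            }

            // "Compaction": collapse the delta list to a single merged value.
            void compact(String row) {
                long total = get(row);
                deltas.put(row, new ArrayList<>(List.of(total)));
            }
        }

        public static void main(String[] args) {
            EagerCounter eager = new EagerCounter();
            LazyCounter lazy = new LazyCounter();
            for (int i = 0; i < 1000; i++) {
                eager.increment("pageviews", 1); // pays a read on every write
                lazy.increment("pageviews", 1);  // blind append, cheap write
            }
            lazy.compact("pageviews");           // amortize the merge cost
            System.out.println(eager.get("pageviews")); // 1000
            System.out.println(lazy.get("pageviews"));  // 1000
        }
    }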

On Mon, Jun 20, 2011 at 3:23 PM, Joey Echeverria <joey@cloudera.com> wrote:

> Is there any reason why the increment has to actually happen on
> insert? Couldn't an "increment record" be kept, and then the actual
> increment operation be performed lazily, on reads and compactions?
>
> -Joey
>
> On Mon, Jun 20, 2011 at 11:14 AM, Andrew Purtell <apurtell@apache.org>
> wrote:
> >> From: Claudio Martella <claudio.martella@tis.bz.it>
> >> So, basically it's expensive to increment old data.
> >
> > HBase employs a buffer hierarchy to make updating a working set that can
> fit in RAM reasonably efficient. (But like I said there are some things
> remaining we can improve in terms of internal data structure management.)
> >
> > If you are updating a working set that does not fit in RAM, or updating it
> infrequently enough that the values are not kept in cache, then HBase has
> to go to disk and we move from the order of memory access to the order of
> disk access.
> >
> > It will obviously be more expensive to increment old data than new data,
> but I'm not sure I understand what you are getting at. Any data management
> system with a buffer hierarchy has this behavior.
> >
> > Compared to what?
> >
> >   - Andy
> >
> >
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
>
