hbase-user mailing list archives

From David Koch <ogd...@googlemail.com>
Subject Using timestamps as "transaction ids" for idempotent counters.
Date Fri, 24 Aug 2012 14:47:02 GMT
Hello,

I use a table for counting stuff and want to do updates by pushing
increments rather than get -> add in application -> put.

To ensure idempotence (i.e., to avoid over-counting) I thought about (mis-)using
a cell's timestamp as a kind of <transaction id>. This transaction id would
be a strictly increasing number defined by the application writing the
increments, so let's call it <external_tmst>. I am looking for a call like:

incrementColumnValue(<row>, <colFam>, <counter_name>, <inc_value>,
<external_tmst>) // the normal signature does not have the last argument

which applies the <inc_value> ONLY IF <external_tmst> is larger than the
timestamp of the cell's most recent version (== last transaction id). This
way, if the external application attempts to re-insert the same data multiple
times, no change would take place.
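
To make the intended semantics concrete, here is a minimal in-memory sketch in Java. It is not an HBase API and does not talk to HBase at all; the class IdempotentCounterSketch, its increment method and the Cell holder are hypothetical names used only to illustrate the "apply only if <external_tmst> is newer" rule described above.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * In-memory model of the proposed conditional increment.
 * Hypothetical illustration only; this is NOT part of the HBase client.
 */
public class IdempotentCounterSketch {

    /** One counter cell: its value plus the last applied transaction id. */
    private static final class Cell {
        long value = 0L;
        long lastTmst = Long.MIN_VALUE; // nothing applied yet
    }

    // Keyed by a string standing in for <row>/<colFam>:<counter_name>.
    private final Map<String, Cell> cells = new HashMap<String, Cell>();

    /**
     * Adds incValue only if externalTmst is strictly larger than the
     * transaction id stored with the cell. Returns the counter value
     * after the call, whether or not the increment was applied.
     */
    public synchronized long increment(String key, long incValue, long externalTmst) {
        Cell cell = cells.get(key);
        if (cell == null) {
            cell = new Cell();
            cells.put(key, cell);
        }
        if (externalTmst > cell.lastTmst) {
            cell.value += incValue;
            cell.lastTmst = externalTmst; // remember the last transaction id
        }
        return cell.value;
    }
}
```

With this model, increment("row1/cf:hits", 5, 100L) returns 5, repeating the same call returns 5 again because the replay is ignored, and increment("row1/cf:hits", 3, 101L) returns 8. Note that an increment arriving late with a smaller <external_tmst> than one already applied would also be dropped, so the scheme relies on the writer really delivering transaction ids in increasing order per counter.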

My questions are:
1. Is this a good idea to begin with?
2. Does the HBase client offer this kind of functionality? If not, is it
planned, or can it be implemented?

It appears that coprocessors are able to handle this kind of logic, but I
think I will be stuck with 0.90.6 for a while. I also heard about HBaseHUT
(https://github.com/sematext/HBaseHUT), but I am not sure it addresses the
issue of having idempotent counters.

Thank you,

/David
