accumulo-dev mailing list archives

From Dylan Hutchison <dhutc...@uw.edu>
Subject Re: another question on summing combiner
Date Thu, 22 Oct 2015 22:34:33 GMT
Hi Z,

It seems you have a fairly common use case: performing an update if and
only if a certain row does or does not exist.  Here's another option you
could try, and if it works (or doesn't work), please let us know!  Of
course, if you're comfortable with the batch solution, that is fine too.

*Adding an item x*
Conditionally write x to your main table, asserting that x does not exist.
+ If x is written (indicating that x did not previously exist in the main
table), then write a 1 to the stats table unconditionally.  ** Bloom
filters on the main table can speed this path up, since the existence check
for an absent row can skip files that cannot contain it.  It sounds like
this is the more common path for you.
+ If x fails to write (indicating that x already existed in the main
table), then do not write to the stats table.

*Deleting an item x*
Conditionally delete x from your main table, asserting that x does exist.
+ If x is deleted, then write a -1 to your stats table unconditionally.
+ If the delete fails (indicating that x did not exist in the main table),
then do not write to your stats table.  ** Bloom filters may speed this
path up.
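
To make the two paths above concrete, here is a minimal in-memory sketch in
Python.  It is not Accumulo code: MainTable, StatsTable, add_item,
delete_item and the stat_key argument are all hypothetical names made up for
illustration.  In Accumulo itself the check-and-write step would be done
with conditional mutations (ConditionalWriter), and the summing would be
done by the combiner on the stats table.

```python
import threading


class MainTable:
    """In-memory stand-in for the main table (hypothetical, not Accumulo)."""

    def __init__(self):
        self._rows = set()
        self._lock = threading.Lock()  # models an atomic check-and-write

    def write_if_absent(self, row):
        """Write row only if it does not exist; return True if written."""
        with self._lock:
            if row in self._rows:
                return False
            self._rows.add(row)
            return True

    def delete_if_present(self, row):
        """Delete row only if it exists; return True if deleted."""
        with self._lock:
            if row not in self._rows:
                return False
            self._rows.remove(row)
            return True


class StatsTable:
    """Stand-in for a table with a summing combiner: writes are summed per key."""

    def __init__(self):
        self._sums = {}
        self._lock = threading.Lock()

    def write(self, key, delta):
        with self._lock:
            self._sums[key] = self._sums.get(key, 0) + delta

    def read(self, key):
        with self._lock:
            return self._sums.get(key, 0)


def add_item(main, stats, row, stat_key):
    """Adding x: write +1 to the stats only if the conditional write succeeds."""
    if main.write_if_absent(row):
        stats.write(stat_key, 1)
        return True
    return False


def delete_item(main, stats, row, stat_key):
    """Deleting x: write -1 to the stats only if the conditional delete succeeds."""
    if main.delete_if_present(row):
        stats.write(stat_key, -1)
        return True
    return False


main, stats = MainTable(), StatsTable()
add_item(main, stats, "x", "foo")     # x is new, so foo is incremented
add_item(main, stats, "x", "foo")     # duplicate add, stats untouched
delete_item(main, stats, "y", "foo")  # y does not exist, stats untouched
print(stats.read("foo"))
```

The point of the sketch is that the stats write only happens after the
conditional operation reports success, so repeated adds or deletes of the
same item cannot skew the count.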

Regards, Dylan


On Tue, Oct 20, 2015 at 7:33 AM, z11373 <z11373@outlook.com> wrote:

> Thanks Josh! I decided to leave the stats using the normal combiner for
> now; the stats skew may not be that bad if it does happen.
> In the future, I am thinking of having a batch job that will update the
> stats correctly. It will be time intensive, but that should be OK since
> it'll likely run only once a day.
> Back to the previous example below.
>
> Current stats table contains:
> foo     | 2
> bar     | 3
> test    | 1
>
> The batch job scans the main table and then updates the stats table. Let's
> say the actual stats are foo=1, bar=4, test=1. It will first read the
> existing stats values above and then calculate the correction needed for
> each key, so it will just write these updates to the stats table:
> foo     | -1
> bar     | 1
>
> After this operation, the values in the stats table will end up correct :-)
> foo     | 1
> bar     | 4
> test    | 1
>
>
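
The batch correction described in the quoted message can be sketched in
Python as follows.  This is only an illustration of the arithmetic, using
the numbers from the example; correction_deltas and apply_deltas are
hypothetical helper names, and plain dicts stand in for the scan results and
for the summing combiner.

```python
def correction_deltas(current, actual):
    """Return the per-key deltas to write so that current + delta == actual.

    Keys whose value is already correct are omitted, since no write is needed.
    """
    deltas = {}
    for key in sorted(set(current) | set(actual)):
        delta = actual.get(key, 0) - current.get(key, 0)
        if delta != 0:
            deltas[key] = delta
    return deltas


def apply_deltas(current, deltas):
    """Model the summing combiner: each written delta is added to its key."""
    result = dict(current)
    for key, delta in deltas.items():
        result[key] = result.get(key, 0) + delta
    return result


current = {"foo": 2, "bar": 3, "test": 1}  # stats table before the batch job
actual = {"foo": 1, "bar": 4, "test": 1}   # counts from scanning the main table

deltas = correction_deltas(current, actual)
print(deltas)
print(apply_deltas(current, deltas))
```

Note that this assumes no other writes reach the stats table between reading
the current values and writing the deltas; otherwise the correction itself
could be off.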
