accumulo-dev mailing list archives

From z11373 <>
Subject another question on summing combiner
Date Mon, 21 Sep 2015 15:35:57 GMT
Last time I posted about using summing combiners to build the stats table.
For example, when adding an item to the main table, it'd insert that item with
value 1 into the stats table, which has the summing combiner attached. Same for
deleting from the main table: it'd insert with value -1.
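
To make the bookkeeping concrete, here is a minimal sketch in plain Java. It
does not use the Accumulo API: the class StatsTableSketch, its TreeMap, and the
method names are hypothetical stand-ins for a table with a summing combiner
attached, where every value written under the same key is simply added up.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a stats table with a summing combiner.
// Assumption: this only models "all values written under one key are summed";
// it is not how Accumulo stores or compacts data.
class StatsTableSketch {
    private final Map<String, Long> sums = new TreeMap<>();

    // The caller writes a delta without reading the old value first;
    // merge(...) plays the role of the combiner here.
    void write(String statKey, long delta) {
        sums.merge(statKey, delta, Long::sum);
    }

    // Reading returns the combined (summed) value, or 0 if nothing was written.
    long read(String statKey) {
        return sums.getOrDefault(statKey, 0L);
    }

    // Issued alongside an insert into the main table.
    void onMainTableInsert(String statKey) {
        write(statKey, 1L);
    }

    // Issued alongside a delete from the main table.
    void onMainTableDelete(String statKey) {
        write(statKey, -1L);
    }
}
```

With two inserts and one delete recorded under the same stat key, read(...)
returns 1.
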
This works fine, except for cases like:
1. inserting an item that already exists (i.e. same key) in the main table
2. deleting an item that doesn't exist in the main table

Either case above will unfortunately cause the data in the stats table to become
inaccurate. The stats data doesn't need to be precise (it's more for the
optimizer in our app to get a rough idea of the total items), but if either or
both cases happen a lot, it may screw up the optimizer.
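
The drift can be reproduced with a small model, again in plain Java rather than
against Accumulo. The class StatsDriftSketch is hypothetical: a HashSet stands
in for the main table and a long for the summed stat, and every write applies
its +1 or -1 blindly, as in the scheme above.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the drift: the main table is a set of keys, and the
// stats value is a counter that receives a blind +1 / -1 on every operation.
class StatsDriftSketch {
    private final Set<String> mainTable = new HashSet<>();
    private long statsCount = 0;

    void insert(String key) {
        mainTable.add(key);    // re-inserting an existing key leaves the set unchanged
        statsCount += 1;       // but the +1 is still applied
    }

    void delete(String key) {
        mainTable.remove(key); // removing a missing key is a no-op
        statsCount -= 1;       // but the -1 is still applied
    }

    long statsCount() {
        return statsCount;
    }

    int actualCount() {
        return mainTable.size();
    }
}
```

Inserting the same key twice leaves actualCount() at 1 while statsCount()
reports 2; deleting a key that was never inserted pushes statsCount() down by
one without changing actualCount().
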
I can think of 2 options to take care of this problem:
1. Check the existing data before insert/delete. This will incur a performance
cost, which defeats one of the summing combiner's benefits: no need to
check existing data, just let the combiner do its job
2. Have a job to 'fix' the stats by recalculating everything (i.e. read from
the main table and rebuild the stats table). This is expensive, but it can be
run once a day, so it may not be a terrible idea
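
Both options can be sketched side by side on the same kind of in-memory model.
Everything here is hypothetical (the class StatsRepairSketch and its method
names), and the HashSet hides the real cost of option 1: in an actual table the
existence check would be a separate lookup before each write.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model comparing the two options. Not the Accumulo API.
class StatsRepairSketch {
    private final Set<String> mainTable = new HashSet<>();
    private long statsCount = 0;

    // Current scheme: blind +1 / -1, which can drift.
    void blindInsert(String key) {
        mainTable.add(key);
        statsCount += 1;
    }

    void blindDelete(String key) {
        mainTable.remove(key);
        statsCount -= 1;
    }

    // Option 1: only adjust the stat when the main table really changed.
    // Set.add / Set.remove return true only if the set was modified, which
    // stands in for "check the existing data before insert/delete".
    void checkedInsert(String key) {
        if (mainTable.add(key)) {
            statsCount += 1;
        }
    }

    void checkedDelete(String key) {
        if (mainTable.remove(key)) {
            statsCount -= 1;
        }
    }

    // Option 2: periodic repair job that recomputes the stat from the main table.
    void rebuildStats() {
        statsCount = mainTable.size();
    }

    long statsCount() {
        return statsCount;
    }
}
```

In this model, two blind inserts of the same key leave the stat at 2 until
rebuildStats() resets it to 1, while the checked variants never let it drift in
the first place.
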

Let me know if any of you have a better solution than these.

