accumulo-dev mailing list archives

From z11373
Subject delete rows test result
Date Mon, 16 Nov 2015 15:35:11 GMT
Last week, in a separate thread, it was suggested that I use
tableOperations.deleteRows to delete rows that match specific ranges. I was
curious whether it performs better than my current implementation, which
iterates over all rows and calls putDelete for each one.
While researching, I also found that Accumulo already provides BatchDeleter,
which does the same thing.
I tried all three, and below are my test results against three different
tables (numbers are in milliseconds):

Test 1 (iterating over the rows and calling putDelete for each):
Table 1: 5,702
Table 2: 6,912
Table 3: 4,694

Test 2 (using the BatchDeleter class):
Table 1: 8,089
Table 2: 10,405
Table 3: 7,818

Test 3 (using tableOperations.deleteRows; note that I first iterate over all
rows just to get the last row id, which is then passed as an argument to
the function):
Table 1: 196,597
Table 2: 226,496
Table 3: 8,442
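
To put the numbers side by side, here is a minimal Java sketch that divides
each timing by the Test 1 time for the same table. The class and method names
(DeleteTimings, slowdown) are made up for illustration; only the timings come
from the results above.

```java
class DeleteTimings {
    // Timings in milliseconds, copied from the three tests above
    // (index 0 = Table 1, index 1 = Table 2, index 2 = Table 3).
    static final long[] PUT_DELETE = {5_702, 6_912, 4_694};
    static final long[] BATCH_DELETER = {8_089, 10_405, 7_818};
    static final long[] DELETE_ROWS = {196_597, 226_496, 8_442};

    // Ratio of a method's time to the Test 1 (putDelete) time for the same table.
    static double slowdown(long[] method, int table) {
        return (double) method[table] / PUT_DELETE[table];
    }

    public static void main(String[] args) {
        for (int t = 0; t < PUT_DELETE.length; t++) {
            System.out.printf("Table %d: BatchDeleter %.1fx, deleteRows %.1fx%n",
                    t + 1, slowdown(BATCH_DELETER, t), slowdown(DELETE_ROWS, t));
        }
    }
}
```

Run as is, it prints one line per table with the BatchDeleter and deleteRows
times expressed as multiples of the Test 1 time.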

I ran the tests a few times and got results consistent with the numbers above.
I didn't look at the code to see what deleteRows actually does, but judging by
my test results, it performs badly: roughly 33 to 34 times slower than Test 1
on Tables 1 and 2, and about 1.8 times slower on Table 3.
Note that for Test 3 I did a scan and iterated just to get the last row id,
but even if I subtract the time for that, it's still way too slow.
Therefore, I'd recommend that anyone avoid deleteRows for this scenario.
YMMV, but I'll stick with my original approach, which is the one used in
Test 1 above.
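
For anyone repeating Test 3: as far as I understand the
TableOperations.deleteRows Javadoc, the start row is exclusive and the end row
is inclusive, with null meaning the first or last row of the table. Below is a
small model of that range using a plain TreeMap. It is not Accumulo code and
makes no Accumulo calls; the class name, the sample row ids, and the helper
methods are all hypothetical and only illustrate which rows such a range
would cover.

```java
import java.util.TreeMap;

class DeleteRowsRangeModel {
    // Hypothetical stand-in for a table: sorted row id -> value.
    static TreeMap<String, String> sampleTable() {
        TreeMap<String, String> table = new TreeMap<>();
        for (int i = 1; i <= 5; i++) {
            table.put("row" + i, "value" + i);
        }
        return table;
    }

    // Removes rows with start < row <= end (start exclusive, end inclusive).
    // A null bound is treated as unbounded on that side, which is my reading
    // of the Javadoc, not something I verified in the Accumulo source.
    static void deleteRows(TreeMap<String, String> table, String start, String end) {
        if (start == null && end == null) {
            table.clear();
        } else if (start == null) {
            table.headMap(end, true).clear();
        } else if (end == null) {
            table.tailMap(start, false).clear();
        } else {
            table.subMap(start, false, end, true).clear();
        }
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = sampleTable();
        deleteRows(table, "row2", "row4");
        System.out.println(table.keySet()); // prints [row1, row2, row5]
    }
}
```

In this model the start row itself survives and the end row is removed, so
the end argument has to be the last row id you actually want deleted.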

