accumulo-dev mailing list archives

From z11373 <>
Subject Re: delete rows test result
Date Mon, 23 Nov 2015 16:43:36 GMT
Hi William,
I re-ran the same test calling deleteRows without scanning the table (so
it's only timing the deleteRows operation here), and you're right, it's
faster as shown in the result below.

Table 1: 3,301 
Table 2: 3,184 
Table 3: 2,635

It's definitely faster compared to the fastest result I got by scanning
the table and calling putDelete for each entry, shown below.

Table 1: 5,702 
Table 2: 6,912 
Table 3: 4,694

However, there is one case I didn't mention last time, in which the table
has a summing combiner installed. So even though it may appear to have 1M
rows, it can actually have as many as 10M rows or more, which may explain why
deleteRows takes longer. Still, something seems wrong looking at my test
result.

Test 1 (scanning with an iterator and calling putDelete for each entry):
Table 4 (with summing combiner): 11,081

Test 2 (calling deleteRows):
Table 4 (with summing combiner): 197,050
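To put the four tables side by side, here is a minimal sketch that computes the putDelete-to-deleteRows ratio for each one. The class and method names are hypothetical, the numbers are copied from the results in this message, and the units are not stated, so only the ratios are meaningful. A ratio above 1 means deleteRows was faster; below 1 means it was slower.

```java
public class DeleteTimingRatio {
    // Returns numerator / denominator as a double; both are raw timings.
    public static double ratio(long numerator, long denominator) {
        if (denominator <= 0) {
            throw new IllegalArgumentException("denominator must be positive");
        }
        return (double) numerator / denominator;
    }

    public static void main(String[] args) {
        // Timings copied from this message (units not stated in the original).
        long[] putDelete  = {5702, 6912, 4694, 11081};   // scan + putDelete
        long[] deleteRows = {3301, 3184, 2635, 197050};  // deleteRows only
        for (int i = 0; i < putDelete.length; i++) {
            double r = ratio(putDelete[i], deleteRows[i]);
            System.out.printf("Table %d: putDelete/deleteRows = %.2f%n", i + 1, r);
        }
    }
}
```

For Tables 1 to 3 the ratio comes out between roughly 1.7 and 2.2, while for Table 4 the inverse ratio (deleteRows over putDelete) is close to 18, which is the anomaly being asked about.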

Last time someone mentioned compaction, so I was curious and ran the
following test, compacting first before calling deleteRows (to see if it'd
be faster). Here is the result:
Compact on Table 4 (with summing combiner): 376,619
Call deleteRows on Table 4 (with summing combiner): 188,862
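As a quick check on whether compacting first pays off, the sketch below adds the compaction time to the subsequent deleteRows time and compares it with the Test 2 figure (197,050). The class and method names are hypothetical and the units are whatever the original measurements used.

```java
public class CompactionCost {
    // Total cost of compacting and then deleting.
    public static long total(long compact, long deleteAfterCompact) {
        return compact + deleteAfterCompact;
    }

    // Percentage by which `after` is smaller than `before`.
    public static double percentSaved(long before, long after) {
        return 100.0 * (before - after) / before;
    }

    public static void main(String[] args) {
        long deleteOnly = 197050;          // Test 2: deleteRows without compaction
        long compact = 376619;             // compaction on Table 4
        long deleteAfterCompact = 188862;  // deleteRows after compaction

        System.out.println("compact + deleteRows = " + total(compact, deleteAfterCompact));
        System.out.printf("deleteRows saved after compaction: %.1f%%%n",
                percentSaved(deleteOnly, deleteAfterCompact));
    }
}
```

The combined figure is 565,481 against 197,050 for deleteRows alone, and the deleteRows call itself shrinks by only about 4 percent.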

So given the result above, I'd say table compaction doesn't help.
Perhaps I did something wrong here. It seems to me that in certain cases
(like this one), scanning the table and calling putDelete for each entry
will perform better than calling deleteRows. Does this make sense?

