accumulo-dev mailing list archives

From Christopher <ctubb...@apache.org>
Subject Re: delete rows test result
Date Mon, 30 Nov 2015 22:39:34 GMT
Without ACCUMULO-3235, one way you can make deleteRows faster is to only
use it to delete rows on existing tablet boundaries. Even then, there may
be cases where it's going to do a chop compaction before it completes the
delete, and some tablets may be offline while it does this.
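As a rough sketch of what "only delete on existing tablet boundaries" could look like: given the table's current split points, shrink the requested range inward to the nearest splits, hand that aligned part to deleteRows, and clean up the leftover edges some other way (e.g. putDelete). The class and method names below are hypothetical and not part of Accumulo. The sketch assumes the splits were already fetched (for example via TableOperations.listSplits) and converted to Strings, that row IDs are plain ASCII so String ordering matches Accumulo's byte ordering, and that deleteRows treats the start row as exclusive and the end row as inclusive.

```java
import java.util.Optional;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not an Accumulo class.
final class TabletAlignedRange {
    final String startExclusive; // row to pass as deleteRows' start (exclusive)
    final String endInclusive;   // row to pass as deleteRows' end (inclusive)

    TabletAlignedRange(String startExclusive, String endInclusive) {
        this.startExclusive = startExclusive;
        this.endInclusive = endInclusive;
    }

    // Shrinks the requested range (start, end] inward so that both ends sit on
    // existing split points. Returns empty if no whole tablet fits inside it.
    // Open-ended (null) start/end rows are not handled in this sketch.
    static Optional<TabletAlignedRange> snapInward(TreeSet<String> splits,
                                                   String start, String end) {
        String s = splits.ceiling(start); // smallest split >= requested start
        String e = splits.floor(end);     // largest split <= requested end
        if (s == null || e == null || s.compareTo(e) >= 0) {
            return Optional.empty();
        }
        return Optional.of(new TabletAlignedRange(s, e));
    }
}
```

For example, with splits g, m, t and a requested range (c, p], the aligned part is (g, m], which covers exactly one tablet; the rows in (c, g] and (m, p] would still need to be removed separately.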

Aside from possibly only using existing tablet boundaries, I'm not sure
there is anything you can do which would be faster.

If the deleteMany (scan/putDelete) strategy is faster for you, and memory
is less important than speed, then stick with that. That's almost certainly
going to be better if the data you wish to delete is interspersed with data
you wish to keep.

deleteRows is going to work best in cases where you have large quantities
of sequential rows to delete, spanning more than one tablet. If your
application can tolerate it, you could wait for a sufficiently large run
before doing a delete. For instance, if you wish to age-off old data, and
your data is ordered by time, you could age off once a week instead of
daily, to allow the ranges of things to delete to build up.
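To make the weekly age-off idea a bit more concrete, here is a minimal sketch of the bookkeeping involved. Everything in it is an assumption for illustration: the class name, the parameters, and the row layout (each row ID starts with a yyyyMMdd date followed by some suffix, so no row is exactly the bare date). It only computes the (start exclusive, end inclusive] row bounds you might pass to deleteRows, and returns nothing until enough whole days have built up.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.Optional;

// Hypothetical planner for illustration only; not an Accumulo class.
final class AgeOffPlanner {
    private static final DateTimeFormatter DAY = DateTimeFormatter.BASIC_ISO_DATE; // yyyyMMdd

    // lastDeletedThrough: last day whose rows were already deleted (inclusive).
    // Returns {startExclusive, endInclusive} row bounds, or empty if fewer than
    // minBatchDays whole days are waiting to be aged off.
    static Optional<String[]> plan(LocalDate lastDeletedThrough, LocalDate today,
                                   int retentionDays, int minBatchDays) {
        LocalDate cutoff = today.minusDays(retentionDays);   // oldest day to keep
        LocalDate firstToDelete = lastDeletedThrough.plusDays(1);
        long pendingDays = ChronoUnit.DAYS.between(firstToDelete, cutoff);
        if (pendingDays < minBatchDays) {
            return Optional.empty(); // let the run of deletable rows build up
        }
        // Rows always carry a suffix after the date (assumed), so the bare date
        // strings bound exactly the days firstToDelete .. cutoff - 1.
        return Optional.of(new String[] {DAY.format(firstToDelete), DAY.format(cutoff)});
    }
}
```

With 21 days of retention and a 7-day minimum batch, a run on 2015-11-30 after a previous delete through 2015-11-01 would yield the bounds 20151102 (exclusive) to 20151109 (inclusive); the same call on 2015-11-25 would yield nothing because only two days are pending.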

On Mon, Nov 30, 2015 at 3:18 PM z11373 <z11373@outlook.com> wrote:

> Hi Christopher,
> Do you have any idea what I should do to improve the perf in my case, or
> should I wait until ACCUMULO-3235?
>
> If you look at my test results, calling deleteRows was >15x slower than
> calling putDelete for the same table and data. Is it because the actual
> number of rows (i.e. the ones being combined) is way bigger than the number
> of combined rows? I'd imagine if deleteRows has to delete 100M rows, while
> putDelete may only need to deal with 3-4M rows (results from combined),
> then that may explain why it'd take that long.
>
>
> Thanks,
> Z
