Yep, that's right -- currently the only thing that reclaims space taken by deleted rows is a RowSet merge compaction. We haven't added any logic to trigger those based on the number of deleted rows in a RowSet; they are only triggered by logic that tries to merge RowSets with overlapping key ranges (see https://github.com/apache/kudu/blob/master/docs/design-docs/compaction-policy.md#intuition-behind-compaction-selection-policy and BudgetedCompactionPolicy::PickRowSets()).
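
To make that concrete, here is a rough toy model in Python. It is not Kudu's actual selection code (the real policy is budgeted and more involved, see the design doc linked above); the RowSet class, its fields, and pick_by_overlap() are hypothetical names made up for illustration. It only shows the consequence described above: when selection looks at key-range overlap alone, a RowSet that is mostly deleted rows but overlaps nothing is never picked.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class RowSet:
    """Hypothetical stand-in for a RowSet; not a Kudu type."""
    name: str
    min_key: int
    max_key: int
    live_rows: int
    deleted_rows: int

    @property
    def deleted_fraction(self) -> float:
        total = self.live_rows + self.deleted_rows
        return self.deleted_rows / total if total else 0.0


def overlaps(a: RowSet, b: RowSet) -> bool:
    """True if the two closed key ranges intersect."""
    return a.min_key <= b.max_key and b.min_key <= a.max_key


def pick_by_overlap(rowsets):
    """Toy selection: pick every RowSet whose key range overlaps another.

    Assumption for illustration only: deleted_rows is never consulted,
    mirroring the point that deletions do not trigger a compaction.
    """
    picked = set()
    for a, b in combinations(rowsets, 2):
        if overlaps(a, b):
            picked.add(a.name)
            picked.add(b.name)
    return picked


rowsets = [
    RowSet("rs_a", min_key=0, max_key=100, live_rows=1000, deleted_rows=0),
    RowSet("rs_b", min_key=50, max_key=150, live_rows=1000, deleted_rows=10),
    # Mostly deleted, but its key range overlaps nothing.
    RowSet("rs_c", min_key=200, max_key=300, live_rows=100, deleted_rows=900),
]

selected = pick_by_overlap(rowsets)
for rs in rowsets:
    print(rs.name, f"deleted={rs.deleted_fraction:.0%}", "picked" if rs.name in selected else "skipped")
```

In this sketch rs_c is skipped even though most of its rows are deleted, so its space would not be reclaimed until something else causes it to be compacted.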

The follow-up work to add a background task to permanently remove deleted rows is being tracked in https://issues.apache.org/jira/browse/KUDU-1979 (which I just filed).

Mike

On Mon, Apr 24, 2017 at 12:37 PM, Todd Lipcon <todd@cloudera.com> wrote:
Mike can correct me if I'm wrong, but I think the background task in 1.3 is only responsible for removing old deltas, and doesn't do anything to try to trigger compactions on rowsets with a high percentage of deleted _rows_.

That's a separate bit of work that hasn't been started yet.

-Todd

On Sat, Apr 22, 2017 at 7:36 PM, Jason Heo <jason.heo.sde@gmail.com> wrote:
Hi David.

Thank you for your reply.

I'll try to upgrade to 1.3 this week.

Regards,

Jason

2017-04-23 2:06 GMT+09:00 <davidralves@gmail.com>:
Hi Jason

  In Kudu 1.2, compactions reclaim space when they happen.
Unfortunately, the conditions that trigger them don't always occur: if
the portion of the keyspace where the deletions occurred has stopped
receiving writes and is already fully compacted, cleanup is less
likely.
  In Kudu 1.3 we added a background task to clean up old data even in
the absence of compactions. Could you upgrade?

Best
David



