kudu-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: kudu - maintenance operations and row compactions
Date Thu, 29 Jun 2017 05:46:54 GMT
Hi Prasanth,

Great to hear that you are using Kudu at CERN and are happy overall. We all
enjoyed reading the blog post that Zbigniew wrote about your experiences.

The temporary blow-up in space is an interesting effect that I wouldn't
have expected. During an ongoing insert, I would expect that there is some
space used for 'UNDO' deltas -- these are the records of the original
insertion time of the row, and allow snapshot queries to properly avoid
returning rows inserted past the selected snapshot timestamp. These records
will be larger if you have a particularly large composite key -- especially
in time series workloads they can be pretty big. They are only retained for
15 minutes, though, assuming that background tasks are allowed to run. In
an ongoing workload, it's possible that flushes and compactions are
prioritized over removal of these UNDO deltas, so they grow over time, but
it's hard to say definitively that this is the case you're hitting.

Another case that might cause these issues is if a lot of UPSERTs or
UPDATEs are going into the table -- in that case we will retain past
versions of the row, and those past versions are not stored in a columnar
format. Hence they can take substantial amounts of disk space. Again, they
are only retained for 15 minutes by default assuming there is some idle
capacity to remove them, and if we aren't properly prioritizing their
removal they may grow over time until the write workload becomes more idle.
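For what it's worth, the 15-minute window mentioned above corresponds to the
tablet server's history retention setting. A hedged sketch (flag name is from
Kudu's configuration reference -- please verify it against your build before
relying on it):

```shell
# Hedged sketch: the history retention window (default 900 seconds, i.e.
# 15 minutes) is set per tablet server. Lowering it trades snapshot read
# history for faster garbage collection of old row versions.
kudu-tserver --tablet_history_max_age_sec=900 <other flags...>
```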

Given you are running Kudu 1.4, I'm guessing you may be compiling from
source. In that case you might try
cherry-picking c19b8f4a1a271af1efb5a01bdf05005d79bb85f6 (and probably will
need its parent 96ad3b07cf1dc694ddcfd72405aeb662440199b5). These commits
add a 'kudu local_replica data_size' command which can be used against a
local tablet server to break down space usage by different consumers such
as UNDO deltas, REDO deltas, base data, etc. If you're able to cherry-pick
those I'd be interested to hear the results on your workload. Maybe we can
tweak the maintenance process to prioritize data garbage collection more
aggressively in certain circumstances.
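Concretely, something like the following (commit hashes as above; the tablet
ID and filesystem paths are placeholders for your own deployment, and I
believe the local_replica tools want exclusive access to the data
directories, i.e. the tablet server stopped):

```shell
# Pick up the data_size command and its parent on your 1.4 source branch:
git cherry-pick 96ad3b07cf1dc694ddcfd72405aeb662440199b5
git cherry-pick c19b8f4a1a271af1efb5a01bdf05005d79bb85f6

# After rebuilding, run on the tablet server host against its local data:
kudu local_replica data_size <tablet-id> \
    --fs_wal_dir=/data/kudu/wal \
    --fs_data_dirs=/data1/kudu,/data2/kudu
```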


On Wed, Jun 28, 2017 at 10:16 PM, Prasanth Kothuri <prasanth.kothuri@cern.ch
> wrote:

> Hello there,
> I am using Kudu at CERN with a positive experience -- thanks for the
> performance improvements in 1.4!
> I have recently encountered an issue which I am unable to work around. It
> is as follows:
> I have an 18-node Kudu cluster, each node with 32 cores, 128 GB of memory,
> and 2 disks. Using the Spark API, I am inserting data into a Kudu table at
> a sustained rate of 750k rows per second (which is awesome). After a few
> days my filesystems were becoming full (18 * 3 TB = 54 TB), even though the
> on_disk_size reported in the metrics is around 4-5 TB. The filesystems come
> back to the expected size after I stop the insertion for 6-8 hours, so I
> suspect some post-processing such as rowset compactions was unable to keep
> up with the insertion rate. I do have spare resources on the nodes; could
> you please point me to how I can troubleshoot this issue, or to any
> parameter changes that can speed up these maintenance operations (I
> currently have --maintenance_manager_num_threads=20)?
> Any help or clues on where to look would be highly appreciated.
> Best Regards,
> Prasanth

Todd Lipcon
Software Engineer, Cloudera
