kudu-user mailing list archives

From Prasanth Kothuri <prasanth.koth...@cern.ch>
Subject kudu - maintenance operations and row compactions
Date Thu, 29 Jun 2017 05:16:03 GMT
Hello there,

I am using Kudu at CERN and the experience has been positive; thanks for the performance
improvements in 1.4!
I recently ran into an issue that I have been unable to work around. It is as follows.

I have an 18-node Kudu cluster; each node has 32 cores, 128 GB of memory and 2 disks. Using the
Spark API, I am inserting data into a Kudu table at a sustained rate of 750k per second (which
is awesome). After a few days my filesystems were filling up (18 * 3 TB = 54 TB in total) even
though the on_disk_size reported in the metrics is only around 4-5 TB. The filesystems return to
the expected size after I stop the insertion for 6-8 hours, so I suspect that some background
processing, such as rowset compaction, is unable to keep up with the insertion rate. I do have
spare resources on the nodes. Could you please point me to how I can troubleshoot this issue, or
to any parameter changes that could speed up these maintenance operations? (I currently have
--maintenance_manager_num_threads=20.)
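
To make the mismatch above concrete, here is a rough sketch of how the reported on_disk_size
could be totalled and compared against the raw filesystem capacity. It is only an illustration:
it assumes a metrics dump shaped as a JSON list of entities, each with a "type" and a "metrics"
list of name/value pairs. The helper name total_on_disk_size and the sample values are made up
and are not taken from my cluster.

```python
import json


def total_on_disk_size(metrics_json: str) -> int:
    """Sum the on_disk_size metric over all tablet entities in a metrics dump.

    Assumed shape (not verified against any particular Kudu version):
    [{"type": "tablet", "metrics": [{"name": "on_disk_size", "value": 123}, ...]}, ...]
    """
    total = 0
    for entity in json.loads(metrics_json):
        if entity.get("type") != "tablet":
            continue  # skip server-level entities
        for metric in entity.get("metrics", []):
            if metric.get("name") == "on_disk_size":
                total += int(metric["value"])
    return total


# Hypothetical sample dump: two tablets plus one server entity that is ignored.
SAMPLE = json.dumps([
    {"type": "tablet", "id": "t1", "metrics": [{"name": "on_disk_size", "value": 100}]},
    {"type": "tablet", "id": "t2", "metrics": [{"name": "on_disk_size", "value": 200}]},
    {"type": "server", "id": "s1", "metrics": [{"name": "some_other_metric", "value": 7}]},
])

TB = 10**12
capacity_bytes = 18 * 3 * TB  # 18 nodes * 3 TB each, as described above
reported_bytes = total_on_disk_size(SAMPLE)
print(f"reported on_disk_size: {reported_bytes} bytes of {capacity_bytes} bytes capacity")
```

Comparing this total with the space actually used on the data filesystems gives the amount of
disk usage that the on_disk_size metric does not account for.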

Any help or clues on where to look would be highly appreciated.

Best Regards,
Prasanth
CERN IT
