kudu-user mailing list archives

From "Quanlong Huang" <huang_quanl...@126.com>
Subject Why RowSet size is much smaller than flush_threshold_mb
Date Fri, 15 Jun 2018 14:52:53 GMT
Hi all,


I'm running Kudu 1.6.0-cdh5.14.2. When looking into the logs of a tablet server, I find that most of the compactions are compacting small RowSets (~40 MB each). For example:


I0615 07:22:42.637351 30614 tablet.cc:1661] T 6bdefb8c27764a0597dcf98ee1b450ba P 70f3e54fe0f3490cbf0371a6830a33a7: Compaction: stage 1 complete, picked 4 rowsets to compact
I0615 07:22:42.637385 30614 compaction.cc:903] Selected 4 rowsets to compact:
I0615 07:22:42.637393 30614 compaction.cc:906] RowSet(343)(current size on disk: ~40666600 bytes)
I0615 07:22:42.637401 30614 compaction.cc:906] RowSet(1563)(current size on disk: ~34720852 bytes)
I0615 07:22:42.637408 30614 compaction.cc:906] RowSet(1645)(current size on disk: ~29914833 bytes)
I0615 07:22:42.637415 30614 compaction.cc:906] RowSet(1870)(current size on disk: ~29007249 bytes)
I0615 07:22:42.637428 30614 tablet.cc:1447] T 6bdefb8c27764a0597dcf98ee1b450ba P 70f3e54fe0f3490cbf0371a6830a33a7: Compaction: entering phase 1 (flushing snapshot). Phase 1 snapshot: MvccSnapshot[committed={T|T < 6263071556616208384 or (T in {6263071556616208384})}]
I0615 07:22:42.641582 30614 multi_column_writer.cc:103] Opened CFile writers for 124 column(s)
I0615 07:22:43.875396 30614 multi_column_writer.cc:103] Opened CFile writers for 124 column(s)
I0615 07:22:44.418421 30614 multi_column_writer.cc:103] Opened CFile writers for 124 column(s)
I0615 07:22:45.114389 30614 multi_column_writer.cc:103] Opened CFile writers for 124 column(s)
I0615 07:22:54.762563 30614 tablet.cc:1532] T 6bdefb8c27764a0597dcf98ee1b450ba P 70f3e54fe0f3490cbf0371a6830a33a7: Compaction: entering phase 2 (starting to duplicate updates in new rowsets)
I0615 07:22:54.773572 30614 tablet.cc:1587] T 6bdefb8c27764a0597dcf98ee1b450ba P 70f3e54fe0f3490cbf0371a6830a33a7: Compaction Phase 2: carrying over any updates which arrived during Phase 1
I0615 07:22:54.773599 30614 tablet.cc:1589] T 6bdefb8c27764a0597dcf98ee1b450ba P 70f3e54fe0f3490cbf0371a6830a33a7: Phase 2 snapshot: MvccSnapshot[committed={T|T < 6263071556616208384 or (T in {6263071556616208384})}]
I0615 07:22:55.189757 30614 tablet.cc:1631] T 6bdefb8c27764a0597dcf98ee1b450ba P 70f3e54fe0f3490cbf0371a6830a33a7: Compaction successful on 82987 rows (123387929 bytes)
I0615 07:22:55.191426 30614 maintenance_manager.cc:491] Time spent running CompactRowSetsOp(6bdefb8c27764a0597dcf98ee1b450ba): real 12.628s user 1.460s sys 0.410s
I0615 07:22:55.191484 30614 maintenance_manager.cc:497] P 70f3e54fe0f3490cbf0371a6830a33a7: CompactRowSetsOp(6bdefb8c27764a0597dcf98ee1b450ba) metrics: {"cfile_cache_hit":812,"cfile_cache_hit_bytes":16840376,"cfile_cache_miss":2730,"cfile_cache_miss_bytes":251298442,"cfile_init":496,"data dirs.queue_time_us":6646,"data dirs.run_cpu_time_us":2188,"data dirs.run_wall_time_us":101717,"fdatasync":315,"fdatasync_us":9617174,"lbm_read_time_us":1288971,"lbm_reads_1-10_ms":32,"lbm_reads_10-100_ms":41,"lbm_reads_lt_1ms":4641,"lbm_write_time_us":122520,"lbm_writes_lt_1ms":2799,"mutex_wait_us":25,"spinlock_wait_cycles":155264,"tcmalloc_contention_cycles":768,"thread_start_us":677,"threads_started":14,"wal-append.queue_time_us":300}


flush_threshold_mb is set to its default value (1024). Shouldn't the flushed RowSet size then be ~1 GB?
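
For context, my understanding is that flush_threshold_mb is a tablet server gflag, so overriding it would look roughly like the sketch below (this is only a sketch; the directory paths are placeholders and not my actual setup):

    # hypothetical tablet server startup with an explicit flush threshold
    kudu-tserver --flush_threshold_mb=1024 \
                 --fs_wal_dir=/data/kudu/wal \
                 --fs_data_dirs=/data/kudu/data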


I think increasing the initial RowSet size could reduce the number of compactions and thus reduce their impact on other ongoing operations. It may also improve flush performance. Is that right? If so, how can I increase the RowSet size?


I'd be grateful if someone could clarify these points for me!


Thanks,
Quanlong