Hi guys!

We want to buy SSDs for the TServer WALs in our cluster. I'm estimating the capacity these SSDs need, using the "Getting Started with Kudu" book, Chapter 4, "Write-Ahead Log" (https://www.oreilly.com/library/view/getting-started-with/9781491980248/ch04.html).

NB: we use default Kudu WAL configuration settings.

The book gives this formula for the worst case:
8 MB/segment * 80 max segments * 2000 tablets = 1,280,000 MB = ~1.3 TB
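
To make it easy to plug in other tablet counts, here is the same arithmetic as a small shell sketch. The function name wal_worst_case_mb is just made up for illustration; the inputs are the book's numbers (8 MB segments, 80 max segments, 2000 tablets), and it counts segment files only.

```shell
# Worst-case size of WAL segment files, in MB (segment files only, no index files).
# $1: segment size in MB, $2: max segments per tablet, $3: number of tablets.
# The function name is hypothetical; the formula is the one quoted from the book.
wal_worst_case_mb() {
  echo $(( $1 * $2 * $3 ))
}

wal_worst_case_mb 8 80 2000   # prints 1280000
```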

So this formula only takes segment files into account. But in our cluster, I see that every segment file has one or more corresponding index files, and every index file is actually larger than the segment file.

Numbers from one of our nodes.
WALs count:
$ ls /mnt/data01/kudu-tserver-wal/wals/ | wc -l

Overall WAL size:
$ du -d 0 -h /mnt/data01/kudu-tserver-wal/
13G     /mnt/data01/kudu-tserver-wal/

Size of all segment files:
$ find /mnt/data01/kudu-tserver-wal/ -type f -name 'wal-*' -exec du -ch {} + | grep total$
6.1G    total

Size of all index files:
$ find /mnt/data01/kudu-tserver-wal/ -type f -name 'index*' -exec du -ch {} + | grep total$
6.5G    total
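
As a very rough sketch, the book's worst case could be scaled by the index-to-segment ratio measured above (6.5G of index files against 6.1G of segment files). This assumes index size grows in proportion to segment size, which is only what this one node shows and not something I found documented for Kudu. The function name wal_estimate_with_index_mb is made up for illustration.

```shell
# Scale a segment-only estimate by a measured index/segment ratio.
# ASSUMPTION: index size is proportional to segment size (not a documented Kudu rule).
# $1: segment-only worst case in MB
# $2: measured total size of segment files
# $3: measured total size of index files (same unit as $2)
wal_estimate_with_index_mb() {
  awk -v base="$1" -v seg="$2" -v idx="$3" \
    'BEGIN { printf "%.0f\n", base * (1 + idx / seg) }'
}

# With the numbers from this node: about 2.64 million MB, roughly 2.6 TB.
wal_estimate_with_index_mb 1280000 6.1 6.5
```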

So I have questions.

1. How can I estimate the size of index files?
In our cluster, the total size of the index files looks approximately equal to the total size of the segment files.

2. Some WALs have more than one index file. For example:
$ ls -lh /mnt/data01/kudu-tserver-wal/wals/779a382ea4e6464aa80ea398070a391f/
total 296M
-rw-r--r-- 1 root root  23M Jun 18 21:31 index.000000108
-rw-r--r-- 1 root root  23M Jun 18 21:41 index.000000109
-rw-r--r-- 1 root root  23M Jun 18 21:52 index.000000110
-rw-r--r-- 1 root root  23M Jun 18 22:10 index.000000111
-rw-r--r-- 1 root root  23M Jun 18 22:22 index.000000112
-rw-r--r-- 1 root root  23M Jun 18 22:35 index.000000113
-rw-r--r-- 1 root root  23M Jun 18 22:48 index.000000114
-rw-r--r-- 1 root root  23M Jun 18 23:01 index.000000115
-rw-r--r-- 1 root root  23M Jun 18 23:14 index.000000116
-rw-r--r-- 1 root root  23M Jun 18 23:27 index.000000117
-rw-r--r-- 1 root root  23M Jun 18 23:40 index.000000118
-rw-r--r-- 1 root root  23M Jun 18 23:52 index.000000119
-rw-r--r-- 1 root root  23M Jun 19 01:13 index.000000120
-rw-r--r-- 1 root root 8.0M Jun 19 01:13 wal-000007799

Is this a normal situation?

3. Not a question: please consider adding documentation on estimating WAL storage. Also, I couldn't find any mention of index files except here: https://kudu.apache.org/docs/scaling_guide.html#file_descriptors.
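
Regarding question 2, a sketch like the one below could show how many tablet WAL directories on a node hold more index files than segment files. It only assumes the layout visible in the listing above (one directory per tablet under wals/, files named index.* and wal-*); the function name list_index_heavy_wals is made up for illustration.

```shell
# Print tablet WAL directories that contain more index files than segment files.
# $1: the "wals" root directory, e.g. /mnt/data01/kudu-tserver-wal/wals
# ASSUMPTION: one subdirectory per tablet, files named index.* and wal-*.
list_index_heavy_wals() {
  for dir in "$1"/*/; do
    [ -d "$dir" ] || continue
    indexes=$(find "$dir" -maxdepth 1 -type f -name 'index.*' | wc -l | tr -d ' ')
    segments=$(find "$dir" -maxdepth 1 -type f -name 'wal-*' | wc -l | tr -d ' ')
    if [ "$indexes" -gt "$segments" ]; then
      echo "$(basename "$dir") indexes=$indexes segments=$segments"
    fi
  done
}

# Usage: list_index_heavy_wals /mnt/data01/kudu-tserver-wal/wals
```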


with best regards, Pavel Martynov