kudu-user mailing list archives

From Pavel Martynov <mr.xk...@gmail.com>
Subject WAL size estimation
Date Wed, 19 Jun 2019 06:11:45 GMT
Hi guys!

We want to buy SSDs for the TServer WALs in our cluster. I'm working on a
capacity estimate for these SSDs using the "Getting Started with Kudu" book,
Chapter 4, Write-Ahead Log (
https://www.oreilly.com/library/view/getting-started-with/9781491980248/ch04.html#idm139738927926240
).

NB: we use default Kudu WAL configuration settings.

There is a formula for the worst case:
8 MB/segment * 80 max segments * 2000 tablets = 1,280,000 MB = ~1.3 TB
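
The same arithmetic as a small Python sketch, so the inputs can be swapped for
another cluster. The function and parameter names are made up for illustration
and are not Kudu settings; the defaults are simply the numbers from the formula
above. Like the formula, it counts segment files only, not index files.

```python
def worst_case_wal_mb(segment_mb: int = 8,
                      max_segments: int = 80,
                      tablets: int = 2000) -> int:
    """Worst-case size of WAL segment files only, in MB (index files excluded)."""
    return segment_mb * max_segments * tablets


total_mb = worst_case_wal_mb()
# Decimal units: 1 TB = 1,000,000 MB
print(f"{total_mb:,} MB = ~{total_mb / 1_000_000:.2f} TB")
```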

So this formula takes only segment files into account. But in our cluster,
I see that every segment file has one or more corresponding index files, and
every index file is actually larger than the segment file.

Numbers from one of our nodes.
WALs count:
$ ls /mnt/data01/kudu-tserver-wal/wals/ | wc -l
711

Overall WAL size:
$ du -d 0 -h /mnt/data01/kudu-tserver-wal/
13G     /mnt/data01/kudu-tserver-wal/

Size of all segment files:
$ find /mnt/data01/kudu-tserver-wal/ -type f -name 'wal-*' -exec du -ch {} + | grep total$
6.1G    total

Size of all index files:
$ find /mnt/data01/kudu-tserver-wal/ -type f -name 'index*' -exec du -ch {} + | grep total$
6.5G    total
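
The two find/du measurements above can also be gathered in one pass. This is a
rough Python sketch, not a Kudu tool: the function name wal_usage is
hypothetical, and it assumes the same layout and name prefixes as the commands
above ('wal-' for segments, 'index' for index files). It reports both the
apparent size (what ls shows) and the allocated size (what du counts), since
the two can differ if a file is sparse.

```python
import os
from collections import defaultdict


def wal_usage(root: str) -> dict:
    """Sum apparent and allocated bytes of segment and index files under root."""
    totals = defaultdict(lambda: {"files": 0, "apparent": 0, "allocated": 0})
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.startswith("wal-"):
                kind = "segment"
            elif name.startswith("index"):
                kind = "index"
            else:
                continue  # ignore anything that is neither a segment nor an index
            st = os.stat(os.path.join(dirpath, name))
            totals[kind]["files"] += 1
            totals[kind]["apparent"] += st.st_size
            # Assumption: st_blocks is in 512-byte units, as on Linux.
            totals[kind]["allocated"] += st.st_blocks * 512
    return dict(totals)


if __name__ == "__main__":
    # Path taken from the commands above; adjust for your node.
    for kind, t in wal_usage("/mnt/data01/kudu-tserver-wal").items():
        print(kind, t)
```

Dividing the index total by the segment total gives a per-node ratio that could
be applied to the segment-only estimate, assuming the ratio stays stable.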

So I have questions.

1. How can I estimate the size of index files?
In our cluster, the total size of the index files looks approximately equal
to the total size of the segment files.

2. There are some WALs with more than one index file. For example:
$ ls -lh /mnt/data01/kudu-tserver-wal/wals/779a382ea4e6464aa80ea398070a391f/
total 296M
-rw-r--r-- 1 root root  23M Jun 18 21:31 index.000000108
-rw-r--r-- 1 root root  23M Jun 18 21:41 index.000000109
-rw-r--r-- 1 root root  23M Jun 18 21:52 index.000000110
-rw-r--r-- 1 root root  23M Jun 18 22:10 index.000000111
-rw-r--r-- 1 root root  23M Jun 18 22:22 index.000000112
-rw-r--r-- 1 root root  23M Jun 18 22:35 index.000000113
-rw-r--r-- 1 root root  23M Jun 18 22:48 index.000000114
-rw-r--r-- 1 root root  23M Jun 18 23:01 index.000000115
-rw-r--r-- 1 root root  23M Jun 18 23:14 index.000000116
-rw-r--r-- 1 root root  23M Jun 18 23:27 index.000000117
-rw-r--r-- 1 root root  23M Jun 18 23:40 index.000000118
-rw-r--r-- 1 root root  23M Jun 18 23:52 index.000000119
-rw-r--r-- 1 root root  23M Jun 19 01:13 index.000000120
-rw-r--r-- 1 root root 8.0M Jun 19 01:13 wal-000007799

Is this a normal situation?

3. Not a question: please consider adding documentation on estimating WAL
storage. Also, I couldn't find any mention of index files except here:
https://kudu.apache.org/docs/scaling_guide.html#file_descriptors

Thanks!

-- 
with best regards, Pavel Martynov
