A single sstable is using 274G. In addition to data usage imbalance, we see hot spots as well. With static fields in particular, and CFs that don't change much, you'll get CFs that end up compacting into a small number of large sstables. With most of the data for a CF in one sstable and on one data volume, that volume becomes a hot spot for reads on that CF. Cassandra tries to minimize the number of sstables a row is written across, but after some compaction on CFs that are rarely updated, most of the data for a CF can end up in a single sstable, and sstables aren't split across data volumes. Thus in a JBOD setup a single volume will be a hot spot for access to that CF, as Cassandra does not effectively distribute data across individual volumes in all circumstances.
There may be tuning which would help this, but it's specific to JBOD and not something that you would have to worry about in a single data volume setup, i.e. RAID0. With RAID0, the downside of course is that losing a single member disk of the array takes the node down. The upside is you don't have to worry about the imbalance of both I/O and data footprint across individual volumes.
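If you want to check whether a JBOD node is showing this kind of imbalance, a rough sketch like the one below adds up the sstable data files under each data directory and reports the largest one. It assumes sstable data components are the files ending in `-Data.db`; the directory paths and the function name `sstable_usage` are made up for illustration, so substitute whatever your `data_file_directories` setting lists.

```python
import os


def sstable_usage(data_dirs):
    """Return, per data directory, total sstable bytes and the largest sstable.

    Only files ending in "-Data.db" are counted (assumed here to be the
    sstable data components); index, filter and other components are skipped.
    """
    report = {}
    for data_dir in data_dirs:
        total = 0
        largest = ("", 0)  # (path, size in bytes)
        for root, _dirs, files in os.walk(data_dir):
            for name in files:
                if not name.endswith("-Data.db"):
                    continue
                path = os.path.join(root, name)
                try:
                    size = os.path.getsize(path)
                except OSError:
                    # The file may have been removed by compaction meanwhile.
                    continue
                total += size
                if size > largest[1]:
                    largest = (path, size)
        report[data_dir] = {"total_bytes": total, "largest": largest}
    return report


if __name__ == "__main__":
    # Hypothetical mount points, one per disk.
    dirs = ["/data1/cassandra/data", "/data2/cassandra/data"]
    for data_dir, info in sstable_usage(dirs).items():
        path, size = info["largest"]
        print(f"{data_dir}: {info['total_bytes'] / 1e9:.1f} GB total, "
              f"largest sstable {size / 1e9:.1f} GB ({path})")
```

A directory whose total is dominated by one file is the case described above: that one volume will take most of the reads for that CF.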
Unlike HDFS, Ceph, and RAID for that matter, where you're dealing with blocks/stripes of a fixed maximum size that are then distributed at a granular level across the JBOD volumes, Cassandra is dealing with uncapped, low-granularity, variable-sized sstable data files which it attempts to distribute across JBOD volumes, making JBOD far from ideal. Frankly, it's hard for me to imagine any columnar data store doing JBOD well.
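To make the difference concrete, here is a toy simulation. It is not Cassandra's actual placement logic, nor that of HDFS, Ceph or RAID; it only contrasts placing whole files of arbitrary size on the least-full volume with cutting the same data into capped blocks dealt round-robin. The function names and the example sizes (apart from the 274G sstable mentioned above) are made up.

```python
def place_whole_files(file_sizes, n_volumes):
    """Put each file, undivided, on whichever volume is currently least full."""
    volumes = [0] * n_volumes
    for size in file_sizes:
        idx = volumes.index(min(volumes))
        volumes[idx] += size
    return volumes


def place_fixed_blocks(file_sizes, n_volumes, block_size):
    """Cut each file into blocks of at most block_size and deal them round-robin."""
    volumes = [0] * n_volumes
    next_volume = 0
    for size in file_sizes:
        remaining = size
        while remaining > 0:
            chunk = min(block_size, remaining)
            volumes[next_volume] += chunk
            next_volume = (next_volume + 1) % n_volumes
            remaining -= chunk
    return volumes


def imbalance(volumes):
    """Ratio of the fullest volume to the mean; 1.0 means perfectly even."""
    return max(volumes) / (sum(volumes) / len(volumes))


if __name__ == "__main__":
    # Sizes in GB; one dominant sstable plus a few small ones (illustrative).
    sstables = [274, 40, 30, 20, 10, 5, 5]
    whole = place_whole_files(sstables, n_volumes=7)
    blocks = place_fixed_blocks(sstables, n_volumes=7, block_size=1)
    print("whole files :", whole, f"imbalance {imbalance(whole):.2f}")
    print("fixed blocks:", blocks, f"imbalance {imbalance(blocks):.2f}")
```

With whole files, no placement policy can bring the fullest volume below the size of the largest file, which is why one oversized sstable pins both the footprint and the read load to a single disk.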
Hi,

I am currently benchmarking Cassandra with three machines, and on each machine I am seeing an unbalanced distribution of data among the data directories (1 per disk).

I am concerned that this affects my write performance. Is there anything I can do to make the distribution more even? Would RAID0 be my best option?

Details:
3 machines, each with 24 cores, 64GB of RAM, and 7 SSDs of 500GB each.
Commitlog is on a separate disk, cassandra.yaml configured according to DataStax's guide on cassandra.yaml.
Total size of data is about 2TB, 14B records, all unique. Replication factor of 1.

Thanks,
Soerian