I have set up C* in a very limited environment: 3 VMs at DigitalOcean, each with 2 GB RAM and a 40 GB SSD, so my expectations about overall performance are low.
The keyspace uses a replication factor of 2.
I am loading 1.5 million rows (each with 60 columns holding a mix of numbers and short text; effectively 300,000 wide rows) in a quite 'aggressive' way, using the Java driver and async update statements.
After a while of importing data, I start seeing timeouts reported by the driver:
com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency ONE (1 replica were required but only 0 acknowledged the write)
and then later, host-unavailability exceptions:
com.datastax.driver.core.exceptions.UnavailableException: Not enough replica available for query at consistency ONE (1 required but only 0 alive).
Looking at the 3 hosts, I see that two C* instances went down, which explains why I still see some writes succeeding (that must be the one host left, satisfying consistency level ONE).
As far as I understand the logs, the servers shut down because they reached the heap size limit.
I am puzzled by the fact that the instances (it seems) shut themselves down instead of limiting the amount of work they accept. I understand that I need to tweak the configuration and likely get more RAM, but still, I would actually be satisfied with reduced service (and likely more timeouts in the client). Right now it looks as if I would have to slow down the client 'artificially' to prevent the loss of hosts. Does that make sense?
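To make the 'artificial' slowdown concrete, this is roughly what I have in mind: a minimal sketch that caps the number of writes in flight with a semaphore. It is not my actual import code. `ThrottledWriter`, `submit`, and `drain` are hypothetical names, and the supplier stands in for the driver's async execute call, whose future type would have to be adapted to `CompletableFuture`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical helper: caps the number of asynchronous writes in flight. */
final class ThrottledWriter {
    private final int maxInFlight;
    private final Semaphore permits;
    private final AtomicInteger failures = new AtomicInteger();

    ThrottledWriter(int maxInFlight) {
        this.maxInFlight = maxInFlight;
        this.permits = new Semaphore(maxInFlight);
    }

    /**
     * Blocks the calling thread until a permit is free, then starts the write.
     * The permit is returned when the write finishes, whether it succeeded or failed.
     */
    <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> write) {
        permits.acquireUninterruptibly();
        CompletableFuture<T> future;
        try {
            future = write.get(); // stand-in for the driver's async execute call
        } catch (RuntimeException e) {
            permits.release(); // the write never started, so give the permit back
            throw e;
        }
        return future.whenComplete((result, error) -> {
            if (error != null) {
                failures.incrementAndGet(); // e.g. a write timeout
            }
            permits.release();
        });
    }

    /** Waits until every write started so far has completed. */
    void drain() {
        permits.acquireUninterruptibly(maxInFlight);
        permits.release(maxInFlight);
    }

    int failureCount() {
        return failures.get();
    }
}
```

The import loop would call `submit` once per statement and `drain` at the end. The value of `maxInFlight` is a guess that would need tuning against the cluster; the sketch only bounds the number of concurrent requests, not the rate or the size of each write.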
Can anyone explain whether this is intended behavior, meaning I'll just have to accept the self-shutdown of the hosts? Or alternatively, what data I should collect to investigate the cause further?