I am trying to figure out why the following behavior happened. Any help would be highly appreciated.
I ran a Hadoop process that reads a CSV file and writes data to Cassandra. For about 1 h the process ran fine, although it was using about 100% of the CPU. After 1 h, my Hadoop process started to have its connection attempts refused by Cassandra, as shown below.
Since then, the machine running Cassandra has been at 100% IO, and it has stayed that way for 2 h already.
I am running Cassandra on Amazon EBS, which is slow, but I didn't think it would be that slow. Just wondering, is it normal for Cassandra to use such a high amount of CPU? My guess is that all the writes were going to the memtables and that, when it was time to flush, the server went down.
Does that make sense? I am still learning Cassandra, as this is the first time I am using it in production, so I am not sure if I am missing something really basic here.
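
To make the guess above concrete, here is a toy sketch of the behavior I have in mind. It is not Cassandra's code and says nothing about its real thresholds. The class name, the `flush_threshold` knob and the in-memory "segments" list are all made up for illustration. It only shows the idea that writes are cheap while they pile up in memory, and that the disk work arrives in bursts once the buffer fills.

```python
class ToyMemtableStore:
    """Toy model only: buffers writes in memory and flushes at a threshold."""

    def __init__(self, flush_threshold):
        # Hypothetical knob for this sketch, not an actual Cassandra setting.
        self.flush_threshold = flush_threshold
        self.memtable = {}
        # Stand-in for on-disk files; a real store would write to disk here.
        self.flushed_segments = []

    def write(self, key, value):
        # Writes only touch memory until the buffer is full.
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # The whole buffer is written out at once, sorted by key.
        if self.memtable:
            self.flushed_segments.append(dict(sorted(self.memtable.items())))
            self.memtable = {}


store = ToyMemtableStore(flush_threshold=3)
for i in range(7):
    store.write(f"row{i}", i)

# Number of flushes so far, and number of rows still only in memory.
print(len(store.flushed_segments), len(store.memtable))
```

If this picture is roughly right, a long stretch of fast writes followed by a period of heavy IO would be expected, and the question becomes why the IO never settles down.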
2013-02-01 16:44:43,741 ERROR com.s1mbi0se.dmp.input.service.InputService (Thread-18): EXCEPTION:PoolTimeoutException: [host=(10.84.65.108):9160, latency=5005(5005), attempts=1] Timed out waiting for connection
com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=nosql1.s1mbi0se.com.br(10.84.65.108):9160, latency=5005(5005), attempts=1] Timed out waiting for connection
2013-02-01 16:44:43,743 ERROR com.s1mbi0se.dmp.input.service.InputService (Thread-15): EXCEPTION:PoolTimeoutException:
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr