I was playing with a single-node Cassandra installation when I discovered that a request like [SELECT COUNT(*) FROM CF] seems to load the entire dataset of CF into RAM. I am not sure whether this is the expected behavior. I'd expect it to iterate through the rows one by one rather than collect their values in memory.
create table big_table (
k int primary key,
create index on big_table (idx);
I filled the table above with 400 random rows, where column 'val' was written with random strings of 10MB each. That gave me roughly 4GB of data (400 x 10MB).
At this point everything is fine, response delays are pretty good and memory consumption is adequate.
Things go bad with a counting request like [SELECT COUNT(1) FROM big_table] - that makes the database die with OOM. However, it is possible to fetch any column except the huge one: [SELECT k FROM big_table] - this works okay.
As far as I understand, a counting request works roughly the same way as [SELECT * FROM], with the only difference being that it doesn't return any data. Is my reasoning correct?
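To make the expectation concrete, here is a minimal Python sketch of the two counting strategies. It is only an illustration, not Cassandra's code, and it says nothing about how the server actually executes the query. The names `fake_rows`, `count_streaming`, and `count_materialized` are hypothetical, and the payload is scaled down from 10MB to 1KB per row.

```python
from typing import Iterable, Iterator, Tuple

# (k, val): hypothetical stand-in for one row of big_table
Row = Tuple[int, bytes]


def fake_rows(n: int, val_size: int) -> Iterator[Row]:
    """Lazily yield n rows, each carrying a val_size-byte payload."""
    for k in range(n):
        yield k, bytes(val_size)


def count_streaming(rows: Iterable[Row]) -> int:
    """Count rows one at a time; each row can be freed before the next arrives."""
    total = 0
    for _ in rows:
        total += 1
    return total


def count_materialized(rows: Iterable[Row]) -> int:
    """Collect every row first, then count; memory grows with the dataset."""
    collected = list(rows)
    return len(collected)


if __name__ == "__main__":
    # Scaled-down version of the test table: 400 rows of 1KB instead of 10MB.
    print(count_streaming(fake_rows(400, 1024)))
    print(count_materialized(fake_rows(400, 1024)))
```

Both functions return the same count. The difference is that the second one holds every 'val' payload in memory at once, which is the behavior I seem to be observing.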
Thanks in advance,