cassandra-user mailing list archives

From Dmitri Dmitrienko <ddmit...@gmail.com>
Subject Problem with performance, memory consumption, and RLIMIT_MEMLOCK
Date Sun, 16 Nov 2014 17:09:05 GMT
Hi,
I have a very simple table in Cassandra that contains only three columns:
id, time, and a blob with data. I added 1M rows of data and now the
database is about 12GB on disk.
The 1M rows are only part of the data I want to store, and the table has
to be kept in sync with an external source. To do this I have to read the
id and time columns of all the rows, compare them with what I see in the
external source, and insert/update/delete the rows where I see a
difference.
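
Conceptually the per-row comparison is simple; here is a rough sketch
(the external_lookup helper is hypothetical, just standing in for the
source I'm syncing against):

#include <stdint.h>

/* Hypothetical helper for the external source; returns 1 and fills
 * time_out if the id exists there, 0 otherwise. Not a real API. */
extern int external_lookup(int64_t id, int64_t* time_out);

/* Called once per (id, time) pair read from Cassandra. */
void compare_row(int64_t id, int64_t time) {
  int64_t external_time;
  if (!external_lookup(id, &external_time)) {
    /* Row exists only in Cassandra -> delete it there. */
  } else if (external_time != time) {
    /* Timestamps differ -> update the Cassandra row. */
  }
  /* Rows present only in the external source get inserted in a
   * separate pass. */
}

The hard part is just getting those two columns out of Cassandra
efficiently.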
So I'm trying to fetch the id and time columns from Cassandra. In 100% of
my attempts the server hangs for ~1 minute, loading >100% CPU while doing
so, and then terminates abnormally with an error saying I have to run
Cassandra as root or increase RLIMIT_MEMLOCK.
I increased RLIMIT_MEMLOCK to 1GB and it still seems insufficient.
It looks like Cassandra tries to read and lock the whole table in memory,
ignoring the fact that I only need two tiny columns (~12MB of data).

This is how it behaves when I use the latest cpp-driver.
With cqlsh it works differently: it shows the first page of data almost
immediately, without any noticeable delay.
Is there a way to make the cpp-driver work like cqlsh does? I'd like the
data sent to the client as soon as it is available, without any attempts
to lock huge chunks of virtual memory.
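
For reference, here is a minimal sketch of the kind of paged read I'm
hoping for, based on the paging calls in the driver headers
(cass_statement_set_paging_size and cass_statement_set_paging_state;
exact signatures may differ between driver versions, and "mytable"
stands in for my real table):

#include <cassandra.h>

void read_id_and_time(CassSession* session) {
  CassStatement* statement =
      cass_statement_new("SELECT id, time FROM mytable", 0);
  /* Ask the server for 1000 rows per round trip instead of the
   * whole result set at once. */
  cass_statement_set_paging_size(statement, 1000);

  cass_bool_t more = cass_true;
  while (more) {
    CassFuture* future = cass_session_execute(session, statement);
    const CassResult* result = cass_future_get_result(future);
    cass_future_free(future);
    if (result == NULL)
      break; /* request failed */

    CassIterator* rows = cass_iterator_from_result(result);
    while (cass_iterator_next(rows)) {
      const CassRow* row = cass_iterator_get_row(rows);
      /* ... compare id/time with the external source here ... */
      (void)row;
    }
    cass_iterator_free(rows);

    more = cass_result_has_more_pages(result);
    if (more) {
      /* Carry the paging state into the next request. */
      cass_statement_set_paging_state(statement, result);
    }
    cass_result_free(result);
  }
  cass_statement_free(statement);
}

Is something like this the intended way to get cqlsh-style incremental
results from the cpp-driver?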
My platform is 64-bit Linux (CentOS) with all necessary updates installed,
running OpenJDK. I also tried Mac OS X with the Oracle JDK; there I don't
get the RLIMIT_MEMLOCK error, but a regular out-of-memory error in
system.log, even though I gave the server a sufficiently large heap
(8GB, as recommended).
