cassandra-user mailing list archives

From "Hiller, Dean" <>
Subject Re: Read Perf
Date Tue, 26 Feb 2013 15:32:17 GMT
In that case, make sure you don't plan on going into the millions, or test
the limit first, as I'm pretty sure it can't go above 10 million columns
per row (from previous posts on this list).


On 2/26/13 8:23 AM, "Kanwar Sangha" <> wrote:

>Thanks. For our case, the number of rows will stay more or less the same;
>the only thing that changes is the columns, which keep getting added.
>-----Original Message-----
>From: Hiller, Dean []
>Sent: 26 February 2013 09:21
>Subject: Re: Read Perf
>To find data on disk, there is a bloom filter for each SSTable held in
>memory.  Per the docs, 1 billion rows uses about 2 GB of RAM, so memory
>use depends heavily on your number of rows.  As you add more rows, you
>may need to raise the bloom filter false-positive chance to use less
>RAM, but that means slower reads.  I.e., as you add more rows, you will
>have slower reads on a single machine.
>We hit the RAM limit on one machine with 1 billion rows, so we are in
>the process of raising the ratio from 0.000744 (the default) to 0.1 to
>give us more time to solve it.  Since we see no I/O load on our
>machines (or rather extremely little), we plan on moving to leveled
>compaction, where 0.1 is the default in new releases; the new
>size-tiered default is 0.01, I think.
>I.e., if you store more data per row this is not as much of an issue,
>but still something to consider.  (Also, rows have a limit on data size
>as well, I think, but I'm not sure what it is.  I know the column limit
>on a row is in the millions, somewhere below 10 million.)
>From: Kanwar Sangha <>
>Reply-To: "<>"
>Date: Monday, February 25, 2013 8:31 PM
>To: "<>"
>Subject: Read Perf
>Hi - I am doing a performance run using a modified YCSB client. I
>populated 8 TB on a node and then ran some read workloads, and I am
>seeing an average of 930 ops/sec for random reads, with no key cache or
>row cache.  Question:
>Will the read TPS degrade if the data size increases to, say, 20 TB,
>50 TB, or 100 TB?  If I understand correctly, read performance should
>remain roughly constant irrespective of data size, since we eventually
>have sorted SSTables and a binary search would be done on the index to
>find the row?
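The bloom-filter RAM figures in the quoted reply can be sanity-checked with the standard optimal-sizing formula, m = -n·ln(p)/(ln 2)² bits for n keys at false-positive rate p. This is a back-of-envelope sketch, not Cassandra's exact implementation; the key count and the rates are the ones mentioned in the thread, and the default-rate case lands close to the "2 Gig for 1 billion rows" figure:

```python
import math

def bloom_filter_bytes(n_keys: int, fp_rate: float) -> float:
    """Optimal bloom filter size in bytes for n_keys at the given
    false-positive rate: m = -n * ln(p) / (ln 2)^2 bits."""
    bits = -n_keys * math.log(fp_rate) / (math.log(2) ** 2)
    return bits / 8

# Rates from the thread: the old default, the size-tiered default,
# and the proposed leveled-compaction default.
for p in (0.000744, 0.01, 0.1):
    gib = bloom_filter_bytes(1_000_000_000, p) / 2**30
    print(f"fp_chance={p}: ~{gib:.2f} GiB per billion keys")
```

This also shows why raising the ratio from 0.000744 to 0.1 buys headroom: the filter shrinks roughly threefold, at the cost of more false-positive disk probes per read.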
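The intuition in the question can be sketched in miniature: a sorted index is probed with binary search, so the comparison count grows only logarithmically with the number of entries. The names below are illustrative, not Cassandra's actual read path; in practice per-SSTable bloom-filter checks and disk seeks, not in-memory comparisons, dominate at these data sizes, which is why reads are not perfectly constant:

```python
from bisect import bisect_left

def index_lookup(sorted_keys, key):
    """Binary-search a sorted, SSTable-style index: O(log n) comparisons."""
    i = bisect_left(sorted_keys, key)
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return i   # position of the row's index entry
    return None    # key absent: would fall through to the next SSTable

index = ["k000", "k005", "k010", "k015"]
assert index_lookup(index, "k010") == 2   # hit
assert index_lookup(index, "k007") is None  # miss
```

Doubling the index adds only one comparison per lookup, which is the "roughly constant" part of the argument; the per-SSTable memory overhead discussed above is what actually degrades as data grows.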
