incubator-cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: Exactly one wide row per node for a given CF?
Date Thu, 12 Dec 2013 03:08:57 GMT
>  Querying the table was fast. What I didn’t do was test the table under load, nor did I try this in a multi-node cluster.
As the number of columns in a row increases, so does the size of the column index, which is read as part of the read path. 

For background and comparisons of latency see http://thelastpickle.com/blog/2011/07/04/Cassandra-Query-Plans.html or my talk on performance at the SF summit last year: http://thelastpickle.com/speaking/2012/08/08/Cassandra-Summit-SF.html
While the column index has been lifted to the -Index.db component, AFAIK it must still be fully loaded.

Larger rows take longer to go through compaction, tend to cause more JVM GC pressure, and have issues during repair. See the in_memory_compaction_limit_in_mb comments in the yaml file. During repair we detect differences in ranges of rows and stream them between the nodes. If you have wide rows and a single column is out of sync, we will create a new copy of that row on the node, which must then be compacted. I’ve seen the load on nodes with very wide rows go down by 150GB just by adjusting the compaction settings. 

IMHO, all things being equal, rows in the few tens of MB work better. 

Cheers

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 11/12/2013, at 2:41 am, Robert Wille <rwille@fold3.com> wrote:

> I have a question about this statement:
> 
> When rows get above a few tens of MB, things can slow down; when they get above 50 MB they can be a pain; when they get above 100 MB it’s a warning sign. And when they get above 1 GB, well, you don’t want to know what happens then. 
> 
> I tested a data model that I created. Here’s the schema for the table in question:
> 
> CREATE TABLE bdn_index_pub (
> 	tree INT,
> 	pord INT,
> 	hpath VARCHAR,
> 	PRIMARY KEY (tree, pord)
> );
> 
> As a test, I inserted 100 million records. tree had the same value for every record, and I had 100 million values for pord. hpath averaged about 50 characters in length. My understanding is that all 100 million strings would have been stored in a single row, since they all had the same value for the first component of the primary key. I didn’t look at the size of the table, but it had to be several gigs (uncompressed). Contrary to what Aaron says, I do want to know what happens, because I didn’t experience any issues with this table during my test. Inserting was fast. The last batch of records inserted in approximately the same amount of time as the first batch. Querying the table was fast. What I didn’t do was test the table under load, nor did I try this in a multi-node cluster.
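[A rough back-of-the-envelope check of that partition size. The ~30 bytes of per-cell overhead below is an assumed ballpark, not a measured Cassandra figure; the real overhead varies by version and schema.]

```python
# Estimate the size of the single partition built in the test above:
# 100 million cells, ~50 bytes of hpath each, plus assumed storage overhead.
ROWS = 100_000_000
HPATH_BYTES = 50          # average string length from the test
OVERHEAD_BYTES = 30       # assumed per-cell overhead (ballpark only)

partition_bytes = ROWS * (HPATH_BYTES + OVERHEAD_BYTES)
partition_gb = partition_bytes / 1e9
print(f"{partition_gb:.1f} GB")  # → 8.0 GB, consistent with "several gigs"
```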
> 
> If this is bad, can somebody suggest a better pattern? This table was designed to support a query like this: select hpath from bdn_index_pub where tree = :tree and pord >= :start and pord <= :end. In my application, most trees will have fewer than a million records. A handful will have tens of millions, and one of them will have 100 million.
> 
> If I need to break up my rows, my first instinct would be to divide each tree into blocks of, say, 10,000 and change tree to a string that contains the tree and the block number. Something like this:
> 
> 17:0, 0, ‘/’
> …
> 17:0, 9999, ‘/a/b/c’
> 17:1, 10000, ‘/a/b/d’
> …
> 
> I’d then need to issue an extra query for ranges that crossed block boundaries.
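[The block scheme above can be sketched as follows. The '<tree>:<block>' key format and the block size of 10,000 are taken from the proposal; the helper names are hypothetical.]

```python
BLOCK_SIZE = 10_000  # rows per block, per the proposal above

def block_key(tree: int, block: int) -> str:
    """Partition key of the form '<tree>:<block>'."""
    return f"{tree}:{block}"

def range_queries(tree: int, start: int, end: int):
    """Split a [start, end] range on pord into one (key, lo, hi) sub-range
    per block; each tuple becomes a separate single-partition query."""
    queries = []
    for block in range(start // BLOCK_SIZE, end // BLOCK_SIZE + 1):
        lo = max(start, block * BLOCK_SIZE)
        hi = min(end, (block + 1) * BLOCK_SIZE - 1)
        queries.append((block_key(tree, block), lo, hi))
    return queries

# A range that crosses one block boundary becomes two queries:
print(range_queries(17, 9_990, 10_005))
# → [('17:0', 9990, 9999), ('17:1', 10000, 10005)]
```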
> 
> Any suggestions on a better pattern?
> 
> Thanks
> 
> Robert
> 
> From: Aaron Morton <aaron@thelastpickle.com>
> Reply-To: <user@cassandra.apache.org>
> Date: Tuesday, December 10, 2013 at 12:33 AM
> To: Cassandra User <user@cassandra.apache.org>
> Subject: Re: Exactly one wide row per node for a given CF?
> 
>>> But this becomes troublesome if I add or remove nodes. What I effectively want is to partition on the unique id of the record modulo N (id % N, where N is the number of nodes).
> This is exactly the problem consistent hashing (used by Cassandra) is designed to solve. If you hash the key and modulo the number of nodes, adding or removing nodes forces a lot of data to move. 
> 
>>> I want to be able to randomly distribute a large set of records but keep them clustered in one wide row per node.
> Sounds like you should revisit your data modelling; this is a pretty well-known anti-pattern. 
> 
> When rows get above a few tens of MB, things can slow down; when they get above 50 MB they can be a pain; when they get above 100 MB it’s a warning sign. And when they get above 1 GB, well, you don’t want to know what happens then. 
> 
> It’s a bad idea and you should take another look at the data model. If you have to do it, you can try the ByteOrderedPartitioner, which uses the row key as a token, giving you total control of the row placement. 
> 
> Cheers
> 
> 
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
> 
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
> 
> On 4/12/2013, at 8:32 pm, Vivek Mishra <mishra.vivs@gmail.com> wrote:
> 
>> So basically you want to create a cluster of multiple unique keys, but data which belongs to one unique key should be colocated. Correct?
>> 
>> -Vivek
>> 
>> 
>> On Tue, Dec 3, 2013 at 10:39 AM, onlinespending <onlinespending@gmail.com> wrote:
>>> Subject says it all. I want to be able to randomly distribute a large set of records but keep them clustered in one wide row per node.
>>> 
>>> As an example, let’s say I’ve got a collection of about 1 million records, each with a unique id. If I just go ahead and set the primary key (and therefore the partition key) as the unique id, I’ll get very good random distribution across my server cluster. However, each record will be its own row. I’d like to have each record belong to one large wide row (per server node) so I can have them sorted or clustered on some other column.
>>> 
>>> If, say, I have 5 nodes in my cluster, I could randomly assign a value of 1 - 5 at the time of creation and have the partition key set to this value. But this becomes troublesome if I add or remove nodes. What I effectively want is to partition on the unique id of the record modulo N (id % N, where N is the number of nodes).
>>> 
>>> I have to imagine there’s a mechanism in Cassandra to simply randomize the partitioning without even using a key (and then clustering on some column).
>>> 
>>> Thanks for any help.
>> 
> 

