incubator-cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Cold boot performance problems
Date Fri, 08 Oct 2010 17:53:17 GMT
A few things that can help:

In 0.6.5, enable the dynamic snitch with

-Dcassandra.dynamic_snitch_enabled=true

which, if you are doing a rolling restart, will let the other nodes route
around the slow node (at CL.ONE) until it has warmed up via the
background read repairs.
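If you launch nodes with the stock bin/cassandra script, one way to pass the property is to append it to JVM_OPTS; this is a sketch assuming the standard 0.6 tarball layout, where bin/cassandra sources conf/cassandra.in.sh (adjust the path for your packaging):

```shell
# Sketch: pass the dynamic snitch system property to the JVM at startup.
# Assumes the stock 0.6 tarball, where bin/cassandra sources
# conf/cassandra.in.sh; adjust the file path if your install differs.
JVM_OPTS="$JVM_OPTS -Dcassandra.dynamic_snitch_enabled=true"
```

Restart one node at a time so the rest of the ring stays up to route around the cold node.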

In 0.6.6, we've added save/load of the Cassandra caches:
https://issues.apache.org/jira/browse/CASSANDRA-1417
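Cache saving is configured per ColumnFamily; the fragment below is only an illustration of the idea, and the attribute names in it are hypothetical, so check the ticket above for the exact storage-conf.xml syntax shipped in 0.6.6:

```xml
<!-- Illustrative sketch only: the SavePeriod attribute names here are
     hypothetical; verify against CASSANDRA-1417 and the 0.6.6
     storage-conf.xml before use. -->
<ColumnFamily Name="Users"
              KeysCached="200000"
              RowsCached="10000"
              KeysCachedSavePeriodInSeconds="3600"
              RowsCachedSavePeriodInSeconds="3600"/>
```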

Finally: we recommend using RAID0 ephemeral disks on EC2 with L or XL
instance sizes for better I/O performance.  (Corey Hulen has some
numbers at http://www.coreyhulen.org/?p=326.)
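A provisioning sketch of that setup follows; the device names, filesystem, and mount point are assumptions (an EC2 XL instance exposes several ephemeral volumes, so list whichever devices your instance actually has), and the commands require root and destroy any data on the listed devices:

```shell
# Sketch: stripe two EC2 ephemeral disks into a RAID0 array and mount it
# as the Cassandra data directory. Device names (/dev/sdb, /dev/sdc),
# filesystem, and mount point are assumptions; adjust for your instance.
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc
mkfs.ext3 /dev/md0
mkdir -p /var/lib/cassandra
mount /dev/md0 /var/lib/cassandra
```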

On Fri, Oct 8, 2010 at 12:36 PM, Jason Horman <jhorman@gmail.com> wrote:
> We are experiencing very slow performance on Amazon EC2 after a cold boot.
> 10-20 tps. After the cache is primed things are much better, but it would be
> nice if users who aren't in cache didn't experience such slow performance.
> Before dumping a bunch of config I just had some general questions.
>
> We are using uuid keys, 40m of them, and the random partitioner. The typical
> access pattern is reading 200-300 keys in a single web request. Are uuid
> keys going to be painful because they are so random? Should we be using less
> random keys, maybe with a shard prefix (01-80), and make sure that our
> tokens group user data together on the cluster (via the order preserving
> partitioner)?
> Would the order preserving partitioner be a better option in the sense that
> it would group a single user's data onto a single set of machines (if we
> added a prefix to the uuid)?
> Is there any benefit to doing sharding of our own via keyspaces, i.e. 01-80
> keyspaces to split up the data files? (We already have 80 mysql shards we
> are migrating from, so doing this wouldn't be terrible implementation-wise.)
> Should a goal be to get the data/index files as small as possible? Is there
> a size at which they become problematic? (Amazon EC2/EBS, fyi)
>
> Via more servers
> Via more cassandra instances on the same server
> Via manual sharding by keyspace
> Via manual sharding by columnfamily
>
> Thanks,
> --
> -jason horman
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
