cassandra-user mailing list archives

From: Jason Horman <jhor...@gmail.com>
Subject: Re: Cold boot performance problems
Date: Fri, 08 Oct 2010 19:11:23 GMT
We are currently using EBS with 4 volumes striped with LVM. Wow, we
didn't realize you could RAID the ephemeral disks. I thought the
prevailing opinion for Cassandra, though, was that ephemeral disks were
dangerous. We have lost a few machines over the past year, but the
replicas should hopefully prevent real trouble.
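As a sanity check on that last point, here's a back-of-the-envelope sketch
(Python, with a made-up cluster size; it assumes SimpleStrategy-style
placement of replicas on consecutive ring nodes and independent, uniformly
random failures, which EC2 doesn't really guarantee):

    from math import factorial

    N = 20   # hypothetical cluster size
    RF = 3   # replication factor
    F = 3    # simultaneous, unrecoverable node losses

    # A key's RF replicas sit on RF consecutive nodes on the ring, so every
    # copy of some key is lost only when the F failed nodes include RF
    # consecutive ring positions. For F == RF there are exactly N such sets
    # out of C(N, F) equally likely ones.
    total_sets = factorial(N) // (factorial(F) * factorial(N - F))
    print("P(some key loses all replicas) ~= %.4f" % (N / float(total_sets)))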

What about the sharding strategies? Is it worth investigating sharding
out via multiple keyspaces? Would order preserving partitioning help
group a user's indexes together?
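For concreteness, here is the kind of prefixed key we're considering (just a
sketch; the shard count and key layout are made up, and it isn't tied to any
particular client library):

    import uuid
    import zlib

    NUM_SHARDS = 80  # mirrors our 80 MySQL shards

    def make_row_key(user_id, item_id=None):
        # Stable shard prefix (01-80) derived from the user id, so with the
        # order preserving partitioner all of one user's rows share a key
        # prefix and land on the same few nodes. Under the random
        # partitioner the prefix doesn't affect placement at all.
        shard = (zlib.crc32(user_id.encode("utf-8")) & 0xffffffff) % NUM_SHARDS + 1
        suffix = item_id if item_id is not None else uuid.uuid4().hex
        return "%02d:%s:%s" % (shard, user_id, suffix)

    # Every key for the same user starts with the same "NN:user:" prefix.
    print(make_row_key("user-1234"))
    print(make_row_key("user-1234"))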

On Fri, Oct 8, 2010 at 1:53 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
> Two things that can help:
>
> In 0.6.5, enable the dynamic snitch with
>
> -Dcassandra.dynamic_snitch_enabled=true
> -Dcassandra.dynamic_snitch=cassandra.dynamic_snitch_enabled
>
> which, if you are doing a rolling restart, will let other nodes route
> around the slow node (at CL.ONE) until it has warmed up (by the read
> repairs happening in the background).
>
> In 0.6.6, we've added save/load of the Cassandra caches:
> https://issues.apache.org/jira/browse/CASSANDRA-1417
>
> Finally: we recommend using raid0 ephemeral disks on EC2 with L or XL
> instance sizes for better i/o performance.  (Corey Hulen has some
> numbers at http://www.coreyhulen.org/?p=326.)
>
> On Fri, Oct 8, 2010 at 12:36 PM, Jason Horman <jhorman@gmail.com> wrote:
>> We are experiencing very slow performance on Amazon EC2 after a cold boot:
>> 10-20 tps. After the cache is primed things are much better, but it would be
>> nice if users whose data isn't in the cache didn't experience such slow
>> performance. Before dumping a bunch of config, I just had some general questions.
>>
>> We are using uuid keys, 40m of them, and the random partitioner. The typical
>> access pattern is reading 200-300 keys in a single web request.
>>
>> - Are uuid keys going to be painful because they are so random? Should we
>>   be using less random keys, maybe with a shard prefix (01-80), and make
>>   sure that our tokens group user data together on the cluster (via the
>>   order preserving partitioner)?
>> - Would the order preserving partitioner be a better option in the sense
>>   that it would group a single user's data onto a single set of machines
>>   (if we added a prefix to the uuid)?
>> - Is there any benefit to doing sharding of our own via keyspaces, say
>>   01-80 keyspaces to split up the data files? (We already have 80 MySQL
>>   shards we are migrating from, so doing this wouldn't be terrible
>>   implementation-wise.)
>> - Should a goal be to get the data/index files as small as possible? Is
>>   there a size at which they become problematic? (Amazon EC2/EBS, fyi)
>>   If so, what is the best way to split the data up:
>>   - Via more servers
>>   - Via more Cassandra instances on the same server
>>   - Via manual sharding by keyspace
>>   - Via manual sharding by column family
>>
>> Thanks,
>> --
>> -jason horman
>>
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>
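To make sure I understand the dynamic snitch suggestion above: as I read it,
each node keeps a latency score per replica and sends CL.ONE reads to the
best-scoring one, so a freshly restarted node gets skipped until background
read repair warms it up. Here's a toy sketch of that idea (purely conceptual,
not Cassandra's actual implementation, and the numbers are invented):

    import random

    class LatencyScores(object):
        # Smoothed (exponentially weighted) read latency per replica.
        def __init__(self, alpha=0.2):
            self.alpha = alpha
            self.scores = {}

        def record(self, replica, latency_ms):
            prev = self.scores.get(replica, latency_ms)
            self.scores[replica] = (1 - self.alpha) * prev + self.alpha * latency_ms

        def pick(self, replicas):
            # Send the CL.ONE read to the replica with the lowest smoothed
            # latency; replicas we haven't measured yet get an average score
            # so they still receive some traffic.
            default = (sum(self.scores.values()) / len(self.scores)) if self.scores else 0.0
            return min(replicas, key=lambda r: self.scores.get(r, default))

    snitch = LatencyScores()
    # One warm node (~5 ms reads) and one cold, just-restarted node (~80 ms).
    for _ in range(50):
        snitch.record("10.0.0.1", random.gauss(5, 1))
        snitch.record("10.0.0.2", random.gauss(80, 10))
    print(snitch.pick(["10.0.0.1", "10.0.0.2"]))  # almost always the warm node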



-- 
-jason
