incubator-cassandra-user mailing list archives

From Chris Goffinet <...@chrisgoffinet.com>
Subject Re: Cassandra cluster HW spec (commit log directory vs data file directory)
Date Mon, 31 Oct 2011 01:53:35 GMT
On Sun, Oct 30, 2011 at 3:34 PM, Sorin Julean <sorin.julean@gmail.com> wrote:

> Hey Chris,
>
>  Thanks for sharing all the info.
>  I have a few questions:
>  1. What are you doing with so much memory :)? How much of it do you
> allocate for the heap?
>

Max heap is 12GB. We use the rest for cache: we run memcache on each node
and allocate the remaining memory to it.


>  2. What's your network speed? Do you use trunks? Do you have a dedicated
> VLAN for gossip/store traffic?
>

No dedicated VLAN for gossip. We run at 2Gb/s. We have bonded NICs.



> Cheers,
> Sorin
>
>
> On Sun, Oct 30, 2011 at 5:00 AM, Chris Goffinet <cg@chrisgoffinet.com> wrote:
>
>> RE: RAID0 Recommendation
>>
>> Cassandra supports multiple data file directories. Because we do
>> compactions, it's just much easier to deal with one data file directory
>> that is striped across all disks as a single volume (RAID0). There are
>> other ways to accomplish this, though. At Twitter we use software RAID
>> (RAID0 & RAID10).
>>
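A minimal sketch of that kind of single striped data volume, assuming four
data disks at /dev/sdb-/dev/sde and 0.8-era cassandra.yaml settings (device
names and mount points are illustrative only):

    # Stripe the data disks into one RAID0 volume and mount it
    mdadm --create /dev/md0 --level=0 --raid-devices=4 \
        /dev/sdb /dev/sdc /dev/sdd /dev/sde
    mkfs.ext4 /dev/md0
    mount /dev/md0 /var/lib/cassandra/data

    # cassandra.yaml then needs only one data directory:
    #   data_file_directories:
    #       - /var/lib/cassandra/data
    #   commitlog_directory: /var/lib/cassandra/commitlog   # separate disk
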
>> We own the physical hardware and have found that, even compared with
>> hardware RAID, software RAID in Linux is actually faster. The reason is:
>>
>> http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10
>>
>> We have found that the far-copies layout is much faster than near-copies.
>> We set the I/O scheduler to noop at the moment. We might move back to CFQ
>> with more tuning in the future.
>>
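A rough sketch of that setup (device names are assumptions; "f2" selects
md's far-copies layout, while "n2" would be near-copies):

    # Software RAID10 with the far-copies layout across four disks
    mdadm --create /dev/md0 --level=10 --layout=f2 --raid-devices=4 \
        /dev/sdb /dev/sdc /dev/sdd /dev/sde

    # Use the noop I/O scheduler on each member disk
    echo noop > /sys/block/sdb/queue/scheduler
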
>> We use RAID10 for cases where we need better disk performance because we
>> are hitting the disk often, sacrificing storage capacity. We initially
>> thought RAID0 would be faster than RAID10, until we found out about the
>> near vs. far layouts.
>>
>> RE: Hardware
>>
>> This is going to depend on how automated your infrastructure is, but we
>> chose the path of finding the cheapest servers we could get from
>> Dell/HP/etc.: 8/12 cores, 72GB of memory per node, 2TB/3TB of storage,
>> 2.5" drives.
>>
>> We are in the process of making changes to our servers; I'll report back
>> when we have more details to share.
>>
>> I wouldn't recommend 75 CFs. It could work, but it just seems too complex.
>>
>> Another recommendation for clusters: always go big. You will be thankful
>> for this in the future. Even if you can do this on 3-6 nodes, go much
>> larger for future expansion. If you own your hardware and racks, I
>> recommend making sure to size out the rack diversity and # of nodes per
>> rack. Also take the replication factor into account when doing this. With
>> RF=3, you should have a minimum of 3 racks, and the # of nodes per rack
>> should be divisible by the replication factor. This has worked out pretty
>> well for us. Our biggest problems today are adding 100s of nodes to
>> existing clusters at once. I'm not sure how many other companies are
>> having this problem, but it's certainly on our radar to improve, if you
>> get to that point :)
>>
>>
>> On Tue, Oct 25, 2011 at 5:23 AM, Alexandru Sicoe <adsicoe@gmail.com> wrote:
>>
>>> Hi everyone,
>>>
>>> I am currently in the process of writing a hardware proposal for a
>>> Cassandra cluster for storing a lot of monitoring time series data. My
>>> workload is write intensive and my data set is extremely varied in the
>>> types of variables and the insertion rates for those variables (I will
>>> have to handle on the order of 2 million variables coming in, each at
>>> very different rates - the majority of them will come in at very low
>>> rates, but many will come in at higher, constant rates, and a few will
>>> come in with huge spikes in rate). These variables correspond to all
>>> basic C++ types and arrays of these types. The highest insertion rates
>>> are received for basic types, of which U32 variables seem to be the most
>>> prevalent (e.g. I recorded 2 million U32 vars inserted in 8 minutes of
>>> operation, while 600,000 doubles and 170,000 strings were inserted during
>>> the same time. Note this measurement was only for a subset of the total
>>> data currently taken in).
>>>
>>> At the moment I am partitioning the data in Cassandra into 75 CFs (each
>>> CF corresponds to a logical partitioning of the set of variables
>>> mentioned before - but this partitioning is not related to the amount of
>>> data or the rates... it is somewhat random). These 75 CFs account for ~1
>>> million of the variables I need to store. I have a 3-node Cassandra 0.8.5
>>> cluster (each node has 4 physical cores and 4 GB RAM, with the commit log
>>> directory and the data file directory split between two RAID arrays of
>>> HDDs). I can handle the load in this configuration, but the average CPU
>>> usage of the Cassandra nodes is slightly above 50%. As I will need to add
>>> 12 more CFs (corresponding to another ~1 million variables), plus
>>> potentially other data later, it is clear that I need better hardware
>>> (also for the retrieval part).
>>>
>>> I am looking at Dell servers (PowerEdge, etc.).
>>>
>>> Questions:
>>>
>>> 1. Is anyone using Dell HW for their Cassandra clusters? How do they
>>> behave? Anybody care to share their configurations, tips for buying, what
>>> to avoid, etc.?
>>>
>>> 2. Obviously I am going to follow the advice on
>>> http://wiki.apache.org/cassandra/CassandraHardware and split the
>>> commitlog and data onto separate disks. I was going to use an SSD for the
>>> commitlog, but then did some more research and found out that it doesn't
>>> make sense to use SSDs for sequential appends, because they won't have a
>>> performance advantage over rotational media. So I am going to use a
>>> rotational disk for the commit log and an SSD for data. Does this make
>>> sense?
>>>
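In cassandra.yaml terms, that split is just the two directory settings
pointing at different devices - a sketch with assumed mount points (the
exact paths are illustrative):

    commitlog_directory: /mnt/hdd/cassandra/commitlog   # rotational disk
    data_file_directories:
        - /mnt/ssd/cassandra/data                        # SSD
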
>>> 3. What's the best way to find out how big my commitlog disk and my data
>>> disk have to be? The Cassandra hardware page says the commitlog disk
>>> doesn't need to be big, but I still need to choose a size!
>>>
>>> 4. I also noticed that a RAID 0 configuration is recommended for the data
>>> file directory. Can anyone explain why?
>>>
>>> Sorry for the huge email.....
>>>
>>> Cheers,
>>> Alex
>>>
>>
>>
>
