cassandra-user mailing list archives

From Mohit Anchlia <>
Subject Re: Cassandra cluster HW spec (commit log directory vs data file directory)
Date Mon, 31 Oct 2011 02:31:00 GMT
On Sun, Oct 30, 2011 at 6:53 PM, Chris Goffinet <> wrote:
> On Sun, Oct 30, 2011 at 3:34 PM, Sorin Julean <>
> wrote:
>> Hey Chris,
>>  Thanks for sharing all the info.
>>  I have a few questions:
>>  1. What are you doing with so much memory :) ? How much of it do you
>> allocate for heap?
> Max heap is 12GB. We use the rest for cache: we run memcache on each node
> and allocate the remaining memory to that.

Is this using Cassandra's off-heap cache?

>>  2. What is your network speed? Do you use trunks? Do you have a dedicated
>> VLAN for gossip/store traffic?
> No dedicated VLAN for gossip. We run at 2Gb/s. We have bonded NICs.
>> Cheers,
>> Sorin
>> On Sun, Oct 30, 2011 at 5:00 AM, Chris Goffinet <>
>> wrote:
>>> RE: RAID0 Recommendation
>>> Cassandra supports multiple data file directories. Because we do
>>> compactions, it's just much easier to deal with one data file directory that
>>> is striped across all disks as one volume (RAID0). There are other ways to
>>> accomplish this though. At Twitter we use software RAID (RAID0 & RAID10).
>>> We own the physical hardware and have found that even with hardware RAID,
>>> software RAID in Linux is actually faster. The reason is:
>>> We have found that using far-copies is much faster than near-copies. We
>>> set the I/O scheduler to noop at the moment. We might move back to CFQ with
>>> more tuning in the future.
>>> We use RAID10 for cases where we need better disk performance if we are
>>> hitting the disk often, sacrificing storage. We initially thought RAID0
>>> should be faster than RAID10 until we found out about the near vs far
>>> layouts.
>>> RE: Hardware
>>> This is going to depend on how good your automated infrastructure is, but
>>> we chose the path of finding the cheapest servers we could get from
>>> Dell/HP/etc. 8/12 cores, 72GB memory per node, 2TB/3TB, 2.5".
>>> We are in the process of making changes to our servers; I'll report back
>>> when we have more details to share.
>>> I wouldn't recommend 75 CFs. It could work but just seems too complex.
>>> Another recommendation for clusters: always go big. You will be thankful
>>> in the future for this. Even if you can do this on 3-6 nodes, go much larger
>>> for future expansion. If you own your hardware and racks, I recommend making
>>> sure to size out the rack diversity and # of nodes per rack. Also take into
>>> account the replication factor when doing this. With RF=3, there should be a
>>> minimum of 3 racks, and # of nodes per rack should be divisible by the replication
>>> factor. This has worked out pretty well for us. Our biggest problems today
>>> are adding 100s of nodes to existing clusters at once. I'm not sure how many
>>> other companies are having this problem, but it's certainly on our radar to
>>> improve, if you get to that point :)
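The rack sizing rule of thumb quoted above (at least as many racks as the replication factor, and a per-rack node count divisible by the replication factor) can be sketched as a quick sanity check. This is only an illustration of that rule as stated in the message; the function name `check_rack_layout` and the rack names are hypothetical, and nothing here calls into Cassandra itself.

```python
def check_rack_layout(nodes_per_rack, replication_factor):
    """Return a list of problems with a planned rack layout.

    nodes_per_rack: dict mapping a (hypothetical) rack name to its node count.
    replication_factor: the RF the cluster will run with.
    Applies only the two rules of thumb from the message above.
    """
    problems = []
    # Rule 1: at least as many racks as the replication factor.
    if len(nodes_per_rack) < replication_factor:
        problems.append(
            f"only {len(nodes_per_rack)} rack(s); want at least {replication_factor}"
        )
    # Rule 2: each rack's node count should be divisible by the replication factor.
    for rack, count in sorted(nodes_per_rack.items()):
        if count % replication_factor != 0:
            problems.append(
                f"{rack}: {count} node(s) is not divisible by {replication_factor}"
            )
    return problems


if __name__ == "__main__":
    # Example layout (made-up numbers): rack-c breaks the divisibility rule.
    layout = {"rack-a": 6, "rack-b": 6, "rack-c": 4}
    for problem in check_rack_layout(layout, replication_factor=3):
        print(problem)
```

An empty list means the layout satisfies both rules; it says nothing about capacity or token assignment, which still need to be planned separately.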
>>> On Tue, Oct 25, 2011 at 5:23 AM, Alexandru Sicoe <>
>>> wrote:
>>>> Hi everyone,
>>>> I am currently in the process of writing a hardware proposal for a
>>>> Cassandra cluster for storing a lot of monitoring time series data. My
>>>> workload is write intensive and my data set is extremely varied in the types
>>>> of variables and the insertion rate for these variables (I will have to handle on the
>>>> order of 2 million variables coming in, each at very different rates - the
>>>> majority of them will come at very low rates but there are many that will
>>>> come at higher constant rates and a few coming in with huge spikes in
>>>> rate). These variables correspond to all basic C++ types and arrays of
>>>> these types. The highest insertion rates are received for basic types, out
>>>> of which U32 variables seem to be the most prevalent (e.g. I recorded that 2
>>>> million U32 vars were inserted in 8 mins of operation, while 600,000 doubles
>>>> and 170,000 strings were inserted during the same time. Note this
>>>> measurement was only for a subset of the total data currently taken in).
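For sizing purposes, the 8-minute measurement quoted above can be turned into average inserts per second. The sketch below uses only the three counts given in the message and assumes the dotted figures are thousands separators (600,000 and 170,000); the names `counts` and `per_second` are illustrative. These are averages over the window, so they hide the spikes the poster mentions, and they cover only a subset of the data.

```python
# Measurement window from the message: 8 minutes of operation.
WINDOW_SECONDS = 8 * 60

# Insert counts per value type over that window, as quoted in the message
# (assumption: "600.000" and "170.000" mean 600,000 and 170,000).
counts = {
    "u32": 2_000_000,
    "double": 600_000,
    "string": 170_000,
}


def per_second(counts_by_type, window_seconds):
    """Average inserts per second for each type over the window."""
    return {name: n / window_seconds for name, n in counts_by_type.items()}


rates = per_second(counts, WINDOW_SECONDS)
total_rate = sum(rates.values())

if __name__ == "__main__":
    for name, rate in rates.items():
        print(f"{name}: {rate:.0f} inserts/s on average")
    print(f"total: {total_rate:.0f} inserts/s on average")
```

With these inputs the averages come to roughly 4,167 U32, 1,250 double, and 354 string inserts per second, or about 5,771 per second in total across the three-node cluster.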
>>>> At the moment I am partitioning the data in Cassandra in 75 CFs (each CF
>>>> corresponds to a logical partitioning of the set of variables mentioned
>>>> before - but this partitioning is not related to the amount of data or
>>>> is somewhat random). These 75 CFs account for ~1 million of the
>>>> variables I need to store. I have a 3 node Cassandra 0.8.5 cluster (each
>>>> node has 4 real cores and 4 GB RAM, with the commit log directory and data
>>>> file directory split between two RAID arrays with HDDs). I can handle the load
>>>> with this configuration but the average CPU usage of the Cassandra nodes is
>>>> slightly above 50%. As I will need to add 12 more CFs (corresponding to
>>>> another ~ 1 million variables) plus potentially other data later, it is
>>>> clear that I need better hardware (also for the retrieval part).
>>>> I am looking at Dell servers (PowerEdge etc.).
>>>> Questions:
>>>> 1. Is anyone using Dell HW for their Cassandra clusters? How do they
>>>> behave? Anybody care to share their configurations or tips for buying, what
>>>> to avoid etc?
>>>> 2. Obviously I am going to keep to the advice on the
>>>> and split the commitlog
>>>> and data on separate disks. I was going to use SSD for commitlog but then
>>>> did some more research and found out that it doesn't make sense to use SSDs
>>>> for sequential appends because it won't have a performance advantage with
>>>> respect to rotational media. So I am going to use rotational disk for the
>>>> commit log and an SSD for data. Does this make sense?
>>>> 3. What's the best way to find out how big my commitlog disk and my data
>>>> disk have to be? The Cassandra hardware page says the commitlog disk
>>>> shouldn't be big but still I need to choose a size!
>>>> 4. I also noticed RAID 0 configuration is recommended for the data file
>>>> directory. Can anyone explain why?
>>>> Sorry for the huge email.....
>>>> Cheers,
>>>> Alex
