incubator-cassandra-user mailing list archives

From Chris Goffinet <...@chrisgoffinet.com>
Subject Re: Cassandra cluster HW spec (commit log directory vs data file directory)
Date Mon, 31 Oct 2011 04:47:09 GMT
No. We built a pluggable cache provider for memcache.
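For reference, a rough sketch of how a non-default provider gets attached to a
column family in 1.0-era cassandra-cli (MemcachedRowCacheProvider is a made-up
stand-in name, not the actual class we run):

    update column family users
        with rows_cached = 200000
        and row_cache_provider = 'MemcachedRowCacheProvider';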

On Sun, Oct 30, 2011 at 7:31 PM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:

> On Sun, Oct 30, 2011 at 6:53 PM, Chris Goffinet <cg@chrisgoffinet.com>
> wrote:
> >
> >
> > On Sun, Oct 30, 2011 at 3:34 PM, Sorin Julean <sorin.julean@gmail.com>
> > wrote:
> >>
> >> Hey Chris,
> >>
> >>  Thanks for sharing all the info.
> >>  I have a few questions:
> >>  1. What are you doing with so much memory? :) How much of it do you
> >> allocate for the heap?
> >
> > Max heap is 12GB. We use the rest for cache: we run memcache on each node
> > and allocate the remaining memory to that.
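> > Concretely, the heap cap lives in conf/cassandra-env.sh (the new-gen size
> > below is just an illustrative value):
> >
> >     MAX_HEAP_SIZE="12G"
> >     HEAP_NEWSIZE="800M"
> >
> > Whatever RAM is left over goes to memcached and the OS page cache.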
>
> Is this using the off-heap cache of Cassandra?
>
> >
> >>
> >>  2. What's your network speed? Do you use trunks? Do you have a
> >> dedicated VLAN for gossip/store traffic?
> >>
> > No dedicated VLAN for gossip. We run at 2Gb/s. We have bonded NICs.
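> > (Roughly, a bonding setup along these lines - the interface names and the
> > bonding mode are illustrative, not necessarily what we run:
> >
> >     modprobe bonding mode=balance-rr miimon=100
> >     ip link set bond0 up
> >     ifenslave bond0 eth0 eth1
> >     ip addr add 10.0.0.5/24 dev bond0
> >
> > Two slaved 1GbE ports is what gives the 2Gb/s figure above.)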
> >
> >>
> >> Cheers,
> >> Sorin
> >>
> >>
> >> On Sun, Oct 30, 2011 at 5:00 AM, Chris Goffinet <cg@chrisgoffinet.com>
> >> wrote:
> >>>
> >>> RE: RAID0 Recommendation
> >>> Cassandra supports multiple data file directories. Because we do
> >>> compactions, it's just much easier to deal with one data file directory
> >>> that is striped across all disks as a single volume (RAID0). There are
> >>> other ways to accomplish this, though. At Twitter we use software RAID
> >>> (RAID0 & RAID10). We own the physical hardware and have found that even
> >>> with hardware RAID, software RAID in Linux is actually faster. The
> >>> reason is:
> >>> http://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10
> >>> We have found that using far-copies is much faster than near-copies. We
> >>> set the I/O scheduler to noop at the moment. We might move back to CFQ
> >>> with more tuning in the future.
> >>> We use RAID10 for cases where we need better disk performance because
> >>> we are hitting the disk often, sacrificing storage. We initially
> >>> thought RAID0 would be faster than RAID10, until we found out about the
> >>> near vs. far layouts.
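> >>> To make that concrete, a rough sketch of the far-layout setup (device
> >>> names are illustrative):
> >>>
> >>>     mdadm --create /dev/md0 --level=10 --layout=f2 \
> >>>         --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
> >>>     echo noop > /sys/block/sdb/queue/scheduler
> >>>
> >>> --layout=f2 means two far copies (n2 would be near copies), and the
> >>> scheduler setting is applied per underlying device.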
> >>> RE: Hardware
> >>> This is going to depend on how automated your infrastructure is, but we
> >>> chose the path of finding the cheapest servers we could get from
> >>> Dell/HP/etc.: 8/12 cores, 72GB memory per node, 2TB/3TB disks, 2.5".
> >>> We are in the process of making changes to our servers; I'll report
> >>> back when we have more details to share.
> >>> I wouldn't recommend 75 CFs. It could work, but it just seems too
> >>> complex.
> >>> Another recommendation for clusters: always go big. You will be
> >>> thankful for it in the future. Even if you can do this on 3-6 nodes, go
> >>> much larger for future expansion. If you own your hardware and racks, I
> >>> recommend sizing out the rack diversity and the number of nodes per
> >>> rack. Also take the replication factor into account when doing this:
> >>> with RF=3 there should be a minimum of 3 racks, and the number of nodes
> >>> per rack should be divisible by the replication factor. This has worked
> >>> out pretty well for us.
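> >>> For example (illustrative numbers, assuming rack-aware replica
> >>> placement): RF=3 across three racks of 12 nodes each gives a 36-node
> >>> cluster where each rack holds one replica of every row, so losing an
> >>> entire rack still leaves two live replicas.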
> >>> Our biggest problem today is adding 100s of nodes to existing clusters
> >>> at once. I'm not sure how many other companies are having this problem,
> >>> but it's certainly on our radar to improve, if you get to that point :)
> >>>
> >>> On Tue, Oct 25, 2011 at 5:23 AM, Alexandru Sicoe <adsicoe@gmail.com>
> >>> wrote:
> >>>>
> >>>> Hi everyone,
> >>>>
> >>>> I am currently in the process of writing a hardware proposal for a
> >>>> Cassandra cluster for storing a lot of monitoring time series data. My
> >>>> workload is write-intensive and my data set is extremely varied in the
> >>>> types of variables and their insertion rates (I will have to handle on
> >>>> the order of 2 million variables coming in, each at a very different
> >>>> rate - the majority of them will come at very low rates, but many will
> >>>> come at higher, constant rates, and a few come in with huge spikes in
> >>>> rate). These variables correspond to all basic C++ types and arrays of
> >>>> these types. The highest insertion rates are received for the basic
> >>>> types, of which U32 variables seem to be the most prevalent (e.g. I
> >>>> recorded 2 million U32 vars inserted in 8 mins of operation, while
> >>>> 600,000 doubles and 170,000 strings were inserted during the same
> >>>> time; note this measurement covered only a subset of the total data
> >>>> currently taken in).
> >>>>
> >>>> At the moment I am partitioning the data in Cassandra into 75 CFs
> >>>> (each CF corresponds to a logical partitioning of the set of variables
> >>>> mentioned before - but this partitioning is not related to the amount
> >>>> of data or the rates... it is somewhat random). These 75 CFs account
> >>>> for ~1 million of the variables I need to store. I have a 3-node
> >>>> Cassandra 0.8.5 cluster (each node has 4 real cores and 4 GB RAM, with
> >>>> the commit log directory and the data file directory split between two
> >>>> RAID arrays of HDDs). I can handle the load in this configuration, but
> >>>> the average CPU usage of the Cassandra nodes is slightly above 50%. As
> >>>> I will need to add 12 more CFs (corresponding to another ~1 million
> >>>> variables), plus potentially other data later, it is clear that I need
> >>>> better hardware (also for the retrieval part).
> >>>>
> >>>> I am looking at Dell servers (PowerEdge, etc.).
> >>>>
> >>>> Questions:
> >>>>
> >>>> 1. Is anyone using Dell HW for their Cassandra clusters? How do they
> >>>> behave? Anybody care to share their configurations, tips for buying,
> >>>> what to avoid, etc.?
> >>>>
> >>>> 2. Obviously I am going to keep to the advice on
> >>>> http://wiki.apache.org/cassandra/CassandraHardware and split the
> >>>> commitlog and data onto separate disks. I was going to use an SSD for
> >>>> the commitlog, but then did some more research and found out that it
> >>>> doesn't make sense to use SSDs for sequential appends, because they
> >>>> won't have a performance advantage over rotational media. So I am
> >>>> going to use a rotational disk for the commit log and an SSD for data.
> >>>> Does this make sense?
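> >>>> (For concreteness, the split I have in mind in cassandra.yaml - the
> >>>> mount points are made up:
> >>>>
> >>>>     commitlog_directory: /mnt/hdd/cassandra/commitlog
> >>>>     data_file_directories:
> >>>>         - /mnt/ssd/cassandra/data
> >>>>
> >>>> with /mnt/hdd on a rotational disk and /mnt/ssd on the SSD.)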
> >>>>
> >>>> 3. What's the best way to find out how big my commitlog disk and my
> >>>> data disk have to be? The Cassandra hardware page says the commitlog
> >>>> disk doesn't need to be big, but I still need to choose a size!
> >>>>
> >>>> 4. I also noticed a RAID0 configuration is recommended for the data
> >>>> file directory. Can anyone explain why?
> >>>>
> >>>> Sorry for the huge email...
> >>>>
> >>>> Cheers,
> >>>> Alex
> >>>
> >>
> >
> >
>
