cassandra-user mailing list archives

From Ben Standefer <...@simplegeo.com>
Subject Re: Cassandra in the cloud
Date Thu, 03 Jun 2010 21:57:04 GMT
The commit log and data directory are on the same mounted directory
structure (the 2 RAID 0 striped ephemeral disks) rather than using 1
of the ephemeral disks for the commit log and 1 of the ephemeral disks
for the data directory.  While it's usually advised that for disk
utilization reasons you keep the commit logs and data directory on
separate disks, our RAID0 configuration gives us much more space for
the data directory without having to mess with EBSes.  We've found it
to be fine for now.
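
One way to check whether a node has this layout is to compare the
device IDs of the two directories. This is only a minimal sketch: the
function name and the two paths are placeholders chosen for
illustration, not taken from our setup.

```python
import os


def same_device(path_a: str, path_b: str) -> bool:
    """Return True if both paths live on the same mounted filesystem."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Hypothetical paths: substitute the directories from your own config.
    commitlog_dir = "/var/lib/cassandra/commitlog"
    data_dir = "/var/lib/cassandra/data"
    try:
        if same_device(commitlog_dir, data_dir):
            print("commit log and data share one device")
        else:
            print("commit log and data are on separate devices")
    except FileNotFoundError as exc:
        print(f"path not found: {exc.filename}")
```

With both directories on one striped array, the check reports a shared
device, which is the trade-off described above: more space, but commit
log writes and data reads compete for the same disks.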

I see how my XFS snapshots reference was confusing.  Our plan is to
have a single AZ use EBSes for the data directory so that we can more
easily snapshot our data (trusting our AZ-aware EndPointSnitch to keep
a replica in that AZ), while other AZs will continue to use ephemeral
drives.
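
To make the AZ-aware idea concrete, here is a rough Python sketch of
replica placement that spreads copies across AZs. It is not our
EndPointSnitch and not Cassandra's API; pick_replicas, the ring layout,
and the node and AZ names are all made up for illustration.

```python
def pick_replicas(ring, start, replication_factor):
    """Choose replicas from a ring, preferring AZs not yet represented.

    ring is a list of (node, az) tuples in token order; start is the
    index of the node that owns the key.
    """
    n = len(ring)
    # Walk the ring clockwise starting at the owning node.
    ordered = [ring[(start + i) % n] for i in range(n)]
    chosen, seen_azs = [], set()
    # First pass: take at most one node from each AZ.
    for node, az in ordered:
        if len(chosen) == replication_factor:
            break
        if az not in seen_azs:
            chosen.append(node)
            seen_azs.add(az)
    # Second pass: fill any remaining slots in ring order.
    for node, _az in ordered:
        if len(chosen) == replication_factor:
            break
        if node not in chosen:
            chosen.append(node)
    return chosen


# Hypothetical four-node ring spread over three AZs.
ring = [("n1", "us-east-1a"), ("n2", "us-east-1a"),
        ("n3", "us-east-1b"), ("n4", "us-east-1c")]
print(pick_replicas(ring, 0, 2))  # ['n1', 'n3']
```

With placement like this, every AZ that has nodes ends up holding a
copy of each key once the replication factor is at least the number of
AZs, which is what would make snapshotting a single AZ sufficient.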

-Ben Standefer


On Thu, Jun 3, 2010 at 1:26 PM, Mike Subelsky <mike@subelsky.com> wrote:
> Ben,
>
> do you just keep the commit log on the ephemeral drive?  Or data and
> commit? (I was confused by your reference to XFS and snapshots -- I
> assume you keep data on the XFS drive)
>
> -Mike
>
> On Thu, Jun 3, 2010 at 2:29 PM, Ben Standefer <ben@simplegeo.com> wrote:
>> We're using Cassandra on AWS at SimpleGeo.  We software RAID 0 stripe
>> the ephemeral drives to achieve better I/O and have machines in
>> multiple Availability Zones with a custom EndPointSnitch that
>> replicates the data between AZs for high availability (to be
>> open-sourced/contributed at some point).
>>
>> Using XFS as described here
>> http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1663
>> also makes it very easy to snapshot your cluster to S3.
>>
>> We've had no real problems with EC2 and Cassandra; it's been great.
>>
>> -Ben Standefer
>>
>>
>> On Thu, Jun 3, 2010 at 11:51 AM, Eric Evans <eevans@rackspace.com> wrote:
>>> On Thu, 2010-06-03 at 11:29 +0300, David Boxenhorn wrote:
>>>> We want to try out Cassandra in the cloud. Any recommendations?
>>>> Comments?
>>>>
>>>> Should we use Amazon? Rackspace? Something else?
>>>
>>> I personally haven't used Cassandra on EC2, but others have reported
>>> significantly better disk I/O (and hence better performance) with
>>> Rackspace's Cloud Servers.
>>>
>>> Full disclosure though, I work for Rackspace. :)
>>>
>>> --
>>> Eric Evans
>>> eevans@rackspace.com
>>>
>>>
>>
>
>
>
> --
> Mike Subelsky
> oib.com // ignitebaltimore.com // subelsky.com
> @subelsky // (410) 929-4022
>
