Date: Wed, 31 Oct 2012 00:11:37 +0000
Subject: Re: idea drive layout - 4 drives + RAID question
From: Ran User
To: user@cassandra.apache.org

Is there a concern of a large falloff in commit log write performance
(sequential) when sharing 2 drives (RAID 1) with the OS (OS and services
writing their own logs, etc.)? Do you expect the hit to be marginal?

On Tue, Oct 30, 2012 at 7:58 PM, aaron morton wrote:

> We also have 4-disk nodes, and we use the following layout:
> 2 x OS + Commit in RAID 1
> 2 x Data disk in RAID 0
>
> +1
>
> You are replicating data at the application level and want the fastest
> possible IO performance per node.
>
> You can already distribute the
> individual Cassandra column families on different drives by just
> setting up symlinks to the individual folders.
>
> There are some features coming in 1.2 that make using a JBOD setup easier.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 30/10/2012, at 9:23 PM, Pieter Callewaert <
> pieter.callewaert@be-mobile.be> wrote:
>
> We also have 4-disk nodes, and we use the following layout:
> 2 x OS + Commit in RAID 1
> 2 x Data disk in RAID 0
>
> This gives us the advantage that we never have to reinstall the node when a
> drive crashes.
>
> Kind regards,
> Pieter
>
> From: Ran User [mailto:ranuser99@gmail.com]
> Sent: Tuesday, 30 October 2012 4:33
> To: user@cassandra.apache.org
> Subject: Re: idea drive layout - 4 drives + RAID question
>
> Have you considered running RAID 10 for the data drives to improve MTBF?
>
> On one hand Cassandra is handling redundancy issues; on the other
> hand, reducing the frequency of dealing with failed nodes
> is attractive if cheap (switching RAID levels to 10).
>
> We have no experience with software RAID (have always used hardware RAID
> with BBU). I'm assuming software RAID 1 or 10 (the mirroring part) is
> inherently reliable (perhaps minus some edge case).
>
> On Tue, Oct 30, 2012 at 1:07 AM, Tupshin Harper wrote:
>
> I would generally recommend 1 drive for OS and commit log and 3 drive raid
> 0 for data. The raid does give you good performance benefit, and it can be
> convenient to have the OS on a side drive for configuration ease and better
> MTBF.
>
> -Tupshin
>
> On Oct 29, 2012 8:56 PM, "Ran User" wrote:
>
> I was hoping to achieve approx. 2x IO (write and read) performance via
> RAID 0 (by accepting a lower MTBF).
>
> Do you believe the performance gains of RAID 0 are much lower and/or are not
> worth it vs the increased server failure rate?
>
> From my understanding, RAID 10 would achieve the read performance benefits
> of RAID 0, but not the write benefits. I'm also considering RAID 10 to
> maximize server IO performance.
>
> Currently, we're working with 1 CF.
>
> Thank you
>
> On Mon, Oct 29, 2012 at 11:51 PM, Timmy Turner wrote:
>
> I'm not sure whether the raid 0 gets you anything other than headaches
> should one of the drives fail. You can already distribute the
> individual Cassandra column families on different drives by just
> setting up symlinks to the individual folders.
>
> 2012/10/30 Ran User :
> > For a server with 4 drive slots only, I'm thinking:
> >
> > either:
> >
> > - OS (1 drive)
> > - Commit Log (1 drive)
> > - Data (2 drives, software raid 0)
> >
> > vs
> >
> > - OS + Data (3 drives, software raid 0)
> > - Commit Log (1 drive)
> >
> > or something else?
> >
> > also, if I can spare the wasted storage, would RAID 10 for cassandra data
> > improve read performance and have no effect on write performance?
> >
> > Thank you!
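
A rough way to compare the array-loss risk of the layouts discussed in this thread, as a minimal Python sketch. It assumes drives fail independently with the same probability `p` over some period and that a failed drive is not replaced within that period, so it ignores rebuild windows and correlated failures. The value of `p` and the function names are illustrative assumptions, not measurements and not anything provided by Cassandra or a RAID tool.

```python
# Simplified model (assumption): every drive fails independently with the
# same probability p over the period considered, and nothing is replaced
# or rebuilt during that period.

def raid0_loss(p: float, drives: int) -> float:
    """RAID 0 (striping): the array is lost if any single drive fails."""
    return 1 - (1 - p) ** drives


def raid1_loss(p: float, drives: int = 2) -> float:
    """RAID 1 (mirroring): the array is lost only if every mirror fails."""
    return p ** drives


def raid10_loss(p: float, pairs: int) -> float:
    """RAID 10: the array is lost if both drives of any mirrored pair fail."""
    return 1 - (1 - p ** 2) ** pairs


if __name__ == "__main__":
    p = 0.05  # hypothetical per-drive failure probability, for illustration only
    layouts = {
        "2-drive RAID 0 (data)": raid0_loss(p, 2),
        "3-drive RAID 0 (data)": raid0_loss(p, 3),
        "2-drive RAID 1 (OS + commit log)": raid1_loss(p),
        "4-drive RAID 10 (2 mirrored pairs)": raid10_loss(p, 2),
    }
    for name, probability in layouts.items():
        print(f"{name}: {probability:.4%}")
```

Under this model the RAID 0 figure grows with every drive added to the stripe, while the mirrored layouts stay close to `p` squared per pair. Whether that difference matters depends on how costly it is to rebuild a node from its replicas, which is the trade-off raised above.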