From James Briggs <>
Subject Re: JBOD disk failure - just say no
Date Tue, 21 Aug 2018 00:48:53 GMT
Cassandra JBOD has a bunch of issues, so I don't recommend it for production:
1) Disks fill up with load (data) unevenly, meaning you can run out of space on one disk while
others are half-full.
2) One bad disk can take out the whole node.
3) Instead of a small failure probability on an LVM/RAID volume, with JBOD you end up near a
100% chance of failure after 3 years or so.
4) Generally you will not have enough warning of a looming failure with JBOD compared to
LVM/RAID. (Some companies take a week or two to replace a failed disk.)
JBOD is easy to set up, but hard to manage. Thanks, James.
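
For anyone who wants to sanity-check point 3 against their own hardware, here is a rough
back-of-the-envelope sketch. It assumes disks fail independently at a constant annual
failure rate, which is a simplification. The function name and the 5% rate are
illustrative assumptions, not figures from this thread. It only estimates the chance
that at least one disk in the node fails within the period; whether that failure takes
the node down depends on the points above.

```python
def p_any_disk_fails(n_disks: int, annual_failure_rate: float, years: float) -> float:
    """Chance that at least one of n_disks fails within `years`.

    Assumes independent failures at a constant annual rate (a simplification).
    """
    # Probability that a single disk survives the whole period.
    p_one_survives = (1.0 - annual_failure_rate) ** years
    # All disks must survive for the node to see no disk failure at all.
    return 1.0 - p_one_survives ** n_disks


if __name__ == "__main__":
    # 0.05 is a made-up annual failure rate; substitute your own fleet's number.
    for n in (1, 4, 12, 24):
        print(f"{n:2d} disks over 3 years: {p_any_disk_fails(n, 0.05, 3):.3f}")
```

The more disks in the JBOD set, the closer this gets to certainty over a few years,
which is the effect described in point 3.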

      From: kurt greaves <>
 To: User <> 
 Sent: Friday, August 17, 2018 5:42 AM
 Subject: Re: JBOD disk failure
As far as I'm aware, yes. I recall hearing someone mention tying system tables to a particular
disk but at the moment that doesn't exist.
On Fri., 17 Aug. 2018, 01:04 Eric Evans, <> wrote:

On Wed, Aug 15, 2018 at 3:23 AM kurt greaves <> wrote:
> Yep. It might require a full node replace depending on what data is lost from the system
> tables. In some cases you might be able to recover from partially lost system info, but it's
> not a sure thing.

Ugh, does it really just boil down to what part of `system` happens to
be on the disk in question?  In my mind, that makes the only sane
operational procedure for a failed disk to be: "replace the entire
node".  IOW, I don't think we can realistically claim you can survive
a failed JBOD device if it relies on happenstance.

> On Wed., 15 Aug. 2018, 17:55 Christian Lorenz, < >
>> Thank you for the answers. We are using the current version 3.11.3 So this one includes
>> So if I get this right, losing system tables will need a full node rebuild. Otherwise
>> repair will get the node consistent again.
> [ ... ]

Eric Evans

