kudu-user mailing list archives

From Alexandre Fouché <afou...@onfocus.io>
Subject Re: stripes, JBOD: Assignment and rebalance ?
Date Thu, 02 Mar 2017 17:38:27 GMT
Hi Dan,
So will Kudu dynamically rebalance tablets across disks when some tablets
grow larger than others, so that one disk does not fill up while another
disk remains mostly free?

2017-03-02 18:26 GMT+01:00 Dan Burkert <danburkert@apache.org>:

> Hi Alexandre,
> responses inline
> On Thu, Mar 2, 2017 at 9:18 AM, Alexandre Fouché <afouche@onfocus.io>
> wrote:
>> When storing data on multiple JBOD disks, will Kudu assign data for
>> tablets efficiently as far as tablet sizes or activity are concerned, or
>> will it simply try to assign roughly the same number of tablets to each
>> disk, regardless of the tablets' true size or activity (we have many empty
>> tablets at this time)? And will it rebalance tablets from one disk to
>> another automatically? Or is it still better to expose one RAID0 volume?
> Kudu will evenly spread data from all tablets across all disks.  This
> allows Kudu to get good write throughput and balancing, but similarly to
> RAID 0 it means that if one drive fails, all tablets on that tablet server
> will become unavailable. Kudu will automatically recover by re-replicating
> the tablets to a different tablet server as long as a majority of replicas
> are still available.  So, RAID 0 should provide no benefit for Kudu.  It's
> on the roadmap to make multi-disk configuration more flexible so that if a
> single disk dies only a subset of the tablets will become unavailable, but
> I don't have a timeline on that feature (no one is working on it to my
> knowledge).
>> Is using JBOD disks better than RAID stripes? It seems from bug
>> reports that when the WAL disk or one of the JBOD data disks fails, Kudu
>> is still unable to recover and keep or migrate the good tablets. In that
>> case, JBOD shows no improvement over a failed disk on RAID0: in both cases
>> the only recovery option is to delete all Kudu data and the WAL and let
>> the node resync from other nodes?
> I think what I wrote previously answers this; if not, I can clarify.
>> I found comments that the WAL can only be on one disk. Is this still the
>> case, or is this info obsolete?
> This is currently the case. If you have many disks it's often advantageous
> to put the WAL on its own disk (ideally an SSD if it's available). The WAL
> workload is more latency sensitive than the data workload.
> - Dan
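
For reference, a minimal sketch of the layout Dan describes above: a single
WAL directory on its own disk (ideally an SSD) and one data directory per
JBOD disk. `--fs_wal_dir` and `--fs_data_dirs` are the Kudu tablet server's
directory flags; the mount points and the helper `tserver_flags` are
hypothetical, made up for illustration only.

```python
def tserver_flags(wal_dir, data_dirs):
    """Build the directory flags for a tablet server.

    There is exactly one WAL directory (the WAL can only live on one disk),
    while data directories are passed as a comma-separated list, one per disk.
    """
    if not data_dirs:
        raise ValueError("at least one data directory is required")
    return [
        f"--fs_wal_dir={wal_dir}",
        "--fs_data_dirs=" + ",".join(data_dirs),
    ]


# Hypothetical mount points: one SSD for the WAL, three JBOD disks for data.
flags = tserver_flags(
    "/mnt/ssd0/kudu/wal",
    [f"/mnt/disk{i}/kudu/data" for i in range(1, 4)],
)
print(" ".join(["kudu-tserver"] + flags))
```

Since data from every tablet is spread across all listed data directories,
losing any one of them currently takes the whole tablet server's tablets
offline, as described above.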
