kudu-user mailing list archives

From Adar Lieber-Dembo <a...@cloudera.com>
Subject Re: Kudu deployment best practice
Date Tue, 10 Apr 2018 18:13:48 GMT
On Tue, Apr 10, 2018 at 3:11 AM, Ksenya Leonova <ks.a.leonova@gmail.com> wrote:
> 1) Best practice in Kudu deployment:
> it is planned to use Kudu in conjunction with HDFS, so how do you usually
> solve the problem of sharing and flexible resource management between Kudu
> and HDFS?

At least for now, resource management is outside the purview of Kudu,
and probably of HDFS too, though I don't know for certain. I know of
some deployments that use Linux cgroups for resource management; maybe
that will work for you? Or, if you've already deployed a resource
manager such as Kubernetes or YARN, you could use that.
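
To make the cgroups option a little more concrete, here is a minimal
Python sketch that only computes the values one might write into the
cgroup v2 interface files memory.max and cpu.max for two services on
the same host. It assumes cgroup v2 and a fixed, static split; the
names split_host, CgroupLimits, and kudu_share are hypothetical and
come from neither Kudu nor HDFS. It does not touch /sys/fs/cgroup.

```python
from dataclasses import dataclass

CPU_PERIOD_US = 100_000  # scheduling period used for cpu.max, in microseconds


@dataclass(frozen=True)
class CgroupLimits:
    """Limits for one service, expressed as cgroup v2 interface-file values."""
    memory_max_bytes: int
    cpu_quota_us: int

    def as_files(self) -> dict:
        # cpu.max takes "<quota> <period>"; a quota of N periods is roughly N cores.
        return {
            "memory.max": str(self.memory_max_bytes),
            "cpu.max": f"{self.cpu_quota_us} {CPU_PERIOD_US}",
        }


def split_host(total_memory_bytes: int, total_cores: int, kudu_share: float) -> dict:
    """Split a host's memory and CPU between Kudu and HDFS (hypothetical policy)."""
    if not 0.0 < kudu_share < 1.0:
        raise ValueError("kudu_share must be strictly between 0 and 1")
    kudu_mem = int(total_memory_bytes * kudu_share)
    total_quota = total_cores * CPU_PERIOD_US
    kudu_cpu = int(total_quota * kudu_share)
    return {
        "kudu": CgroupLimits(kudu_mem, kudu_cpu),
        # HDFS gets whatever remains, so the two budgets never overlap.
        "hdfs": CgroupLimits(total_memory_bytes - kudu_mem, total_quota - kudu_cpu),
    }


if __name__ == "__main__":
    # Assumed example host: 64 GiB of RAM, 16 cores, a quarter reserved for Kudu.
    for service, limits in split_host(64 * 2**30, 16, kudu_share=0.25).items():
        print(service, limits.as_files())
```

A static split like this is the simplest policy; it caps each service
but does not lend idle capacity to the other, which is where a
scheduler such as Kubernetes or YARN would come in.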

> 2) Mixed workload processing (OLAP & OLTP):
> in the case of a mixed type of load, has anybody faced the problem of a
> performance decrease (slowdown of disk I/O, etc.)?
> We have 20 nodes in cluster. Each node has 10 x 8 TB 7,200 RPM High Capacity
> SAS Drives for data storage (HDFS now) and 2x SSD for OS.

Can you be more specific? What exactly do you mean by a "mixed type of load"?

In any case, Kudu write performance does degrade as tablets increase
in size. To apply a row operation, Kudu must first check whether that
row's primary key exists. Once your working set exceeds the size of
the Kudu block cache and the host's page cache, the bloom filter and
index lookups for a primary key are likely to miss in cache and
require additional disk access.
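
The effect can be illustrated with a toy model (a rough sketch, not
Kudu's actual implementation). It assumes each write must read one
"block" of bloom/index data chosen by the row's key, that blocks are
held in a simple LRU cache, and that every cache miss stands in for a
disk read. LRUBlockCache, miss_rate, and all parameters are
hypothetical names used only for this illustration.

```python
import random
from collections import OrderedDict


class LRUBlockCache:
    """Fixed-capacity LRU cache that counts hits and misses per block read."""

    def __init__(self, capacity_blocks: int):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block_id: int) -> None:
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)  # mark as most recently used
            self.hits += 1
            return
        self.misses += 1  # stands in for a disk access
        self.blocks[block_id] = True
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block


def miss_rate(num_rows: int, rows_per_block: int, cache_blocks: int,
              num_writes: int, seed: int = 0) -> float:
    """Fraction of key-existence checks that miss the cache for random keys."""
    rng = random.Random(seed)
    cache = LRUBlockCache(cache_blocks)
    for _ in range(num_writes):
        key = rng.randrange(num_rows)      # uniformly random primary key
        cache.read(key // rows_per_block)  # block holding that key's metadata
    return cache.misses / num_writes


if __name__ == "__main__":
    # Same cache size, growing tablet: the miss rate climbs once the
    # number of blocks exceeds what the cache can hold.
    for rows in (1_000, 10_000, 100_000):
        rate = miss_rate(rows, rows_per_block=100, cache_blocks=10, num_writes=10_000)
        print(f"{rows:>7} rows -> miss rate {rate:.3f}")
```

With uniformly random keys this is close to a worst case; workloads
that insert in primary-key order or touch a small hot set would keep
far more of their lookups in cache.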
