kudu-user mailing list archives

From Sunil Parmar <sunilosu...@gmail.com>
Subject co-locating kudu table servers with HDFS data nodes
Date Wed, 22 Nov 2017 00:17:53 GMT
We are using CDH 5.12, with HDFS as our primary data storage and Impala
for querying it. Each worker node hosts both the HDFS DataNode and
impalad services. We're starting to move some of our data into Kudu and
would like to understand the community's experience and recommendations
on disk/machine allocation, and the pros and cons of each of the following:

- Installing a Kudu tablet server on each worker node vs. on separate
  machines
- Separate physical disks for the Kudu tablet server on the same machine
  vs. sharing disks with the DataNodes
- SSDs vs. spinning disks

A few more questions, on a separate note but related to the POC:
We have a small table as a first candidate for Kudu (a couple of GB before
replication). Does Kudu try to distribute each table's data across all
tablet servers, and could that slow performance when the data is spread
too sparsely? For a small table, which is better: fewer partitions (on
fewer hosts), or data evenly distributed across the worker nodes?

Sunil Parmar
