kudu-user mailing list archives

From Clifford Resnick <cresn...@mediamath.com>
Subject Re: "broadcast" tablet replication for kudu?
Date Fri, 16 Mar 2018 17:55:46 GMT
The problem, AFAIK, is that the replication count is not necessarily the distribution count, so
you can't guarantee that every tablet server will have a copy.
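
To make the gap between replication count and server count concrete, here is a rough Python sketch. It assumes, as I understand Kudu's rules, that the replication factor must be odd and is capped by a master-side limit (7 by default). The function names and the `max_replicas` parameter are made up for illustration and are not part of any Kudu API. It only does the arithmetic for a single tablet and does not talk to a cluster.

```python
def largest_usable_replication(num_tservers: int, max_replicas: int = 7) -> int:
    """Return the largest odd replication factor that fits on the cluster.

    Assumptions (not taken from this thread): the replication factor must
    be odd, and it cannot exceed max_replicas (assumed default cap of 7).
    """
    if num_tservers < 1:
        raise ValueError("need at least one tablet server")
    replication = min(num_tservers, max_replicas)
    # Step down to the nearest odd number if we landed on an even one.
    if replication % 2 == 0:
        replication -= 1
    return replication


def servers_without_local_copy(num_tservers: int, replication: int) -> int:
    """Count tablet servers that cannot hold a replica of a single tablet.

    Each replica of a tablet lives on a different server, so at most
    `replication` servers hold a copy.
    """
    return max(num_tservers - replication, 0)


if __name__ == "__main__":
    for nodes in (3, 4, 10):
        r = largest_usable_replication(nodes)
        missing = servers_without_local_copy(nodes, r)
        print(f"{nodes} tservers: replication={r}, without a copy={missing}")
```

Under those assumptions, a 10-node cluster ends up with a replication factor of 7, which leaves 3 tablet servers without a local replica of that tablet.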

On Mar 16, 2018 1:41 PM, Boris Tyukin <boris@boristyukin.com> wrote:
I'm new to Kudu, but we are also going to use Impala mostly with Kudu. We have a few tables
that are small but used a lot. My plan is to replicate them more than 3 times. When you create
a Kudu table, you can specify the number of replicas (3 by default), and I guess you can
set it to a number corresponding to the node count in your cluster. The downside is that you
cannot change that number unless you recreate the table.

On Fri, Mar 16, 2018 at 10:42 AM, Cliff Resnick <cresny@gmail.com> wrote:
We will soon be moving our analytics from AWS Redshift to Impala/Kudu. One Redshift feature
that we will miss is its ALL Distribution, where a copy of a table is maintained on each server.
We define a number of metadata tables this way since they are used in nearly every query.
We are considering using Parquet in HDFS cache for these. Kudu would be a much better
fit for the update semantics, but we are worried about the additional contention. I'm wondering
if having a Broadcast, or ALL, tablet replication might be an easy feature to add to Kudu?

-Cliff

