From: Todd Lipcon
Date: Fri, 16 Mar 2018 11:51:20 -0700
Subject: Re: "broadcast" tablet replication for kudu?
To: user@kudu.apache.org

It's worth noting that, even if your table is replicated, Impala's planner is
unaware of this fact and will give the same plan regardless. That is to say,
rather than every node scanning its local copy, a single node will perform the
whole scan (assuming it's a small table) and broadcast it from there within the
scope of a single query. So I don't think you'll see any performance
improvement on Impala queries by attempting something like an extremely high
replication count.

I could see bumping the replication count to 5 for these tables, since the
extra storage cost is low and it will ensure higher availability of the
important central tables, but I'd be surprised if there is any measurable perf
impact.
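For what it's worth, if the goal is simply to make sure the small Kudu table
ends up on the broadcast side of a join, Impala's join hints can request that
explicitly. A rough sketch only (the table and column names below are made up):

  -- fact_impressions is large, dim_campaign is small. Given up-to-date
  -- stats, the planner will usually broadcast the small side on its own;
  -- the hint just makes that choice explicit.
  SELECT f.day, SUM(f.clicks) AS clicks
  FROM fact_impressions f
    JOIN /* +BROADCAST */ dim_campaign d
      ON f.campaign_id = d.id
  GROUP BY f.day;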
-Todd

On Fri, Mar 16, 2018 at 11:35 AM, Clifford Resnick wrote:

> Thanks for that, glad I was wrong there! Aside from replication
> considerations, is it also recommended that the number of tablet servers be
> odd?
>
> I will check forums as you suggested, but from what I read after searching,
> Impala relies on user-configured caching strategies using the HDFS cache. The
> workload for these tables is very light write, maybe a dozen or so records
> per hour across 6 or 7 tables. The size of the tables ranges from thousands
> to low millions of rows, so sub-partitioning would not be required. So
> perhaps this is not a typical use case, but I think it could work quite well
> with Kudu.
>
> From: Dan Burkert
> Reply-To: "user@kudu.apache.org"
> Date: Friday, March 16, 2018 at 2:09 PM
> To: "user@kudu.apache.org"
> Subject: Re: "broadcast" tablet replication for kudu?
>
> The replication count is the number of tablet servers on which Kudu will host
> copies. So if you set the replication level to 5, Kudu will put the data on 5
> separate tablet servers. There's no built-in broadcast table feature; upping
> the replication factor is the closest thing. A couple of things to keep in
> mind:
>
> - Always use an odd replication count. This is important due to how the Raft
>   algorithm works. Recent versions of Kudu won't even let you specify an even
>   number without flipping some flags.
> - We don't test much beyond 5 replicas. It *should* work, but you may run
>   into issues since it's a relatively rare configuration. With a heavy write
>   workload and many replicas you are even more likely to encounter issues.
>
> It's also worth checking in an Impala forum whether it has features that make
> joins against small broadcast tables better. Perhaps Impala can cache small
> tables locally when doing joins.
>
> - Dan
>
> On Fri, Mar 16, 2018 at 10:55 AM, Clifford Resnick wrote:
>
>> The problem is, AFAIK, that the replication count is not necessarily the
>> distribution count, so you can't guarantee all tablet servers will have a
>> copy.
>>
>> On Mar 16, 2018 1:41 PM, Boris Tyukin wrote:
>>
>> I'm new to Kudu, but we are also going to use Impala mostly with Kudu. We
>> have a few tables that are small but used a lot. My plan is to replicate
>> them more than 3 times. When you create a Kudu table, you can specify the
>> number of replicated copies (3 by default), and I guess you can put there a
>> number corresponding to your node count in the cluster. The downside: you
>> cannot change that number unless you recreate the table.
>>
>> On Fri, Mar 16, 2018 at 10:42 AM, Cliff Resnick wrote:
>>
>>> We will soon be moving our analytics from AWS Redshift to Impala/Kudu. One
>>> Redshift feature that we will miss is its ALL distribution, where a copy of
>>> a table is maintained on each server. We define a number of metadata tables
>>> this way since they are used in nearly every query. We are considering
>>> using Parquet in the HDFS cache for these; Kudu would be a much better fit
>>> for the update semantics, but we are worried about the additional
>>> contention. I'm wondering if a broadcast, or ALL, tablet replication might
>>> be an easy feature to add to Kudu?
>>>
>>> -Cliff

--
Todd Lipcon
Software Engineer, Cloudera
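For reference, Boris's point above about choosing the replica count when the
table is created corresponds roughly to the following Impala DDL. This is only
a sketch: the table and columns are made up, and it assumes the
'kudu.num_tablet_replicas' table property exposed by the Impala/Kudu
integration. As noted in the thread, the value cannot be changed after the
table is created.

  CREATE TABLE dim_campaign (
    id BIGINT,
    name STRING,
    region STRING,
    PRIMARY KEY (id)
  )
  PARTITION BY HASH (id) PARTITIONS 2
  STORED AS KUDU
  -- The replica count is fixed at creation time, should be odd (per Dan's
  -- note above), and cannot exceed the number of live tablet servers.
  TBLPROPERTIES ('kudu.num_tablet_replicas' = '5');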