From: Ruslan Dautkhanov
Date: Wed, 27 Jul 2016 16:04:41 -0600
Subject: count distinct
To: user@kylin.apache.org
Reply-To: user@kylin.apache.org
Mailing-List: contact user-help@kylin.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=94eb2c042fc4741b000538a53589
archived-at: Wed, 27 Jul 2016 22:05:11 -0000

--94eb2c042fc4741b000538a53589
Content-Type: text/plain; charset=UTF-8

Hello,

1) How efficient is Kylin at materializing count distinct in its cubes?
We're more interested in exact count distinct.

2) How efficient is Kylin for wide datasets? We have around 700 dimensions.
The dataset has tens of billions of records.
Is it feasible to run such a workload on, for example, a 10-node Hadoop cluster?

3) (This is a less critical question than the first two.)
Does Kylin have a session-level setting to switch between approximate and exact
count distinct?
Impala, for example, has a session-level setting, APPX_COUNT_DISTINCT.
That way, users could switch between approximate and exact counts
without changing application queries.

Thank you,
Ruslan Dautkhanov

--94eb2c042fc4741b000538a53589
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--94eb2c042fc4741b000538a53589--