From: Alberto Ramón
Date: Tue, 9 Aug 2016 09:22:43 +0200
Subject: Re: Re: does kylin support top-N on a count or count distinct measure?
To: ShaoFeng Shi
Cc: user
Reply-To: user@kylin.apache.org
Mailing-List: contact user-help@kylin.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=089e0122e9be23f3f705399e66b8

--089e0122e9be23f3f705399e66b8
Content-Type: text/plain; charset=UTF-8

Hi,

Top-N is useful for a 'Top 10', but it can also be useful to know the sum of 'the others' (= the sum of everything ranked below the top 10).

Example:
  In a shop, the top 10 sellers sold $1.2M.
  How much did 'the others' sell? Is $1.2M a lot compared with the others?

I know that this is not easy to implement, but if somebody has any idea ...
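
To make the question concrete, a minimal sketch in Python with made-up figures follows. It only shows the arithmetic being asked for (the top-N total next to the total of everything ranked below N). The function `top_n_with_others` and the seller names are hypothetical, and nothing here is Kylin API.

```python
def top_n_with_others(sales, n=10):
    """Split a {seller: amount} mapping into the top n rows and the rest.

    Returns (top_rows, top_total, others_total).
    """
    # Rank sellers by amount, largest first.
    ranked = sorted(sales.items(), key=lambda kv: kv[1], reverse=True)
    top_rows = ranked[:n]
    top_total = sum(amount for _, amount in top_rows)
    # "The others" are all sellers ranked below the top n.
    others_total = sum(amount for _, amount in ranked[n:])
    return top_rows, top_total, others_total


# Made-up figures, for illustration only: 15 sellers.
sales = {f"seller_{i:02d}": 1000 * (15 - i) for i in range(15)}
top_rows, top_total, others_total = top_n_with_others(sales, n=10)
share = top_total / (top_total + others_total)
print(f"top 10: {top_total}, others: {others_total}, top-10 share: {share:.1%}")
```

If an exact grand total is available from an ordinary SUM measure, 'the others' could also be obtained as that total minus the top-N total, but that is an assumption about the cube design, not something confirmed in this thread.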

2016-08-09 6:19 GMT+02:00 ShaoFeng Shi <shaofengshi@apache.org>:
Hi Tiansheng,

The less post-aggregation, the better the query performance. So for a specific query, if the "single group-by column topN" needs further aggregation to get the final result but the "multiple group-by column topN" doesn't, then the latter would have better performance.

I haven't compared the two; this is just my two cents. You are welcome to run a benchmark and share it with the community :-)

2016-08-09 11:54 GMT+08:00 张天生 <zhtsh.lichao@gmail.com>:
I have a question: does a multiple-column group-by perform better than a single-column group-by in a topN measure? As far as I know, both can aggregate over the other dimensions.
Is there any performance optimization for a multiple-column group-by in a topN measure?

ShaoFeng Shi <shaofengshi@apache.org> wrote on Mon, 8 Aug 2016 at 6:20 PM:
Alberto is correct; SUM(1) and multiple columns are implemented in Kylin core, but you can't define them from the UI. You need to edit the metadata manually for that.

2016-08-08 18:02 GMT+08:00 赵天烁 <zhaotianshuo@meizu.com>:
OK, I'll give it a try.


赵天烁
Kevin Zhao
zhaotianshuo@meizu.com

珠海市魅族科技有限公司
MEIZU Technology Co., Ltd.

广东省珠海市科技创新海岸魅族科技楼
MEIZU Tech Bldg., Technology & Innovation Coast
Zhuhai, 519085, Guangdong, China

meizu.com

From: Alberto Ramón
Date: 2016-08-08 17:59
To: user@kylin.apache.org
CC: ShaoFeng Shi
Subject: Re: Re: does kylin support top-N on a count or count distinct measure?
In theory, in v1.5.3 you can group by 'n' columns:
https://issues.apache.org/jira/browse/KYLIN-1693

I haven't tested 1.5.3 yet, and I don't know whether this has been implemented in the Kylin UI; perhaps you can add these columns to the JSON manually :)
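
As a rough idea of what such a manual edit might look like, here is a sketch in Python that builds a TopN measure entry with several group-by columns. It is a sketch under assumptions, not a confirmed format: the field names (`expression`, `parameter`, `next_parameter`, `returntype`) and the `TOP_N` / `topn(100)` values are recalled from 1.5.x cube descriptors and should be checked against a cube exported from your own Kylin instance. The helper `topn_measure` and the column names are hypothetical.

```python
import json


def topn_measure(name, order_by, group_by, capacity=100):
    """Build a TopN measure entry as a dict.

    ASSUMPTION: the key names mirror the Kylin 1.5.x cube descriptor as
    recalled; verify them against an exported cube before using this.
    """
    # Chain the group-by columns from last to first via "next_parameter".
    chain = None
    for column in reversed(group_by):
        chain = {"type": "column", "value": column, "next_parameter": chain}
    # order_by=None stands for SUM(1): a constant 1 instead of a column.
    if order_by is None:
        head = {"type": "constant", "value": "1", "next_parameter": chain}
    else:
        head = {"type": "column", "value": order_by, "next_parameter": chain}
    return {
        "name": name,
        "function": {
            "expression": "TOP_N",
            "parameter": head,
            "returntype": f"topn({capacity})",
        },
    }


# Hypothetical measure: rank by row count, grouped by two columns.
measure = topn_measure("TOP_SELLER", order_by=None,
                       group_by=["SELLER_ID", "CATEGORY"])
print(json.dumps(measure, indent=2))
```

Comparing the printed JSON with a measure created through the UI should show quickly whether the assumed layout matches your version.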

BR, Alberto

2016-08-08 11:37 GMT+02:00 赵天烁 <zhaotianshuo@meizu.com>:
SUM(1)? Do you mean just leaving the "ORDER|SUM by Column" field empty? Another problem is that I can't configure more than one group-by column for it; how can I work around that?


From: ShaoFeng Shi
Date: 2016-08-08 11:32
To: user
Subject: Re: does kylin support top-N on a count or count distinct measure?
For sorting on count, you can use SUM(1) as the expression.

Sorting on other measures is on the roadmap: https://issues.apache.org/jira/browse/KYLIN-1377

We welcome community contributions for such enhancements; does anyone want to give it a try?
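
The SUM(1) suggestion above works because adding 1 for every row in a group gives the row count of that group. A minimal sketch in Python follows, using the standard-library `sqlite3` module as a stand-in SQL engine (not Kylin); the `sales` table and its rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in; this is not a Kylin connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (seller TEXT, price REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("a", 10.0), ("a", 5.0), ("a", 1.0), ("b", 99.0), ("c", 2.0), ("c", 3.0)],
)

# SUM(1) adds 1 per row, so per group it equals COUNT(*).
# Ordering by it and limiting the result gives a "top N by count".
rows = conn.execute(
    """
    SELECT seller, SUM(1) AS cnt, COUNT(*) AS cnt_check
    FROM sales
    GROUP BY seller
    ORDER BY cnt DESC, seller
    LIMIT 2
    """
).fetchall()
print(rows)
conn.close()
```

The `cnt` and `cnt_check` columns agree for every seller, which is the whole point of the trick. Whether Kylin answers such a query from a TopN measure depends on how the cube is defined.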

2016-08-05 15:24 GMT+08:00 赵天烁 <zhaotianshuo@meizu.com>:
Right now the top-N measure needs a sum column to be specified. Does Kylin support top-N on a count or count distinct measure?






--
Best regards,

Shaofeng Shi







--089e0122e9be23f3f705399e66b8--