Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 28C60200B87 for ; Mon, 19 Sep 2016 22:43:46 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 27583160ADC; Mon, 19 Sep 2016 20:43:46 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 6D016160ABB for ; Mon, 19 Sep 2016 22:43:45 +0200 (CEST) Received: (qmail 14763 invoked by uid 500); 19 Sep 2016 20:43:44 -0000 Mailing-List: contact user-help@kylin.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@kylin.apache.org Delivered-To: mailing list user@kylin.apache.org Received: (qmail 14753 invoked by uid 99); 19 Sep 2016 20:43:44 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 19 Sep 2016 20:43:44 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id 3F0E9180481 for ; Mon, 19 Sep 2016 20:43:44 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.279 X-Spam-Level: * X-Spam-Status: No, score=1.279 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd3-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=infoworks-io.20150623.gappssmtp.com Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id pKJT9mQ4CSwJ for ; Mon, 19 Sep 2016 20:43:42 +0000 (UTC) Received: from mail-wm0-f51.google.com (mail-wm0-f51.google.com [74.125.82.51]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id 757E45F24B for ; Mon, 19 Sep 2016 20:43:41 +0000 (UTC) Received: by mail-wm0-f51.google.com with SMTP id w84so97181319wmg.1 for ; Mon, 19 Sep 2016 13:43:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=infoworks-io.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=69bK6nUhpBEyrJJZXmU3Rkw6rSemgQMglFnXIV7Qdeg=; b=p636M3fiNReGTIJjxj/bRDC+KEZ/08o6P6Px/jpTHfzNSBn3yd+Og8Zpa6MjpeYpix Ip9vtZhU0dO2Xnv2pXLC59HPpdkJfmucJGCm/TidMWS9kzPgoBFWDnq6AZQumZ3r9jaF ZGmhUacEcmbQUKSuMq67ZcKqaPcPqRvR5cPiDXT1f189kG2UCsNbH4/4fYqGPVbEFeuq uzdEodyyZQLpIVbrWWa2zFDpTnhwGQVtrvmyvsurP1P2Ekv49ZiTNBvuYDP3clxQ6G/a VHisQxqFCu5HNuoZ3/3w1plGjoDrtfXs5Ag04+xGjGGh2I50lf4evR5mPyW9LZJ2MxaA ZH8w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=69bK6nUhpBEyrJJZXmU3Rkw6rSemgQMglFnXIV7Qdeg=; b=MxHDINybT3IMCe+/QR/cPvuTev8dbXqmyLSZSQodruG2vtvm1aYr0gZnkG8olA+m9g kjgiFMjeqgu7/y0J/jwRBCqhM0tbYBnopsWSbw40s5l+TIS309nLS2rzd4cCw4syF6jV mYgNAfu61RhtNrWVvNZ8vfk2NBNLV2AG8UhDUANSzCGkWbx1otciJhBr7dzV+NPCaaWB aJOdJ4paja0K7VcwxACXEcCMgJFWbaqNrD4uIgVhRzpg+sj3u53PGVNWDkKs3hNs7s27 brpG2IVwzCTnVAqOX5iVGN22+do4Ro/XWk/dkrj0x5P/9I+1CNgwmDs+dOSCR4MCcOp9 YYOA== X-Gm-Message-State: AE9vXwO/KGVnjR8dRH1/jkY2MJ5laD8OWrYXIABdLHsfg87NbkzhU8kUOAJEs50UFbhRKD2H5vc5z9eibunM1A== X-Received: by 10.28.109.156 with SMTP id b28mr86873wmi.68.1474317820863; Mon, 19 Sep 2016 13:43:40 -0700 (PDT) MIME-Version: 1.0 Received: by 10.28.34.133 with HTTP; Mon, 19 Sep 2016 13:43:40 -0700 (PDT) In-Reply-To: References: From: Sandeep Khurana Date: Tue, 20 Sep 2016 02:13:40 +0530 Message-ID: Subject: Re: group by on varchar column To: user@kylin.apache.org Content-Type: multipart/alternative; boundary=001a11468b18ffe8e8053ce25d13 archived-at: Mon, 19 Sep 2016 20:43:46 -0000 --001a11468b18ffe8e8053ce25d13 Content-Type: text/plain; charset=UTF-8 btw, we are using Kylin version 1.5.2 On Tue, Sep 20, 2016 at 2:03 AM, Sandeep Khurana wrote: > Hello > > I have a query where i do group by on a varchar column. The column values > are long sentences (not just single words). This column is part of a > dimension table. > > When i select just from dimension table with this group by then I do > get ~2000 records . > > But when I join this dimension with the fact table and run the group by > query then I get just 1 record as Kylin somehow assumes the VARCHAR column > values as NULL. There is not even a single row which has value of this > VARCHAR field as null. > > Same query I copy paste and run on the Hive tables, I do get more than > thousand records. > > Strange thing is when I change the aggregate column to another VARCHAR > column (city_name) whose values are just one word and run on kylin SQL > editor then I do get proper records . > > 2 questions > > - Any idea why such behaviour ? Especially when Hive gives proper records > whereas kylin returns just one record which has value of this big varchar > field as NULL. > > - Is there any work around? > > --001a11468b18ffe8e8053ce25d13 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
btw, we are using Kylin version 1.5.2

On Tue, Sep 20, 2016 at 2:03 AM, Sande= ep Khurana <sandeep@infoworks.io> wrote:
Hello

I have a query = where i do group by on a varchar column. The column values are long sentenc= es (not just single words). This column is part of a dimension table.
=

When i select just from dimension table with this group= by then I do get=C2=A0~2000=C2=A0records =C2=A0 .

But when I join this dimension with the fact table and run the group by qu= ery then I get just 1 record as Kylin somehow assumes the VARCHAR column va= lues as NULL. There is not even a single row which has value of this VARCHA= R field as null.

Same query I copy paste = and run on the Hive tables, I do get more than thousand records.
=
Strange thing is when I change the aggregate column to anoth= er VARCHAR column (city_name) whose values are just one word and run on kyl= in SQL editor then I do get proper records .

2 que= stions

- Any idea why such behaviour ? Especially = when Hive gives proper records whereas kylin returns just one record which = has value of this big varchar field as NULL.

- Is = there any work around? =C2=A0



--001a11468b18ffe8e8053ce25d13--