From: Joel Victor
Date: Mon, 6 Jun 2016 17:00:53 +0530
Subject: Fact tables with complex data types.
To: user@kylin.apache.org
Reply-To: user@kylin.apache.org
Mailing-List: contact user-help@kylin.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=94eb2c123040e3c87805349a68f2
archived-at: Mon, 06 Jun 2016 11:31:18 -0000

--94eb2c123040e3c87805349a68f2
Content-Type: text/plain; charset=UTF-8

Hi,

I am using Kylin 1.5.2 with HDP 2.2.
Currently my fact table contains multiple columns of type array<string>. Kylin won't allow me to sync this table since it has complex data types. I don't need these complex data types in my cube builds, but I do require them for other jobs.

The table is partitioned on date, has 2 buckets, and is stored in ORC format.

I tried creating a view over it, but it seems Kylin doesn't support views as a fact table.

Another approach that I came up with is moving all the columns with complex data types from the original table to a separate table and using the original table as my fact table for building cubes.

Is there any other way to handle this scenario?

I get the following error when I sync the view:

java.lang.RuntimeException: java.io.IOException: java.lang.NullPointerException
        at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:86)
        at org.apache.kylin.source.hive.cardinality.HiveColumnCardinalityJob.run(HiveColumnCardinalityJob.java:89)
        at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:91)
        at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:121)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
        at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
        at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:124)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: java.lang.NullPointerException
        at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:97)
        at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:51)
        at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:81)
        ... 10 more
Caused by: java.lang.NullPointerException
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:191)
        at org.apache.hive.hcatalog.mapreduce.FosterStorageHandler.<init>(FosterStorageHandler.java:59)
        at org.apache.hive.hcatalog.common.HCatUtil.getStorageHandler(HCatUtil.java:417)
        at org.apache.hive.hcatalog.common.HCatUtil.getStorageHandler(HCatUtil.java:380)
        at org.apache.hive.hcatalog.mapreduce.InitializeInput.extractPartInfo(InitializeInput.java:158)
        at org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:137)
        at org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:86)
        at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:95)

Thanks,
Joel

--94eb2c123040e3c87805349a68f2
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Hi,

I am using Kylin 1.5.2 with HDP 2.2.
Currently my fact table contains multiple columns of type array<string>. Kylin won't allow me to sync this table since it has complex data types. I don't need these complex data types in my cube builds, but I do require them for other jobs.

The table is partitioned on date, has 2 buckets, and is stored in ORC format.

I tried creating a view over it, but it seems Kylin doesn't support views as a fact table.

Another approach that I came up with is moving all the columns with complex data types from the original table to a separate table and using the original table as my fact table for building cubes.

Is there any other way to handle this scenario?
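
For reference, the table-split approach described above can be sketched as follows. This is only an illustration under stated assumptions, not something Kylin provides: the table names (`fact_orders`, `fact_orders_flat`, `fact_orders_complex`), the column list, and the join key `order_id` are all hypothetical, and the generated statements ignore the partitioning and bucketing of the real table. The Python below just separates primitive columns from complex ones and builds Hive `CREATE TABLE ... AS SELECT` statements as strings.

```python
# Hive complex type prefixes; anything else is treated as primitive.
COMPLEX_PREFIXES = ("array<", "map<", "struct<", "uniontype<")


def is_complex(hive_type):
    """Return True for Hive complex types such as array<string>."""
    return hive_type.strip().lower().startswith(COMPLEX_PREFIXES)


def split_columns(schema):
    """Split (name, type) pairs into (primitive names, complex names)."""
    primitive = [name for name, col_type in schema if not is_complex(col_type)]
    complex_cols = [name for name, col_type in schema if is_complex(col_type)]
    return primitive, complex_cols


def ctas_statement(source, target, columns):
    """Build a CREATE TABLE ... AS SELECT statement over the given columns."""
    return "CREATE TABLE {} STORED AS ORC AS SELECT {} FROM {}".format(
        target, ", ".join(columns), source
    )


# Hypothetical schema standing in for the real fact table.
schema = [
    ("order_id", "bigint"),
    ("dt", "string"),
    ("amount", "double"),
    ("tags", "array<string>"),
]

primitive, complex_cols = split_columns(schema)

# Flat table for cube builds: primitive columns only.
flat_sql = ctas_statement("fact_orders", "fact_orders_flat", primitive)

# Side table for other jobs: the (assumed) join key plus the complex columns.
side_sql = ctas_statement(
    "fact_orders", "fact_orders_complex", ["order_id"] + complex_cols
)

print(flat_sql)
print(side_sql)
```

The flat table would then be the one synced into Kylin, while jobs that need the array columns would join the side table back on the assumed key.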

I get the following error when I sync the view:

java.lang.RuntimeException: java.io.IOException: java.lang.NullPointerException
        at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:86)
        at org.apache.kylin.source.hive.cardinality.HiveColumnCardinalityJob.run(HiveColumnCardinalityJob.java:89)
        at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:91)
        at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:121)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
        at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
        at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:124)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: java.lang.NullPointerException
        at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:97)
        at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:51)
        at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:81)
        ... 10 more
Caused by: java.lang.NullPointerException
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:191)
        at org.apache.hive.hcatalog.mapreduce.FosterStorageHandler.<init>(FosterStorageHandler.java:59)
        at org.apache.hive.hcatalog.common.HCatUtil.getStorageHandler(HCatUtil.java:417)
        at org.apache.hive.hcatalog.common.HCatUtil.getStorageHandler(HCatUtil.java:380)
        at org.apache.hive.hcatalog.mapreduce.InitializeInput.extractPartInfo(InitializeInput.java:158)
        at org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:137)
        at org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:86)
        at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:95)

Thanks,
Joel
--94eb2c123040e3c87805349a68f2--