kylin-dev mailing list archives

From Abhishek Sinha <abhisheksinha1...@gmail.com>
Subject Re: Kylin code base help needed
Date Mon, 09 Mar 2015 06:57:43 GMT
Hi,

I went through these test cases but couldn't figure out much from them. Is
there any design document available that describes the data transformation
from the initial star schema (in Hive) up to the final data in HBase?

On Wed, Mar 4, 2015 at 11:00 AM, 宋轶 <yi.song@outlook.com> wrote:

> You can refer to these test cases to figure out the intermediate output of
> each step
>
> https://github.com/KylinOLAP/Kylin/blob/master/job/src/test/java/org/apache/kylin/job/hadoop/cube/BaseCuboidMapperTest.java
> https://github.com/KylinOLAP/Kylin/blob/master/job/src/test/java/org/apache/kylin/job/hadoop/cube/NDCuboidMapperTest.java
> https://github.com/KylinOLAP/Kylin/blob/master/job/src/test/java/org/apache/kylin/job/hadoop/cube/CubeReducerTest.java
>
> > Date: Wed, 4 Mar 2015 00:14:10 +0530
> > Subject: Re: Kylin code base help needed
> > From: abhisheksinha1911@gmail.com
> > CC: dev@kylin.incubator.apache.org
> >
> > Need to figure out the output of every step in order to better understand
> > the cube building process. Any way to decode the hadoop mapreduce output
> > files?
> >
> > On Tue, Mar 3, 2015 at 2:41 PM, Luke Han <lukehan@apache.org> wrote:
> >
> > > >     Kylin uses a dictionary to encode dimension values from String/Date
> > > > to numeric values, which reduces storage significantly.
> > > >     In the query phase, when Kylin gets a result, it decodes it and
> > > > returns the actual values to the client.
> > >
> > > >     Yang may have more detailed comments on this.
> > > >
> > > >     BTW, the intermediate files are only used by the Kylin application;
> > > > why do you need to decode them?
> > > >     Please feel free to let us know if you have more questions.
> > >
> > >     Thanks.
> > > Luke
> > >
> > >
> > >
> > >
> > > 2015-03-03 17:01 GMT+08:00 Luke Han <lukehan@apache.org>:
> > >
> > >> Forward to mailing list for further support.
> > >>
> > >>
> > >> ---------- Forwarded message ----------
> > >> From: Abhishek Sinha <abhisheksinha1911@gmail.com>
> > >> Date: 2015-02-22 20:20 GMT+08:00
> > >> Subject: Kylin code base help needed
> > >> To: lukehan@apache.org
> > >>
> > >>
> > >> Hey,
> > > >> I was looking at the Kylin code base (master) in order to understand
> > > >> the flow and output of each step in the cube building process.
> > > >>
> > > >> The first step, "Create Intermediate hive table", is easy to
> > > >> understand because the table is created in Hive. However, further
> > > >> down the line, "Build base cuboid" or "N dimension cuboid" has its
> > > >> output created in a "tmp" folder in HDFS. I tried opening
> > > >> 'part-r-00000', but the output seems to be encoded in some format
> > > >> (possibly a byte array or something).
> > > >>
> > > >> Can you give me an idea of the encoding technique being used, and
> > > >> possibly how to decode it and get the intermediate outputs?
> > >>
> > >>
> > >>
> > >>
> > >> Thanks and regards,
> > >>
> > >>
> > >>
> > >> Abhishek Sinha
> > >>
> > >>
> > >
>
>
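
The dictionary encoding described in the quoted reply (dimension values
mapped to small numeric values at build time, then mapped back at query
time) can be illustrated with a minimal sketch. This is not Kylin's actual
implementation: the Dictionary class, encode_row, decode_row, the sample
rows, and the fixed-width big-endian byte layout are all assumptions made
for illustration only.

```python
# Hypothetical sketch of dictionary encoding; names and byte layout are
# illustrative assumptions, not Kylin's real classes or file format.

class Dictionary:
    """Maps each distinct value of one dimension to a small integer ID."""

    def __init__(self, values):
        # Sorting the distinct values makes ID order follow value order.
        self._values = sorted(set(values))
        self._ids = {v: i for i, v in enumerate(self._values)}
        # Number of bytes needed to hold the largest ID (at least one byte).
        max_id = max(len(self._values) - 1, 1)
        self.width = (max_id.bit_length() + 7) // 8

    def encode(self, value):
        """Return the fixed-width byte encoding of a value's ID."""
        return self._ids[value].to_bytes(self.width, "big")

    def decode(self, raw):
        """Return the original value for a fixed-width encoded ID."""
        return self._values[int.from_bytes(raw, "big")]


def encode_row(dicts, row):
    """Concatenate the encoded IDs of every dimension in the row."""
    return b"".join(d.encode(v) for d, v in zip(dicts, row))


def decode_row(dicts, raw):
    """Split the byte string by each dictionary's width and decode it."""
    values, pos = [], 0
    for d in dicts:
        values.append(d.decode(raw[pos:pos + d.width]))
        pos += d.width
    return values


# Made-up sample rows with two dimensions: a date and a country code.
rows = [
    ("2015-03-01", "US"),
    ("2015-03-02", "CN"),
    ("2015-03-01", "CN"),
]
# One dictionary per column.
dicts = [Dictionary(column) for column in zip(*rows)]

encoded = encode_row(dicts, rows[1])
print(encoded, decode_row(dicts, encoded))
```

In this sketch a row that takes 12 characters as strings is stored in 2
bytes, which is why such output looks like an opaque byte array when opened
directly and can only be read back with the matching dictionaries.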
