kylin-dev mailing list archives

From Luke Han <luke...@gmail.com>
Subject Re: [Proposal] Kylin Cuboid Whitelist
Date Wed, 31 Dec 2014 14:21:18 GMT
Thanks, Ted. We will also try committing some files there to check that our accounts work.



2014-12-31 8:21 GMT+08:00 Ted Dunning <ted.dunning@gmail.com>:

> The repo is now created and lives at:
> https://git-wip-us.apache.org/repos/asf/incubator-kylin.git
>
>
>
> On Fri, Dec 26, 2014 at 1:48 AM, Luke Han <lukehan@apache.org> wrote:
>
> > We will start using JIRA as soon as the Apache Git repo is ready.
> >
> > We will run a script to import all existing GitHub issues into JIRA.
> >
> > Thanks.
> >
> >
> > 2014-12-26 4:24 GMT+08:00 Ted Dunning <ted.dunning@gmail.com>:
> >
> > > JIRA is up now for KYLIN.
> > >
> > > I am not sure about the progress on the git repo.
> > >
> > >
> > >
> > > On Wed, Dec 24, 2014 at 9:18 PM, Luke Han <lukehan@apache.org> wrote:
> > >
> > > > > GitHub issue for tracking: https://github.com/KylinOLAP/Kylin/issues/263
> > > >
> > > > 2014-12-25 13:13 GMT+08:00 Luke Han <lukehan@apache.org>:
> > > >
> > > > > > Cool, that's what we need to enhance Kylin's storage and build
> > > > > > process.
> > > > > >
> > > > > > Will create an issue/JIRA to track this.
> > > > >
> > > > > Thank you very much.
> > > > >
> > > > > Luke
> > > > >
> > > > >
> > > > > 2014-12-24 14:14 GMT+08:00 Li Yang <liyang@apache.org>:
> > > > >
> > > > > >> This is a very interesting idea. Actually, many less general
> > > > > >> solutions (from talking to various people we met) took exactly this
> > > > > >> approach.
> > > > > >>
> > > > > >> This feature will benefit users who have their Hadoop cluster hosted
> > > > > >> in a cloud service. Fewer cuboids mean fewer CPU cycles, and that
> > > > > >> means less to pay.
> > > > >>
> > > > >> Yang
> > > > >>
> > > > > >> On Wed, Dec 24, 2014 at 1:47 PM, hongbin ma <mahongbin@apache.org> wrote:
> > > > >>
> > > > > >> > Logically, a cube contains cuboids representing all combinations
> > > > > >> > of dimensions (2^n cuboids for n dimensions). A naive cube building
> > > > > >> > strategy that materializes all cuboids will quickly run into the
> > > > > >> > curse of dimensionality. Currently Kylin uses a strategy called
> > > > > >> > "aggregation groups" to reduce the number of cuboids that need to
> > > > > >> > be materialized.
> > > > >> >
> > > > > >> > However, if the query pattern is simple and fixed, the
> > > > > >> > "aggregation group" strategy is still not efficient enough. For
> > > > > >> > example, suppose there are five dimensions, namely A, B, C, D and E.
> > > > > >> > The data modeler is sure that only the combinations (A,B,C), (D,E)
> > > > > >> > and (A,E) will be queried, so he will use the aggregation group
> > > > > >> > tool to optimize his cube definition. However, whichever
> > > > > >> > aggregation groups he chooses, many useless combinations will
> > > > > >> > still be materialized.
> > > > >> >
> > > > > >> > With a new strategy called "cuboid whitelist", data modelers can
> > > > > >> > guide Kylin to materialize only the cuboids they are interested in.
> > > > > >> > Given the whitelist, Kylin will materialize the minimal set of
> > > > > >> > cuboids needed to cover each cuboid in the whitelist. To support
> > > > > >> > this, the following functionality should be added:
> > > > >> >
> > > > > >> > 1. Front-end/UI for specifying whitelist members and persisting
> > > > > >> > them to the cube description.
> > > > > >> > 2. Enhanced job engine scheduler that calculates a minimal spanning
> > > > > >> > build tree based on the whitelist.
> > > > > >> > 3. (OPTIONAL) Enhanced job engine that supports a dynamic
> > > > > >> > whitelist and triggers new builds for newly added whitelist members.
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > Hongbin Ma
> > > > >> >
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>
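
To illustrate the whitelist idea in the proposal quoted above, here is a rough Python sketch using its five-dimension example. It is not Kylin code. The function names `all_cuboids` and `build_tree` are hypothetical, and two things are assumptions rather than part of the proposal: that the base cuboid (all dimensions) is always built, and that each whitelist member is aggregated from its smallest already-planned superset. That greedy rule is a simple stand-in for the "minimal spanning build tree" the proposal asks for, not an implementation of it.

```python
from itertools import combinations


def all_cuboids(dimensions):
    """Return every subset of the dimensions, i.e. the full cube (2**n cuboids)."""
    dims = sorted(dimensions)
    return [
        frozenset(combo)
        for size in range(len(dims) + 1)
        for combo in combinations(dims, size)
    ]


def build_tree(dimensions, whitelist):
    """Map each cuboid to be built onto the parent it would be aggregated from.

    Assumption (not from the proposal): the parent is the smallest strict
    superset among the base cuboid and the whitelist members.
    """
    base = frozenset(dimensions)
    targets = {frozenset(member) for member in whitelist} | {base}
    tree = {base: None}  # the base cuboid is built directly from source data
    for cuboid in sorted(targets - {base}, key=len, reverse=True):
        candidates = [t for t in targets if cuboid < t]  # strict supersets
        tree[cuboid] = min(candidates, key=lambda t: (len(t), sorted(t)))
    return tree


dimensions = "ABCDE"
whitelist = ["ABC", "DE", "AE"]

full = all_cuboids(dimensions)
tree = build_tree(dimensions, whitelist)
print(len(full), len(tree))  # 32 cuboids in the full cube vs. 4 built here
```

In this example none of the three whitelist members contains another, so all of them are aggregated straight from the base cuboid. If the whitelist also held (A,B), the sketch would build it from (A,B,C) rather than from the base cuboid.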



-- 

Best Regards!
---------------------

Luke Han
