kylin-user mailing list archives

From Li Yang <liy...@apache.org>
Subject Re: Discussion : About Time Consuming of Kylin Cubing
Date Thu, 31 Mar 2016 09:20:59 GMT
Please ignore the invertedindex. It is experimental only and does not
participate in build or query at the moment.

As for the scalability of the MR job, that is mostly a matter of MR tuning.
Basically you want enough mappers and reducers for parallelism, and correct
memory and JVM parameters for the tasks. The expectation is that build time
scales linearly with cluster size. If it does not, the MR job itself should
be analyzed.
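
To make "scaling linearly" concrete for a 3-to-5-node test like the one quoted
below, here is a minimal sketch. The function names and the timings are
hypothetical, chosen only for illustration; they are not Kylin APIs or
measured results.

```python
def ideal_time(base_time, base_nodes, new_nodes):
    """Build time expected on new_nodes if the job scaled perfectly linearly."""
    return base_time * base_nodes / new_nodes


def scaling_efficiency(base_time, base_nodes, new_time, new_nodes):
    """Observed speedup divided by ideal speedup (1.0 means perfectly linear)."""
    observed_speedup = base_time / new_time
    ideal_speedup = new_nodes / base_nodes
    return observed_speedup / ideal_speedup


if __name__ == "__main__":
    # Hypothetical timings in minutes, for illustration only (not measured data).
    base_minutes, new_minutes = 60.0, 45.0
    print("ideal time on 5 nodes:", ideal_time(base_minutes, 3, 5))
    print("efficiency:", scaling_efficiency(base_minutes, 3, new_minutes, 5))
```

An efficiency well below 1.0 on a given build step is the signal to look at
that step's MR job (number of mappers and reducers, task memory) more closely.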



On Thu, Mar 31, 2016 at 10:20 AM, Mars J <xujiao.mycafe@gmail.com> wrote:

> Hello All,
>
>       I would like to start a discussion on the time consumed by the cube
> build steps.
>
>       Our team is testing Kylin's performance to check whether Kylin suits
> our requirements. Our environment is as follows:
>                  Hadoop 2.7.2 (file replication is 2)
>                  Hive 1.2
>                  HBase 1.1.3
>                  Kylin 1.3-HBase 1.1.3
>                  OS CentOS 6.7
>
>        We tested Kylin in 2 different ways separately:
>                1, dimensions from 4 to 10 (increased by 2)
>                2, cluster nodes from 3 to 5.
>
>        We have some interesting results to discuss:
>                1, after adding nodes (no data balancing), build time is
> clearly reduced at 10 and 12 dims, but changes little at 4/6/8 dims.
>                2, after adding nodes (data balancing done), build time is
> mostly the same as without data balancing, and sometimes even longer when
> the number of dims is larger (e.g. 12 dims).
>                3, Is our test method the right approach?
>
>
>        To understand these results, we want to analyze them from the source
> code. Because I have little experience reading source code and the source
> has few comments, I am raising the discussion here.
>
>        Starting from the source code engine-mr-steps......
>
>        By the way, what is the purpose of the invertedindex in Kylin?
>
