kylin-user mailing list archives

From nichunen <...@apache.org>
Subject Re: question. How to speed up convert cuboid to HFILE step.
Date Wed, 10 Jul 2019 03:43:00 GMT
Hi Kang-sen,


Yes, sorry for my typo. I meant the MR (MapReduce) configs.


The number of reduce tasks in the “convert cuboid to HFILE” step is close to the region
count of the cube's HBase table. So I suggest you set kylin.storage.hbase.region-cut-gb
to a smaller number (it can be a floating-point value). I think this will increase the number
of reduce tasks for this step.
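
To illustrate the relationship described above, here is a rough Python sketch. It assumes (this is not taken from Kylin's source code) that the region count is approximately the estimated cube size divided by kylin.storage.hbase.region-cut-gb, rounded up and clamped between a minimum and a maximum. The function name and the min/max defaults of 1 and 500 are hypothetical, chosen only for illustration; Kylin's actual calculation may differ.

```python
import math


def estimate_reduce_tasks(cube_size_gb, region_cut_gb, min_regions=1, max_regions=500):
    """Approximate the reducer count for the HFile step.

    Simplified model (an assumption, not Kylin's real algorithm):
    one reducer per HBase region, one region per `region_cut_gb` of data.
    """
    if region_cut_gb <= 0:
        raise ValueError("region_cut_gb must be positive")
    regions = math.ceil(cube_size_gb / region_cut_gb)
    # Clamp to the (hypothetical) lower and upper bounds on region count.
    return max(min_regions, min(max_regions, regions))


if __name__ == "__main__":
    # 1.26 GB is the data size reported in the original question.
    for cut in (5.0, 1.0, 0.25):
        print(cut, estimate_reduce_tasks(1.26, cut))
```

Under this simplified model, the 1.26 GB segment from the original question stays in a single region with a cut of 5 GB (the commented-out value in the kylin.properties shown in the question), which is consistent with the single reduce task in the log, while a cut of 0.25 GB would give about six regions.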






Best regards,

 

Ni Chunen / George



On 07/8/2019 21:51, Lu, Kang-Sen <klu@rbbn.com> wrote:

Hi, George:

 

Thanks for your reply.

 

I am not sure exactly how to change the Kylin config to improve the step that converts cuboids to
HFILE. Would you mind pointing me to the documentation so that I know exactly which parameter to modify?

 

In addition, do you really mean to adjust “kylin’s hive config”? The file for that should
be kylin_hive_conf.xml, not kylin_job_conf.xml. But I’d rather believe the answer is in
kylin_job_conf.xml, because it is most likely the MapReduce config that would help improve performance.

 

Kang-sen

 

From: zjsynce@163.com <zjsynce@163.com> On Behalf Of nichunen
Sent: Thursday, July 4, 2019 10:25 PM
To: user@kylin.apache.org
Subject: Re: question. How to speed up convert cuboid to HFILE step.

 


 

Hi Kang-sen,

 

You can adjust the configuration in Kylin's hive configuration file ($KYLIN_HOME/conf/kylin_job_conf.xml)
 to speed up the MR jobs.

 

Best regards,

 

Ni Chunen / George

 

On 04/12/2019 04:59, Lu, Kang-Sen <klu@rbbn.com> wrote:

I am running kylin 2.5.1.

 

When I build one hour’s cuboids, the step converting cuboid to HFILE takes 8.33 minutes.
Only 1 reduce task is created. Is there a way to start more reduce tasks?

 

The following is my kylin.properties file content:

 

#kylin.storage.hbase.region-cut-gb=5

kylin.storage.hbase.hfile-size-gb=1

 

Any suggestion is welcome.

 

Thanks.

 

Kang-sen

 

Log from kylin monitor step 13: (The data size is 1.26GB)

 

Counters: 50

        File System Counters

               FILE: Number of bytes read=966603663

               FILE: Number of bytes written=1996309058

               FILE: Number of read operations=0

               FILE: Number of large read operations=0

               FILE: Number of write operations=0

               HDFS: Number of bytes read=662733811

               HDFS: Number of bytes written=1350608338

               HDFS: Number of read operations=199

               HDFS: Number of large read operations=0

               HDFS: Number of write operations=5

        Job Counters

               Launched map tasks=48

               Launched reduce tasks=1

               Data-local map tasks=45

               Rack-local map tasks=3

               Total time spent by all maps in occupied slots (ms)=799000

               Total time spent by all reduces in occupied slots (ms)=822064

               Total time spent by all map tasks (ms)=799000

               Total time spent by all reduce tasks (ms)=411032

               Total vcore-milliseconds taken by all map tasks=799000

               Total vcore-milliseconds taken by all reduce tasks=411032

               Total megabyte-milliseconds taken by all map tasks=8999936000

               Total megabyte-milliseconds taken by all reduce tasks=9259728896

        Map-Reduce Framework

               Map input records=28693452

               Map output records=57386904

               Map output bytes=6215636946

               Map output materialized bytes=1020541171

               Input split bytes=11748

               Combine input records=0

               Combine output records=0

               Reduce input groups=57386904

               Reduce shuffle bytes=1020541171

               Reduce input records=57386904

               Reduce output records=57386904

               Spilled Records=114773808

               Shuffled Maps =48

               Failed Shuffles=0

               Merged Map outputs=48

               GC time elapsed (ms)=47218

               CPU time spent (ms)=1278720

               Physical memory (bytes) snapshot=130269708288

               Virtual memory (bytes) snapshot=585895706624

               Total committed heap usage (bytes)=149828927488

        Shuffle Errors

               BAD_ID=0

               CONNECTION=0

               IO_ERROR=0

               WRONG_LENGTH=0

               WRONG_MAP=0

               WRONG_REDUCE=0

        File Input Format Counters

               Bytes Read=662722063

        File Output Format Counters

               Bytes Written=1350608338

 

 
