kylin-user mailing list archives

From hit-lacus <>
Subject Re: Problem with Cube
Date Sun, 23 Jun 2019 05:16:14 GMT
   It looks like this is caused by data skew, which often happens in many big data scenarios.
As far as I know, you should check the high-cardinality column and use it as a "Shard By"
column (in the "Advanced Setting" step of the cube design stage). You may also look at the
"Redistribute intermediate table" option for more information.
   If you find anything wrong or I misunderstand anything, please let me know. Thank you.
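To see why one reducer stalls while the others finish, here is a minimal Python sketch (not Kylin code; the column names and row counts are made up for illustration). It hash-partitions the same rows two ways: by a skewed, low-cardinality key, where one reducer receives most of the data, and by a high-cardinality key like a user id, where the load spreads evenly. This is the effect a "Shard By" column is meant to give you.

```python
import random

random.seed(42)
NUM_REDUCERS = 12

# Hypothetical fact rows: "country" is skewed (~80% share one value),
# "user_id" is high-cardinality and roughly uniform.
rows = [("US" if random.random() < 0.8 else random.choice(["DE", "FR", "JP"]),
         random.randrange(1_000_000))
        for _ in range(100_000)]

def load_per_reducer(key_fn):
    """Count rows each reducer receives when partitioned by hash(key) % NUM_REDUCERS."""
    counts = [0] * NUM_REDUCERS
    for row in rows:
        counts[hash(key_fn(row)) % NUM_REDUCERS] += 1
    return counts

by_country = load_per_reducer(lambda r: r[0])  # skewed key: one reducer drowns
by_user_id = load_per_reducer(lambda r: r[1])  # high-cardinality key: balanced

print("shard by country:", sorted(by_country, reverse=True))
print("shard by user_id:", sorted(by_user_id, reverse=True))
```

Running this, the by-country partitioning puts roughly 80,000 of the 100,000 rows on a single reducer, while the by-user_id partitioning keeps every reducer near 100,000 / 12. The same imbalance is what makes one Kylin reducer grind through the bulk of the data while the other eleven sit finished.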

Best wishes to you!
From: Xiaoxiang Yu

At 2019-06-22 02:33:56, "Cinto Sunny" <> wrote:

Thanks. We actually have 12 reducers. The problem is that one reducer gets stuck with a
huge amount of data while the rest complete. We have ~1.8 billion dsids and are not sure if
that is the problem. If it is, how do we distribute the data?

- Cinto

On Fri, Jun 21, 2019 at 12:03 AM Chao Long <> wrote:

Hi Cinto Sunny,
   You can try setting "" to a bigger value; the default is 1.

On Fri, Jun 21, 2019 at 2:44 PM Cinto Sunny <> wrote:

Hi All,

I am building a cube with 10 dimensions and two measures. The total input size is 100 GB.

I am trying to build using Roaring Bitmap. One of the fact table columns is user, which has ~1.8B userids.

The build is getting stuck at the "Extract Fact Table Distinct Columns" stage. One executor
is stuck processing over 800M lines.

I am using version - 2.6.

Any pointers would be appreciated. Let me know if any further information is required.

- Cinto