kylin-user mailing list archives

From Fei Yi <yijianhui...@gmail.com>
Subject Re: How to use MR to build UHC dimensions
Date Mon, 02 Apr 2018 03:13:05 GMT
Hi Billy,
I chose a dimension with 60,000,000 values; the measure is
count_distinct(order_id).
When I add the column "order_id" as a global dictionary, the web UI reports
that it was created successfully.
But the global dictionary column is not displayed in the web UI, and there
are no errors in the log file.

Thanks for your help

This is the log:

2018-04-02 10:43:22,354 DEBUG [http-bio-7070-exec-6]
controller.CubeController:1010 : Saving cube {
  "name": "GLD_MR_TEST",
  "model_name": "M_ORDER",
  "description": "",
  "dimensions": [
    {
      "name": "CALENDAR_DATE",
      "table": "OD",
      "column": "CALENDAR_DATE",
      "normal": "true"
    },
    {
      "name": "YEAR_MONTH",
      "table": "OD",
      "column": "YEAR_MONTH",
      "normal": "true"
    }
  ],
  "measures": [
    {
      "name": "_COUNT_",
      "function": {
        "expression": "COUNT",
        "returntype": "bigint",
        "parameter": {
          "type": "constant",
          "value": "1"
        },
        "configuration": {}
      }
    },
    {
      "name": "CD",
      "function": {
        "expression": "COUNT_DISTINCT",
        "returntype": "bitmap",
        "parameter": {
          "type": "column",
          "value": "FACT_ORDER_DETAIL.ORDER_ID"
        }
      },
      "showDim": false
    }
  ],
  "dictionaries": [],
  "rowkey": {
    "rowkey_columns": [
      {
        "column": "OD.CALENDAR_DATE",
        "encoding": "dict",
        "isShardBy": "false",
        "encoding_version": 1
      },
      {
        "column": "OD.YEAR_MONTH",
        "encoding": "dict",
        "isShardBy": "false",
        "encoding_version": 1
      }
    ]
  },
  "aggregation_groups": [
    {
      "includes": [
        "OD.CALENDAR_DATE",
        "OD.YEAR_MONTH"
      ],
      "select_rule": {
        "hierarchy_dims": [],
        "mandatory_dims": [
          "OD.CALENDAR_DATE",
          "OD.YEAR_MONTH"
        ],
        "joint_dims": []
      }
    }
  ],
  "mandatory_dimension_set_list": [],
  "partition_date_start": 1514764800000,
  "notify_list": [],
  "hbase_mapping": {
    "column_family": [
      {
        "name": "F1",
        "columns": [
          {
            "qualifier": "M",
            "measure_refs": [
              "_COUNT_"
            ]
          }
        ]
      },
      {
        "name": "F2",
        "columns": [
          {
            "qualifier": "M",
            "measure_refs": [
              "CD"
            ]
          }
        ]
      }
    ]
  },
  "volatile_range": "0",
  "retention_range": "0",
  "status_need_notify": [
    "ERROR",
    "DISCARDED",
    "SUCCEED"
  ],
  "auto_merge_time_ranges": [],
  "engine_type": 2,
  "storage_type": "2",
  "override_kylin_properties": {}
}
2018-04-02 10:43:22,356 DEBUG [http-bio-7070-exec-6]
cachesync.CachedCrudAssist:190 : Saving CubeDesc at
/cube_desc/GLD_MR_TEST.json
2018-04-02 10:43:22,359 DEBUG [pool-6-thread-1] cachesync.Broadcaster:113 :
Servers in the cluster: [localhost:7070]
2018-04-02 10:43:22,359 DEBUG [pool-6-thread-1] cachesync.Broadcaster:123 :
Announcing new broadcast to all: BroadcastEvent{entity=cube_desc,
event=create, cacheKey=GLD_MR_TEST}
2018-04-02 10:43:22,361 DEBUG [http-bio-7070-exec-4]
cachesync.Broadcaster:247 : Broadcasting CREATE, cube_desc, GLD_MR_TEST
2018-04-02 10:43:22,361 INFO  [http-bio-7070-exec-6]
service.CubeService:211 : New cube GLD_MR_TEST has 1 cuboids
2018-04-02 10:43:22,362 INFO  [http-bio-7070-exec-6] cube.CubeManager:219 :
Creating cube 'dw_zyb-->GLD_MR_TEST' from desc 'GLD_MR_TEST'
2018-04-02 10:43:22,362 INFO  [http-bio-7070-exec-6] cube.CubeManager:297 :
Updating cube instance 'GLD_MR_TEST'
2018-04-02 10:43:22,362 DEBUG [http-bio-7070-exec-6]
cachesync.CachedCrudAssist:190 : Saving CubeInstance at
/cube/GLD_MR_TEST.json
2018-04-02 10:43:22,362 DEBUG [http-bio-7070-exec-4]
cachesync.Broadcaster:247 : Broadcasting UPDATE, project_schema, dw_zyb
2018-04-02 10:43:22,364 DEBUG [pool-6-thread-1] cachesync.Broadcaster:113 :
Servers in the cluster: [localhost:7070]
2018-04-02 10:43:22,364 DEBUG [pool-6-thread-1] cachesync.Broadcaster:123 :
Announcing new broadcast to all: BroadcastEvent{entity=cube, event=create,
cacheKey=GLD_MR_TEST}
2018-04-02 10:43:22,365 DEBUG [http-bio-7070-exec-6]
cachesync.CachedCrudAssist:190 : Saving ProjectInstance at
/project/dw_zyb.json
2018-04-02 10:43:22,367 DEBUG [pool-6-thread-1] cachesync.Broadcaster:113 :
Servers in the cluster: [localhost:7070]
2018-04-02 10:43:22,367 DEBUG [pool-6-thread-1] cachesync.Broadcaster:123 :
Announcing new broadcast to all: BroadcastEvent{entity=project,
event=update, cacheKey=dw_zyb}
2018-04-02 10:43:22,376 DEBUG [http-bio-7070-exec-4]
project.ProjectL2Cache:195 : Loading L2 project cache for dw_zyb
2018-04-02 10:43:22,376 WARN  [http-bio-7070-exec-4]
realization.RealizationRegistry:91 : No provider for realization type
INVERTED_INDEX
2018-04-02 10:43:22,378 INFO  [http-bio-7070-exec-4]
service.CacheService:120 : cleaning cache for project dw_zyb (currently
remove all entries)
2018-04-02 10:43:22,378 DEBUG [http-bio-7070-exec-4]
cachesync.Broadcaster:281 : Done broadcasting UPDATE, project_schema, dw_zyb
2018-04-02 10:43:22,378 DEBUG [http-bio-7070-exec-4]
cachesync.Broadcaster:281 : Done broadcasting CREATE, cube_desc, GLD_MR_TEST
2018-04-02 10:43:22,381 DEBUG [http-bio-7070-exec-1]
cachesync.Broadcaster:247 : Broadcasting CREATE, cube, GLD_MR_TEST
2018-04-02 10:43:22,383 DEBUG [http-bio-7070-exec-1]
cachesync.Broadcaster:247 : Broadcasting UPDATE, project_data, dw_zyb
2018-04-02 10:43:22,383 INFO  [http-bio-7070-exec-1]
service.CacheService:120 : cleaning cache for project dw_zyb (currently
remove all entries)
2018-04-02 10:43:22,383 DEBUG [http-bio-7070-exec-1]
cachesync.Broadcaster:281 : Done broadcasting UPDATE, project_data, dw_zyb
2018-04-02 10:43:22,383 DEBUG [http-bio-7070-exec-1]
cachesync.Broadcaster:281 : Done broadcasting CREATE, cube, GLD_MR_TEST
2018-04-02 10:43:22,386 DEBUG [http-bio-7070-exec-1]
cachesync.Broadcaster:247 : Broadcasting UPDATE, project, dw_zyb
2018-04-02 10:43:22,387 DEBUG [http-bio-7070-exec-1]
project.ProjectL2Cache:195 : Loading L2 project cache for dw_zyb
2018-04-02 10:43:22,387 WARN  [http-bio-7070-exec-1]
realization.RealizationRegistry:91 : No provider for realization type
INVERTED_INDEX
2018-04-02 10:43:22,387 DEBUG [http-bio-7070-exec-1]
cachesync.Broadcaster:247 : Broadcasting UPDATE, project_schema, dw_zyb
2018-04-02 10:43:22,402 DEBUG [http-bio-7070-exec-1]
project.ProjectL2Cache:195 : Loading L2 project cache for dw_zyb
2018-04-02 10:43:22,402 WARN  [http-bio-7070-exec-1]
realization.RealizationRegistry:91 : No provider for realization type
INVERTED_INDEX
2018-04-02 10:43:22,404 INFO  [http-bio-7070-exec-1]
service.CacheService:120 : cleaning cache for project dw_zyb (currently
remove all entries)
2018-04-02 10:43:22,404 DEBUG [http-bio-7070-exec-1]
cachesync.Broadcaster:281 : Done broadcasting UPDATE, project_schema, dw_zyb
2018-04-02 10:43:22,405 DEBUG [http-bio-7070-exec-1]
cachesync.Broadcaster:247 : Broadcasting UPDATE, project_data, dw_zyb
2018-04-02 10:43:22,405 INFO  [http-bio-7070-exec-1]
service.CacheService:120 : cleaning cache for project dw_zyb (currently
remove all entries)
2018-04-02 10:43:22,405 DEBUG [http-bio-7070-exec-1]
cachesync.Broadcaster:281 : Done broadcasting UPDATE, project_data, dw_zyb
2018-04-02 10:43:22,405 DEBUG [http-bio-7070-exec-1]
cachesync.Broadcaster:281 : Done broadcasting UPDATE, project, dw_zyb
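
In the descriptor saved above, "dictionaries" is an empty array and both rowkey columns have "isShardBy" set to "false", so the saved cube carries neither of the two definitions mentioned in the quoted reply below. A minimal Python sketch of that check follows. The helper names are hypothetical, column names are compared as exact strings (an assumption), and the builder class in the usage lines is the name I believe Kylin uses for global dictionaries; please verify it against your version.

```python
import json

def uhc_columns(cube_desc):
    """Columns flagged shard-by in the rowkey or listed under "dictionaries"."""
    cols = []
    for rk in cube_desc.get("rowkey", {}).get("rowkey_columns", []):
        # The log shows the flag serialised as the string "true"/"false".
        if str(rk.get("isShardBy", "false")).lower() == "true":
            cols.append(rk["column"])
    for entry in cube_desc.get("dictionaries", []):
        if entry.get("column") not in cols:
            cols.append(entry.get("column"))
    return cols

def distinct_columns_without_dictionary(cube_desc):
    """Bitmap COUNT_DISTINCT columns that have no "dictionaries" entry."""
    dict_cols = {entry.get("column") for entry in cube_desc.get("dictionaries", [])}
    missing = []
    for measure in cube_desc.get("measures", []):
        fn = measure.get("function", {})
        if fn.get("expression") == "COUNT_DISTINCT" and fn.get("returntype") == "bitmap":
            col = fn.get("parameter", {}).get("value")
            if col not in dict_cols:
                missing.append(col)
    return missing

# Trimmed copy of the descriptor from the log above (only the relevant keys).
SAVED_DESC = json.loads("""
{
  "measures": [
    {"name": "CD",
     "function": {"expression": "COUNT_DISTINCT", "returntype": "bitmap",
                  "parameter": {"type": "column",
                                "value": "FACT_ORDER_DETAIL.ORDER_ID"}}}
  ],
  "dictionaries": [],
  "rowkey": {"rowkey_columns": [
    {"column": "OD.CALENDAR_DATE", "encoding": "dict", "isShardBy": "false"},
    {"column": "OD.YEAR_MONTH", "encoding": "dict", "isShardBy": "false"}
  ]}
}
""")

print(uhc_columns(SAVED_DESC))
print(distinct_columns_without_dictionary(SAVED_DESC))

# Hypothetical descriptor with the dictionary entry present; the builder
# class name is an assumption to be checked against the Kylin version in use.
WITH_DICT = dict(SAVED_DESC, dictionaries=[
    {"column": "FACT_ORDER_DETAIL.ORDER_ID",
     "builder": "org.apache.kylin.dict.GlobalDictionaryBuilder"}
])
print(uhc_columns(WITH_DICT))
```

Run against the real /cube_desc/GLD_MR_TEST.json, an empty result from the first helper would be consistent with the global dictionary setting never reaching the saved descriptor.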

2018-04-01 23:23 GMT+08:00 Billy Liu <billyliu@apache.org>:

> Hi Fei Yi,
>
> This parameter only works for ultra high cardinality columns,
> including the columns defined as "ShardBy" and "Global Dictionary".
> Please check if your cube has these two definitions.
>
> With Warm regards
>
> Billy Liu
>
>
> 2018-03-30 16:45 GMT+08:00 Fei Yi <yijianhui123@gmail.com>:
> > I use Kylin version 2.3.1,
> > set kylin.engine.mr.build-uhc-dict-in-additional-step=true
> > kylin.snapshot.max-mb=3000
> >
> > but the dictionaries are still built in the Kylin server; I don't see a
> > separate step to build UHC dimensions
> >
> >
>
