kylin-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From hongbin ma <mahong...@apache.org>
Subject Re: Kylin Cube Performance
Date Thu, 04 Aug 2016 15:31:46 GMT
Hi Jason

As Shaofeng explained it's not reasonable to expect sub-second latency if
you're returning tens of millions of records. You data model is quite
simple and you don't have costly measure like distinct count etc., so kylin
should be performant on normal OLAP queries.

Another advise: if the cardinality of the mandatory dimensions (CHILD_ID
and SITE_ID) are very high, you might isolate such dimensions into a
separate "aggregation group", so that 1. queries not touching these
dimensions can be performant 2. calculate less cuboids. Please refer to
http://kylin.apache.org/blog/2016/02/18/new-aggregation-group/

On Thu, Aug 4, 2016 at 11:11 PM, ShaoFeng Shi <shaofengshi@apache.org>
wrote:

> The log is pretty clear; the cuboid is exact match, but the scan count is
> massive:
>
> Visiting hbase table KYLIN_RIK9O18H07: cuboid exact match, from 992 to 992
> Total scan count: 12306477
>
> Please add where condition to narrow down the result set as much as
> possible; It doesn't make sense for an OLAP query to return millions of
> records;
>
> 2016-08-04 13:05 GMT+08:00 Jason Hale <jason@koddi.com>:
>
> > Sure, see kylin.log below:
> >
> > 2016-08-04 00:47:35,839 INFO  [http-bio-7070-exec-7]
> > controller.QueryController:175 : The original query:  SELECT SUM(clicks)
> > FROM hpa_reporting2 GROUP BY site_id, child_id, search_type, hotel_id,
> > report_date
> > 2016-08-04 00:47:35,839 INFO  [http-bio-7070-exec-7]
> > service.QueryService:266 : The corrected query: SELECT SUM(clicks) FROM
> > hpa_reporting2 GROUP BY site_id, child_id, search_type, hotel_id,
> > report_date
> > LIMIT 50000
> > 2016-08-04 00:47:35,908 INFO  [http-bio-7070-exec-7]
> routing.QueryRouter:48
> > : The project manager's reference is
> > org.apache.kylin.metadata.project.ProjectManager@3a3735a5
> > 2016-08-04 00:47:35,909 INFO  [http-bio-7070-exec-7]
> routing.QueryRouter:60
> > : Find candidates by table DEFAULT.HPA_REPORTING2 and project=KODDI_DEV :
> > org.apache.kylin.query.routing.Candidate@51ed1b3b
> > 2016-08-04 00:47:35,909 INFO  [http-bio-7070-exec-7]
> routing.QueryRouter:49
> > : Applying rule: class
> > org.apache.kylin.query.routing.rules.RemoveUncapableRealizationsRule,
> > realizations before: [hpa_reporting2_cube_clone(CUBE)], realizations
> > after:
> > [hpa_reporting2_cube_clone(CUBE)]
> > 2016-08-04 00:47:35,910 INFO  [http-bio-7070-exec-7]
> routing.QueryRouter:49
> > : Applying rule: class
> > org.apache.kylin.query.routing.rules.RealizationSortRule, realizations
> > before: [hpa_reporting2_cube_clone(CUBE)], realizations after:
> > [hpa_reporting2_cube_clone(CUBE)]
> > 2016-08-04 00:47:35,910 INFO  [http-bio-7070-exec-7]
> routing.QueryRouter:72
> > : The realizations remaining: [hpa_reporting2_cube_clone(CUBE)] And the
> > final chosen one is the first one
> > 2016-08-04 00:47:35,975 DEBUG [http-bio-7070-exec-7]
> > enumerator.OLAPEnumerator:107 : query storage...
> > 2016-08-04 00:47:35,976 INFO  [http-bio-7070-exec-7]
> > v2.CubeStorageQuery:239 : exactAggregation is true
> > 2016-08-04 00:47:35,976 INFO  [http-bio-7070-exec-7]
> > v2.CubeStorageQuery:357 : Enable limit 50000
> > 2016-08-04 00:47:35,977 DEBUG [http-bio-7070-exec-7]
> > v2.CubeHBaseEndpointRPC:257 : New scanner for current segment
> > hpa_reporting2_cube_clone[19700101000000_20160828000000] will use
> > SCAN_FILTER_AGGR_CHECKMEM as endpoint's behavior
> > 2016-08-04 00:47:35,979 DEBUG [http-bio-7070-exec-7]
> > v2.CubeHBaseEndpointRPC:313 : Serialized scanRequestBytes 836 bytes,
> > rawScanBytesString 56 bytes
> > 2016-08-04 00:47:35,979 INFO  [http-bio-7070-exec-7]
> > v2.CubeHBaseEndpointRPC:315 : The scan 31b2dd4c for segment
> > hpa_reporting2_cube_clone[19700101000000_20160828000000] is as below with
> > 1
> > separate raw scans, shard part of start/end key is set to 0
> > 2016-08-04 00:47:35,980 INFO  [http-bio-7070-exec-7] v2.CubeHBaseRPC:271
> :
> > Visiting hbase table KYLIN_RIK9O18H07: cuboid exact match, from 992 to
> 992
> > Start:
> > \x00\x00\x00\x00\x00\x00\x00\x00\x03\xE0\x00\x00\x00\x00\
> > x00\x00\x00\x00\x00
> > (\x00\x00\x00\x00\x00\x00\x00\x00\x03\xE0\x00\x00\x00\x00\
> > x00\x00\x00\x00\x00)
> > Stop:
> >  \x00\x00\x00\x00\x00\x00\x00\x00\x03\xE0\xFF\xFF\xFF\xFF\
> > xFF\xFF\xFF\xFF\xFF\x00
> > (\x00\x00\x00\x00\x00\x00\x00\x00\x03\xE0\xFF\xFF\xFF\xFF\
> > xFF\xFF\xFF\xFF\xFF\x00),
> > No Fuzzy Key
> > 2016-08-04 00:47:35,981 DEBUG [http-bio-7070-exec-7]
> > v2.CubeHBaseEndpointRPC:320 : Submitting rpc to 1 shards starting from
> > shard 2, scan range count 1
> > 2016-08-04 00:47:35,981 INFO  [http-bio-7070-exec-7]
> > v2.CubeHBaseEndpointRPC:103 : Timeout for ExpectedSizeIterator is: 99000
> > 2016-08-04 00:47:35,981 DEBUG [http-bio-7070-exec-7]
> > enumerator.OLAPEnumerator:127 : return TupleIterator...
> > 2016-08-04 00:47:52,773 INFO  [pool-6-thread-1]
> v2.CubeHBaseEndpointRPC:351
> > : <sub-thread for GTScanRequest 31b2dd4c> Endpoint RPC returned from
> HTable
> > KYLIN_RIK9O18H07 Shard
> > \x4B\x59\x4C\x49\x4E\x5F\x52\x49\x4B\x39\x4F\x31\x38\x48\
> > x30\x37\x2C\x00\x02\x2C\x31\x34\x37\x30\x31\x35\x35\x33\
> > x31\x34\x39\x33\x37\x2E\x61\x33\x61\x35\x34\x37\x39\x61\
> > x32\x63\x37\x61\x61\x64\x30\x36\x33\x66\x30\x33\x64\x63\
> > x34\x65\x31\x30\x36\x33\x61\x33\x61\x37\x2E
> > on host: ip-10-0-0-157.ec2.internal.Total scanned row: 12306477. Total
> > filtered/aggred row: 0. Time elapsed in EP: 16562(ms). Server CPU usage:
> > 0.24348086721950246, server physical mem left: 7.195234304E9, server swap
> > mem left:0.0.Etc message: start latency: 15@1,agg done@13760,compress
> > done@16562,server stats done@16562,
> > debugGitTag:cf4d2940b67d622eacd2ac9a913b221091a35c2e;.Normal Complete:
> > true.
> > 2016-08-04 00:47:54,068 DEBUG [pool-6-thread-1] util.CompressionUtils:67
> :
> > Original: 46465726 bytes. Decompressed: 150553629 bytes. Time: 1294
> > 2016-08-04 00:48:29,303 INFO  [pool-4-thread-1]
> > threadpool.DefaultScheduler:106 : Job Fetcher: 0 running, 0 actual
> > running,
> > 0 ready, 12 others
> > 2016-08-04 00:48:31,990 INFO  [http-bio-7070-exec-7]
> > service.QueryService:399 : Scan count for each storageContext: 12306477,
> > 2016-08-04 00:48:31,991 INFO  [http-bio-7070-exec-7]
> > controller.QueryController:197 : Stats of SQL response: isException:
> false,
> > duration: 56152, total scan count 12306477
> > 2016-08-04 00:48:32,000 WARN  [http-bio-7070-exec-7]
> > sizeof.ObjectGraphWalker:209 : The configured limit of 1,000 object
> > references was reached while attempting to calculate the size of the
> object
> > graph. Severe performance degradation could occur if the sizing operation
> > continues. This can be avoided by setting the CacheManger or Cache
> > <sizeOfPolicy> elements maxDepthExceededBehavior to "abort" or adding
> stop
> > points with @IgnoreSizeOf annotations. If performance degradation is NOT
> an
> > issue at the configured limit, raise the limit value using the
> CacheManager
> > or Cache <sizeOfPolicy> elements maxDepth attribute. For more
> information,
> > see the Ehcache configuration documentation.
> > 2016-08-04 00:48:32,091 INFO  [http-bio-7070-exec-7]
> > service.QueryService:250 :
> > ==========================[QUERY]===============================
> > SQL: SELECT SUM(clicks) FROM hpa_reporting2 GROUP BY site_id, child_id,
> > search_type, hotel_id, report_date
> > User: ADMIN
> > Success: true
> > Duration: 56.152
> > Project: koddi_dev
> > Realization Names: [hpa_reporting2_cube_clone]
> > Cuboid Ids: [992]
> > Total scan count: 12306477
> > Result row count: 50000
> > Accept Partial: true
> > Is Partial Result: false
> > Hit Exception Cache: false
> > Storage cache used: false
> > Message: null
> > ==========================[QUERY]===============================
> >
> > On Wed, Aug 3, 2016 at 8:38 PM, ShaoFeng Shi <shaofengshi@apache.org>
> > wrote:
> >
> > > Hi Jason, could you please provide the full log since sending query to
> > and
> > > getting result back? The key information is which cuboid is used for
> the
> > > query, cuboid exact match or fuzzy match, how many records be scanned
> and
> > > how long it tooks; Thanks.
> > >
> > > 2016-08-03 23:19 GMT+08:00 Jason Hale <jason@koddi.com>:
> > >
> > > > Yes, it would have to do post-aggregation in that case, but the
> strange
> > > > thing is that query was running fast (about 1 second), while queries
> > with
> > > > more dimensions, such as "SELECT SUM(clicks) FROM reporting GROUP BY
> > > > site_id, child_id, report_date, hotel_id". This query will take about
> > 106
> > > > seconds, but it shouldn't need to do any post-aggregation so I would
> > > think
> > > > it should return much quicker than that from the respective cuboid.
> > > >
> > > > Here's the explain plan:
> > > > OLAPToEnumerableConverter
> > > > OLAPProjectRel(EXPR$0=[$4])
> > > > OLAPAggregateRel(group=[{0, 1, 2, 3}], EXPR$0=[SUM($4)])
> > > > OLAPProjectRel(SITE_ID=[$9], CHILD_ID=[$3], REPORT_DATE=[$0],
> > > > HOTEL_ID=[$2], CLICKS=[$10])
> > > > OLAPTableScan(table=[[DEFAULT, HPA_REPORTING2]], fields=[[0, 1, 2, 3,
> > 4,
> > > 5,
> > > > 6, 7, 8, 9, 10, 11]])
> > > >
> > > > On Tue, Aug 2, 2016 at 7:46 PM, ShaoFeng Shi <shaofengshi@apache.org
> >
> > > > wrote:
> > > >
> > > > > In the cube definition, you defined "SITE_ID", "CHILD_ID" as
> > > "Mandatory"
> > > > > dimension, which means they will not be aggregated in cube build
> > phase
> > > > for
> > > > > all combinations.
> > > > >
> > > > > So when you run a query like  "SELECT SUM(clicks) FROM reporting
> > GROUP
> > > BY
> > > > > search_type", Kylin will use the combination  "SITE_ID" +
> "CHILD_ID"
> > +
> > > > > "SEARCH_TYPE" to serve, there will be post-aggregation in runtime;
> > The
> > > > > performance is much depent on the cardinality of "SITE_ID" and
> > > > "CHILD_ID".
> > > > >
> > > > >
> > > > > 2016-08-02 23:08 GMT+08:00 Jason Hale <jason@koddi.com>:
> > > > >
> > > > > > I've looked over the optimization options before, but did not
> > notice
> > > > the
> > > > > > rowkey ordering. I can try this and see if this helps me. This
is
> > the
> > > > > only
> > > > > > thing I see that I can attempt to optimize further in the design,
> > but
> > > > > I'll
> > > > > > provide my cube design below. I only have one measure to keep
it
> > > > simple:
> > > > > >
> > > > > > {
> > > > > >   "uuid": "4090b854-8f0c-4288-bd73-fc50238a6030",
> > > > > >   "version": "1.5.2",
> > > > > >   "name": "hpa_reporting2_cube",
> > > > > >   "description": "",
> > > > > >   "dimensions": [
> > > > > >     {
> > > > > >       "name": "DEFAULT.HPA_REPORTING2.REPORT_DATE",
> > > > > >       "table": "DEFAULT.HPA_REPORTING2",
> > > > > >       "column": "REPORT_DATE",
> > > > > >       "derived": null
> > > > > >     },
> > > > > >     {
> > > > > >       "name": "DEFAULT.HPA_REPORTING2.SEARCH_TYPE",
> > > > > >       "table": "DEFAULT.HPA_REPORTING2",
> > > > > >       "column": "SEARCH_TYPE",
> > > > > >       "derived": null
> > > > > >     },
> > > > > >     {
> > > > > >       "name": "DEFAULT.HPA_REPORTING2.HOTEL_ID",
> > > > > >       "table": "DEFAULT.HPA_REPORTING2",
> > > > > >       "column": "HOTEL_ID",
> > > > > >       "derived": null
> > > > > >     },
> > > > > >     {
> > > > > >       "name": "DEFAULT.HPA_REPORTING2.CHILD_ID",
> > > > > >       "table": "DEFAULT.HPA_REPORTING2",
> > > > > >       "column": "CHILD_ID",
> > > > > >       "derived": null
> > > > > >     },
> > > > > >     {
> > > > > >       "name": "DEFAULT.HPA_REPORTING2.COUNTRY",
> > > > > >       "table": "DEFAULT.HPA_REPORTING2",
> > > > > >       "column": "COUNTRY",
> > > > > >       "derived": null
> > > > > >     },
> > > > > >     {
> > > > > >       "name": "DEFAULT.HPA_REPORTING2.DEVICE_TYPE",
> > > > > >       "table": "DEFAULT.HPA_REPORTING2",
> > > > > >       "column": "DEVICE_TYPE",
> > > > > >       "derived": null
> > > > > >     },
> > > > > >     {
> > > > > >       "name": "DEFAULT.HPA_REPORTING2.STAY_LENGTH",
> > > > > >       "table": "DEFAULT.HPA_REPORTING2",
> > > > > >       "column": "STAY_LENGTH",
> > > > > >       "derived": null
> > > > > >     },
> > > > > >     {
> > > > > >       "name": "DEFAULT.HPA_REPORTING2.TRUE_RANK_AG",
> > > > > >       "table": "DEFAULT.HPA_REPORTING2",
> > > > > >       "column": "TRUE_RANK_AG",
> > > > > >       "derived": null
> > > > > >     },
> > > > > >     {
> > > > > >       "name": "DEFAULT.HPA_REPORTING2.ROOM_BUNDLE",
> > > > > >       "table": "DEFAULT.HPA_REPORTING2",
> > > > > >       "column": "ROOM_BUNDLE",
> > > > > >       "derived": null
> > > > > >     },
> > > > > >     {
> > > > > >       "name": "DEFAULT.HPA_REPORTING2.SITE_ID",
> > > > > >       "table": "DEFAULT.HPA_REPORTING2",
> > > > > >       "column": "SITE_ID",
> > > > > >       "derived": null
> > > > > >     }
> > > > > >   ],
> > > > > >   "measures": [
> > > > > >     {
> > > > > >       "name": "_COUNT_",
> > > > > >       "function": {
> > > > > >         "expression": "COUNT",
> > > > > >         "parameter": {
> > > > > >           "type": "constant",
> > > > > >           "value": "1",
> > > > > >           "next_parameter": null
> > > > > >         },
> > > > > >         "returntype": "bigint"
> > > > > >       },
> > > > > >       "dependent_measure_ref": null
> > > > > >     },
> > > > > >     {
> > > > > >       "name": "CLICKS",
> > > > > >       "function": {
> > > > > >         "expression": "SUM",
> > > > > >         "parameter": {
> > > > > >           "type": "column",
> > > > > >           "value": "CLICKS",
> > > > > >           "next_parameter": null
> > > > > >         },
> > > > > >         "returntype": "decimal"
> > > > > >       },
> > > > > >       "dependent_measure_ref": null
> > > > > >     }
> > > > > >   ],
> > > > > >   "rowkey": {
> > > > > >     "rowkey_columns": [
> > > > > >       {
> > > > > >         "column": "REPORT_DATE",
> > > > > >         "encoding": "dict",
> > > > > >         "isShardBy": false
> > > > > >       },
> > > > > >       {
> > > > > >         "column": "SEARCH_TYPE",
> > > > > >         "encoding": "dict",
> > > > > >         "isShardBy": false
> > > > > >       },
> > > > > >       {
> > > > > >         "column": "HOTEL_ID",
> > > > > >         "encoding": "dict",
> > > > > >         "isShardBy": false
> > > > > >       },
> > > > > >       {
> > > > > >         "column": "CHILD_ID",
> > > > > >         "encoding": "dict",
> > > > > >         "isShardBy": false
> > > > > >       },
> > > > > >       {
> > > > > >         "column": "COUNTRY",
> > > > > >         "encoding": "dict",
> > > > > >         "isShardBy": false
> > > > > >       },
> > > > > >       {
> > > > > >         "column": "DEVICE_TYPE",
> > > > > >         "encoding": "dict",
> > > > > >         "isShardBy": false
> > > > > >       },
> > > > > >       {
> > > > > >         "column": "STAY_LENGTH",
> > > > > >         "encoding": "dict",
> > > > > >         "isShardBy": false
> > > > > >       },
> > > > > >       {
> > > > > >         "column": "TRUE_RANK_AG",
> > > > > >         "encoding": "dict",
> > > > > >         "isShardBy": false
> > > > > >       },
> > > > > >       {
> > > > > >         "column": "ROOM_BUNDLE",
> > > > > >         "encoding": "dict",
> > > > > >         "isShardBy": false
> > > > > >       },
> > > > > >       {
> > > > > >         "column": "SITE_ID",
> > > > > >         "encoding": "dict",
> > > > > >         "isShardBy": false
> > > > > >       }
> > > > > >     ]
> > > > > >   },
> > > > > >   "signature": "KixlKWxevr6mO+UlSaR5ig==",
> > > > > >   "last_modified": 1470070273935,
> > > > > >   "model_name": "hpa_reporting_model2",
> > > > > >   "null_string": null,
> > > > > >   "hbase_mapping": {
> > > > > >     "column_family": [
> > > > > >       {
> > > > > >         "name": "F1",
> > > > > >         "columns": [
> > > > > >           {
> > > > > >             "qualifier": "M",
> > > > > >             "measure_refs": [
> > > > > >               "_COUNT_",
> > > > > >               "CLICKS"
> > > > > >             ]
> > > > > >           }
> > > > > >         ]
> > > > > >       }
> > > > > >     ]
> > > > > >   },
> > > > > >   "aggregation_groups": [
> > > > > >     {
> > > > > >       "includes": [
> > > > > >         "REPORT_DATE",
> > > > > >         "SEARCH_TYPE",
> > > > > >         "HOTEL_ID",
> > > > > >         "CHILD_ID",
> > > > > >         "COUNTRY",
> > > > > >         "DEVICE_TYPE",
> > > > > >         "STAY_LENGTH",
> > > > > >         "TRUE_RANK_AG",
> > > > > >         "ROOM_BUNDLE",
> > > > > >         "SITE_ID"
> > > > > >       ],
> > > > > >       "select_rule": {
> > > > > >         "hierarchy_dims": [],
> > > > > >         "mandatory_dims": [
> > > > > >           "SITE_ID",
> > > > > >           "CHILD_ID"
> > > > > >         ],
> > > > > >         "joint_dims": [
> > > > > >           [
> > > > > >             "ROOM_BUNDLE",
> > > > > >             "TRUE_RANK_AG"
> > > > > >           ]
> > > > > >         ]
> > > > > >       }
> > > > > >     }
> > > > > >   ],
> > > > > >   "notify_list": [],
> > > > > >   "status_need_notify": [
> > > > > >     "ERROR",
> > > > > >     "DISCARDED",
> > > > > >     "SUCCEED"
> > > > > >   ],
> > > > > >   "partition_date_start": 0,
> > > > > >   "partition_date_end": 3153600000000,
> > > > > >   "auto_merge_time_ranges": [
> > > > > >     604800000,
> > > > > >     2419200000
> > > > > >   ],
> > > > > >   "retention_range": 0,
> > > > > >   "engine_type": 2,
> > > > > >   "storage_type": 2,
> > > > > >   "override_kylin_properties": {}
> > > > > > }
> > > > > >
> > > > > > On Mon, Aug 1, 2016 at 8:02 PM, ShaoFeng Shi <
> > shaofengshi@apache.org
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Jason,
> > > > > > >
> > > > > > > As Yiming mentioned, the cube design matters for the
> performance
> > of
> > > > > both
> > > > > > > build and query; please check "Optimize Cube" in the document
> web
> > > > page
> > > > > > and
> > > > > > > do optimizaiton as much as possible;
> > > > > > >
> > > > > > > Besides, the cluster's capacity and Hadoop configuration
is
> also
> > an
> > > > > > > important factor; Try to identify the bottleneck and then
> > optimize
> > > or
> > > > > add
> > > > > > > capacity.
> > > > > > >
> > > > > > > From 1.5 Kylin ships with two cubing algorithm; the steps
> "Build
> > > > > > > N-Dimension Cuboid" are the legacy "Layered" cubing algorithm;
> > They
> > > > > will
> > > > > > be
> > > > > > > skipped when Kylin selects to use the new "Fast" cubing
> > algorithm,
> > > > > which
> > > > > > is
> > > > > > > the "Build Cube" step after them. Please click the hadoop
link
> in
> > > > that
> > > > > > step
> > > > > > > to inspect the MR job's statistics;
> > > > > > >
> > > > > > > Hope this helps to some extend;
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > 2016-08-02 8:44 GMT+08:00 Yiming Liu <liuyiming.vip@gmail.com
> >:
> > > > > > >
> > > > > > > > Hi Jason,
> > > > > > > >
> > > > > > > > Cube design is the performance key for Kylin, not
only query,
> > but
> > > > > also
> > > > > > > cube
> > > > > > > > building process. How to select dimensions, how to
define the
> > > > > > > relationship
> > > > > > > > between dimensions, how to select encode method, how
to
> define
> > > > > measure,
> > > > > > > > even how to choose the Hbase key order will have a
> significant
> > > > impact
> > > > > > on
> > > > > > > > performance.  There are quite a few wonderful documents
> > > introducing
> > > > > how
> > > > > > > to
> > > > > > > > do this, http://kylin.apache.org/docs15/ .
> > > > > > > >
> > > > > > > > One more thing, if you could share your cube design,
you
> would
> > > get
> > > > > help
> > > > > > > > easier here.
> > > > > > > >
> > > > > > > > 2016-08-02 7:20 GMT+08:00 Jason Hale <jason@koddi.com>:
> > > > > > > >
> > > > > > > > > I'm setting up a test case for a portion of our
dataset, to
> > > > > evaluate
> > > > > > > > Kylin
> > > > > > > > > and I'm not seeing the performance that I would
expect.
> > > > > > > > >
> > > > > > > > > The cube building process is taking about 5-6
hours with
> > > > > ~69,000,000
> > > > > > > > > records and 10 dimensions. I'm not sure if that's
the
> > expected
> > > > > build
> > > > > > > > time,
> > > > > > > > > but the other problem is the query performance
after
> building
> > > the
> > > > > > cube.
> > > > > > > > >
> > > > > > > > > All queries were tested with a very simple query
(e.g.
> SELECT
> > > > > > > SUM(clicks)
> > > > > > > > > FROM reporting GROUP BY search_type)
> > > > > > > > >
> > > > > > > > > Grouping by 1 or 2 dimensions gives me very responsive
> > queries
> > > > > > (under 2
> > > > > > > > > seconds), but adding more dimensions drastically
increases
> > the
> > > > > query
> > > > > > > time
> > > > > > > > > (over 1 minute and it times out through hbase).
I would
> > expect
> > > > > these
> > > > > > > > > queries to have all similar query times since
they should
> > query
> > > > the
> > > > > > > > > respective cuboid, so I'm not sure why the performance
> would
> > > > > suffer.
> > > > > > I
> > > > > > > > > didn't set up any special rules for the cube,
but during
> the
> > > > build
> > > > > > > > process
> > > > > > > > > it showed all the N-dimension cubes and the log
simply said
> > > > > > 'skipped'.
> > > > > > > > >
> > > > > > > > > Is there something I'm missing in the configuration?
> > > > > > > > >
> > > > > > > > > I have a HDP cluster with 3 nodes and 1 client
node on
> which
> > > > Kylin
> > > > > is
> > > > > > > > > installed. Do I need to adjust the hadoop configuration.
> I'm
> > > > using
> > > > > > most
> > > > > > > > of
> > > > > > > > > the default HDP settings.
> > > > > > > > >
> > > > > > > > > What more information can I provide?
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > With Warm regards
> > > > > > > >
> > > > > > > > Yiming Liu (刘一鸣)
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Best regards,
> > > > > > >
> > > > > > > Shaofeng Shi
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best regards,
> > > > >
> > > > > Shaofeng Shi
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Best regards,
> > >
> > > Shaofeng Shi
> > >
> >
>
>
>
> --
> Best regards,
>
> Shaofeng Shi
>



-- 
Regards,

*Bin Mahone | 马洪宾*

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message