kylin-issues mailing list archives

From "liyang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KYLIN-3071) Add config to reuse dict to reduce dict size
Date Fri, 25 May 2018 08:32:00 GMT

    [ https://issues.apache.org/jira/browse/KYLIN-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16490410#comment-16490410 ]

liyang commented on KYLIN-3071:
-------------------------------

I see the difference between Growing Dict and Reuse Dict. It is useful.

However, Reuse Dict carries a penalty: it could impact segment pruning at query time.
Currently the dictionary is used to transform filters. For example, a filter like {{A='non-exist-value'}}
would be transformed to {{false}}, and such a segment can be pruned to improve query performance.
Having extra values in the dictionary will weaken the effectiveness of such pruning.
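
To illustrate the pruning concern, here is a minimal sketch, not Kylin's actual code. It models a segment's dictionary as a plain set of column values; the function name {{can_prune_segment}} and the sample values are assumptions made up for this example.

```python
def can_prune_segment(dictionary, filter_value):
    """Return True if an equality filter can never match in this segment.

    Hypothetical model: if the value is absent from the segment's
    dictionary, the filter A='value' folds to false and the segment
    can be skipped.
    """
    return filter_value not in dictionary


# Dictionary built only from the values present in this segment.
exact_dict = {"beijing", "shanghai"}

# Dictionary reused from an older segment: a superset with an extra value
# that does not actually occur in this segment.
reused_dict = {"beijing", "shanghai", "shenzhen"}

pruned_with_exact = can_prune_segment(exact_dict, "shenzhen")
pruned_with_reused = can_prune_segment(reused_dict, "shenzhen")
```

With the exact dictionary the segment is pruned for {{A='shenzhen'}}; with the reused superset dictionary it is not, even though the segment holds no such row.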

Given the above side effect, I'd suggest the Reuse Dict feature be off by default.

> Add config to reuse dict to reduce dict size 
> ---------------------------------------------
>
>                 Key: KYLIN-3071
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3071
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Metadata
>            Reporter: Yang Hao
>            Assignee: Yang Hao
>            Priority: Major
>             Fix For: Future
>
>         Attachments: KYLIN-3071.apache-master.001.patch
>
>
> When DictionaryManager.trySaveNewDict is called and growing dict is not enabled, it only
> reuses a historical dict that is equal to the new one, so many dicts may be generated. We
> should supply a config to use contains instead of equals when reusing an old dict.
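
The difference between the two matching rules can be sketched as follows. This is an illustrative model only, not the logic of DictionaryManager.trySaveNewDict: dictionaries are treated as plain collections of values, and the function {{find_reusable_dict}} and its {{use_contains}} flag are hypothetical names standing in for the proposed config.

```python
def find_reusable_dict(new_values, history, use_contains):
    """Return a historical dict that can stand in for new_values, or None.

    Hypothetical model: with use_contains=False only an equal dict is
    reused; with use_contains=True any historical dict that contains
    every new value is reused as well.
    """
    new_set = frozenset(new_values)
    for old in history:
        old_set = frozenset(old)
        if old_set == new_set:
            return old
        if use_contains and new_set <= old_set:
            return old
    return None


history = [["a", "b", "c"]]

# Equality-only matching: no reuse, so a new dict would be stored.
equal_match = find_reusable_dict(["a", "b"], history, use_contains=False)

# Containment matching: the older, larger dict is reused.
contains_match = find_reusable_dict(["a", "b"], history, use_contains=True)
```

Under equality-only matching every segment whose value set differs slightly produces a new dict; containment matching trades that storage for dictionaries that may hold values absent from the segment.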



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
