kylin-user mailing list archives

From ShaoFeng Shi <shaofeng...@apache.org>
Subject Re: AppendTrieDictionary with GlobalDictionary 1.6
Date Fri, 23 Jun 2017 00:36:21 GMT
For integer values, the Global Dictionary is not needed.

So just set "integer:4" as the encoding for that dimension, and leave the
global dictionary setting blank for the column.
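
To illustrate why an integer column can skip the dictionary, here is a toy
sketch. It is not Kylin's actual code: `ToyOneWayDictionary`,
`countDistinctStrings` and `countDistinctInts` are hypothetical names, and
`java.util.BitSet` only stands in for the bitmap used by precise count
distinct. It also assumes the integer values are non-negative.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

public class OneWayDictionaryDemo {

    /** Toy encode-only dictionary; a stand-in, not Kylin's AppendTrieDictionary. */
    static final class ToyOneWayDictionary {
        private final Map<String, Integer> ids = new HashMap<>();

        /** Returns the id already assigned to value, or assigns the next free id. */
        int encode(String value) {
            Integer id = ids.get(value);
            if (id == null) {
                id = ids.size();
                ids.put(value, id);
            }
            return id;
        }

        /** Decoding is deliberately unsupported, mirroring the "one-way" behaviour. */
        String decode(int id) {
            throw new UnsupportedOperationException("one-way dictionary cannot decode id " + id);
        }
    }

    /** String values must be encoded to integers before they can be set in the bitmap. */
    static int countDistinctStrings(String[] values, ToyOneWayDictionary dict) {
        BitSet bitmap = new BitSet();
        for (String v : values) {
            bitmap.set(dict.encode(v));
        }
        return bitmap.cardinality();
    }

    /** Non-negative integer values can be set in the bitmap directly, with no dictionary. */
    static int countDistinctInts(int[] values) {
        BitSet bitmap = new BitSet();
        for (int v : values) {
            bitmap.set(v); // BitSet.set rejects negative indexes
        }
        return bitmap.cardinality();
    }

    public static void main(String[] args) {
        ToyOneWayDictionary dict = new ToyOneWayDictionary();
        System.out.println(countDistinctStrings(new String[] {"a", "b", "a"}, dict)); // prints 2
        System.out.println(countDistinctInts(new int[] {7, 42, 7}));                  // prints 2
    }
}
```

The string path needs the encode step; the integer path does not, which is
why the column can use plain integer encoding as a dimension.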

2017-06-23 6:30 GMT+08:00 Sonny Heer <sonnyheer@gmail.com>:

> Thanks ShaoFeng.
>
> So, to clarify: the UHC dimension is an integer. Can I set its encoding to
> integer and also include it in the GD for count distinct? Or should I leave
> it out of the GD and use integer encoding only?
>
>
>
> On Wed, Jun 21, 2017 at 10:55 PM, ShaoFeng Shi <shaofengshi@apache.org>
> wrote:
>
>> Hi Sonny,
>>
>> I see; it is a defect: Kylin uses at most one dictionary per column, so it
>> cannot differentiate between the ordinary dict and the Global dict when that
>> column is used in both a dimension and a measure.
>>
>> 25 million is an Ultra High Cardinality (UHC) dimension; it is not suitable
>> for dict encoding, as the dict size will exceed the Java heap size. In this
>> case, please use fixed_length encoding; if that column is of integer or long
>> type, you can use "integer" encoding. In the meantime, keep using the GD for
>> the count distinct measure.
>>
>> 2017-06-22 13:37 GMT+08:00 Sonny Heer <sonnyheer@gmail.com>:
>>
>>> I see what you mean @ShaoFeng Shi.
>>>
>>> I noticed that one of the measures I have defined is also a dimension. What
>>> can I do in this case? It is needed both as a count distinct measure and as
>>> a dimension. The typical dictionary gives a Java heap space error; there are
>>> approximately 25M unique keys. Any ideas on how Kylin can best handle
>>> this? Should I remove it from the GD and add it as a fixed-length dimension?
>>>
>>> On Wed, Jun 21, 2017 at 10:33 PM, Sonny Heer <sonnyheer@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> No, not as a dimension.  Only for Count distinct measures.
>>>>
>>>>
>>>> On Wed, Jun 21, 2017 at 10:25 PM, ShaoFeng Shi <shaofengshi@apache.org>
>>>> wrote:
>>>>
>>>>> Hi Sonny, are you using GlobalDictionary for a dimension? If so, please
>>>>> change it to use an ordinary dictionary.
>>>>>
>>>>> The GlobalDictionary is a "one-way" dictionary: it can only encode a
>>>>> String to an integer; it doesn't support decoding an integer back to the
>>>>> String. Its main use is precise Count Distinct: a bitmap only accepts
>>>>> integers as input, so Kylin uses the GD to do the conversion.
>>>>>
>>>>> 2017-06-22 6:23 GMT+08:00 Sonny Heer <sonnyheer@gmail.com>:
>>>>>
>>>>>> After finally getting the global dictionary to work with building the
>>>>>> cube there are now exceptions during query.
>>>>>>
>>>>>> ERROR in query:
>>>>>> "AppendTrieDictionary can't retrive value from id"
>>>>>>
>>>>>>
>>>>>> Here is where it ends up in the code:
>>>>>>
>>>>>>     @Override
>>>>>>     final protected T getValueFromIdImpl(int id) {
>>>>>>         throw new UnsupportedOperationException("AppendTrieDictionary can't retrive value from id");
>>>>>>     }
>>>>>>
>>>>>>     @Override
>>>>>>     protected byte[] getValueBytesFromIdImpl(int id) {
>>>>>>         throw new UnsupportedOperationException("AppendTrieDictionary can't retrive value from id");
>>>>>>     }
>>>>>>
>>>>>>     @Override
>>>>>>     protected int getValueBytesFromIdImpl(int id, byte[] returnValue, int offset) {
>>>>>>         throw new UnsupportedOperationException("AppendTrieDictionary can't retrive value from id");
>>>>>>     }
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>>
>>>>> Shaofeng Shi 史少锋
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>>
>>>> Sonny S. Heer
>>>> Senior Software Engineer
>>>> m: 360-434-4354 h: 509-884-2574
>>>>
>>>
>>>
>>>
>>> --
>>>
>>>
>>> Sonny S. Heer
>>> Senior Software Engineer
>>> m: 360-434-4354 h: 509-884-2574
>>>
>>
>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>>
>>
>
>
> --
>
>
> Sonny S. Heer
> Senior Software Engineer
> m: 360-434-4354 h: 509-884-2574
>



-- 
Best regards,

Shaofeng Shi 史少锋
