kylin-issues mailing list archives

From "Ruslan Dautkhanov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KYLIN-2353) Serialize BitmapCounter with distinct count
Date Fri, 05 Oct 2018 20:42:00 GMT

    [ https://issues.apache.org/jira/browse/KYLIN-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640322#comment-16640322 ]

Ruslan Dautkhanov commented on KYLIN-2353:
------------------------------------------

[~kangkaisen] thank you for this great improvement.
Would you recommend BitmapCounter for high-cardinality columns?
I assume it will be very fast for low-cardinality columns like `product type`, but would
it also work on high-cardinality columns? For example, if the number of distinct values in
`household_id` is *1 billion*, would BitmapCounter, and Kylin in general, handle
`count(distinct household_id)` well?

> Serialize BitmapCounter with distinct count
> -------------------------------------------
>
>                 Key: KYLIN-2353
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2353
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Metadata
>    Affects Versions: v1.6.0
>            Reporter: kangkaisen
>            Assignee: kangkaisen
>            Priority: Major
>             Fix For: v2.0.0
>
>         Attachments: KYLIN-2353.patch
>
>
> Currently, we deserialize the bitmap whether or not we need to aggregate.
> Instead, we could serialize {{BitmapCounter}} together with its distinct count and delay deserializing
> the bitmap until we actually need to aggregate; on deserialization we would read only the count.
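
For readers following the thread, the idea in the description can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in KYLIN-2353.patch: the class name `LazyBitmapCounter`, its methods, and the wire layout (8-byte count, 4-byte length, bitmap bytes) are all hypothetical, and `java.util.BitSet` stands in for whatever compressed bitmap Kylin actually uses.

```java
import java.nio.ByteBuffer;
import java.util.BitSet;

/** Hypothetical sketch of a count-prefixed, lazily decoded bitmap counter. */
final class LazyBitmapCounter {
    private final long count;      // distinct count, readable without decoding the bitmap
    private final byte[] payload;  // serialized bitmap bytes, kept as-is
    private BitSet bitmap;         // decoded form, built only when an aggregation needs it

    private LazyBitmapCounter(long count, byte[] payload, BitSet bitmap) {
        this.count = count;
        this.payload = payload;
        this.bitmap = bitmap;
    }

    static LazyBitmapCounter of(int... values) {
        BitSet bits = new BitSet();
        for (int v : values) {
            bits.set(v);
        }
        return new LazyBitmapCounter(bits.cardinality(), bits.toByteArray(), bits);
    }

    /** Assumed layout: count (long), payload length (int), payload bytes. */
    void writeTo(ByteBuffer out) {
        out.putLong(count);
        out.putInt(payload.length);
        out.put(payload);
    }

    /** Reads the count and copies the raw bytes; the bitmap itself is not decoded here. */
    static LazyBitmapCounter readFrom(ByteBuffer in) {
        long count = in.getLong();
        byte[] payload = new byte[in.getInt()];
        in.get(payload);
        return new LazyBitmapCounter(count, payload, null);
    }

    long getCount() {
        return count;
    }

    boolean isBitmapDecoded() {
        return bitmap != null;
    }

    private BitSet bitmap() {
        if (bitmap == null) {
            bitmap = BitSet.valueOf(payload); // the deferred deserialization step
        }
        return bitmap;
    }

    /** Aggregation (union) is the only operation that forces decoding. */
    LazyBitmapCounter merge(LazyBitmapCounter other) {
        BitSet merged = (BitSet) bitmap().clone();
        merged.or(other.bitmap());
        return new LazyBitmapCounter(merged.cardinality(), merged.toByteArray(), merged);
    }
}
```

In this sketch a query that only reads the stored count calls `getCount()` and never pays for decoding; only `merge` touches the bitmap bytes.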



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
