kylin-user mailing list archives

From Ruslan Dautkhanov <dautkha...@gmail.com>
Subject count distinct
Date Wed, 27 Jul 2016 22:04:41 GMT
Hello,

1) How efficient is Kylin at materializing count distinct in its cubes?
We're more interested in exact count distinct.

2) How efficient is Kylin for wide datasets? We have around 700 dimensions,
and the dataset is tens of billions of records.
Is it feasible to run such a workload on, for example, a 10-node Hadoop
cluster?

3) (This is a less critical question than the first two.)
Does Kylin have a session-level setting to switch between approximate and
exact count distinct, like Impala's session-level APPX_COUNT_DISTINCT setting?
That way, users could choose approximate or exact counts without changing
application queries.


Thank you,
Ruslan Dautkhanov
