kylin-user mailing list archives

From Li Yang <liy...@apache.org>
Subject Re: Timestamp related issues
Date Thu, 29 Oct 2015 10:08:12 GMT
> 1.      Is there any issue with Timestamp/Date values ?

Timestamp testing is very limited on the 1.x branch. All the use cases I
know of use date instead of timestamp.
The 2.x branch has much better timestamp support.

> 2.      For measures with distinct count, it uses approximations with
certain error rates, lowest of which is <1.22%. Does this guarantee that
counts would be accurate ?

The short answer is no, there is no 100% guarantee. The count distinct
algorithm behind this is HyperLogLog [1]. Its error approximately follows a
normal distribution. The "< 1.22%" is shorthand for saying that, in theory,
the error is below 1.22% for 99.7% of all results. The remaining 0.3% of
results can still exceed that error.

[1] https://en.wikipedia.org/wiki/HyperLogLog
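
To make the "99.7%" and "< 1.22%" figures concrete, here is a rough sketch of
the arithmetic. It assumes the textbook HyperLogLog standard error of
1.04 / sqrt(m) for m registers, and it assumes that the "< 1.22%" setting
corresponds to m = 2^16 registers with the bound taken at three standard
deviations. Neither assumption is stated in this thread, and the function
names are only illustrative.

```python
import math


def hll_relative_std_error(precision_bits: int) -> float:
    """Theoretical relative standard error of HyperLogLog, 1.04 / sqrt(m),
    where m = 2**precision_bits is the number of registers."""
    registers = 2 ** precision_bits
    return 1.04 / math.sqrt(registers)


def three_sigma_bound(precision_bits: int) -> float:
    """Relative error bound at three standard deviations."""
    return 3.0 * hll_relative_std_error(precision_bits)


def fraction_within(k_sigma: float) -> float:
    """Probability that a normally distributed error falls within
    k_sigma standard deviations of its mean."""
    return math.erf(k_sigma / math.sqrt(2.0))


if __name__ == "__main__":
    precision_bits = 16  # assumption: 2**16 registers for the "< 1.22%" setting
    bound = three_sigma_bound(precision_bits)
    print(f"3-sigma relative error bound: {bound * 100:.2f}%")
    print(f"share of results within 3 sigma: {fraction_within(3.0) * 100:.1f}%")
    print(f"bound on a true count of 1000: +/- {1000 * bound:.1f}")
```

Under these assumptions the bound is a statistical statement, not a hard
limit: about 3 results in 1000 are expected to fall outside it.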

On Tue, Oct 27, 2015 at 12:45 PM, Chetan Dixit <Chetan_Dixit1@symantec.com>
wrote:

> Hello Kylin Team,
>
>
>
> We are facing the following issues while using Kylin. Could you please help?
>
>
>
> 1.      Is there any issue with Timestamp/Date values ?
>
>                We see issues in queries using “WHERE columnname =
> timestamp ‘2015-07-23 10:30:00’ “ it does not return any results.
>
>                If we use “WHERE columnname = ‘2015-07-23 10:30:00’ “ it
> returns ERROR
>
>                If use timestamp column in projection list, it truncates
> the timestamp part i.e. 2015-07-23 10:30:00 to 2015-07-23 00:00:00
>
>
>
> 2.      For measures with distinct count, it uses approximations with
> certain error rates, lowest of which is <1.22%. Does this guarantee that
> counts would be accurate ?
>
>                We have seen for a count of 1000 results as 982, 1000 etc.
>
>
>
> Thanks,
>
> Chetan
>
>
>
