kudu-user mailing list archives

From Boris Tyukin <bo...@boristyukin.com>
Subject Re: new Kudu benchmarks
Date Sat, 06 Jan 2018 21:10:38 GMT
thanks Todd, updated my post with that info and also changed the title a bit.
thanks again for your feedback! look forward to new releases coming up!

Boris

On Fri, Jan 5, 2018 at 9:08 PM, Todd Lipcon <todd@cloudera.com> wrote:

> On Fri, Jan 5, 2018 at 5:50 PM, Boris Tyukin <boris@boristyukin.com>
> wrote:
>
>> Hi Todd,
>>
>> thanks for your feedback! sure will be happy to update my post with your
>> suggestions. I am not sure Apache Parquet will be clear though as some
>> might understand it as using parquet files with Hive or Spark. What do you
>> think about "Impala on Kudu vs Impala on Parquet"? Realistically, for BI
>> users, Impala is the only option now with Kudu. Not many typical users will
>> use Kudu API clients or even Spark, and a Hive SerDe for Kudu does not exist.
>>
>
> I think "Impala on Kudu vs Parquet" or "Impala Storage Comparison: Kudu vs
> Parquet" or something would be a reasonable title.
>
>
>>
>> As for decimals, this is exciting news. Where can I find info about
>> timestamp support? I saw this JIRA
>> https://issues.apache.org/jira/browse/IMPALA-5137
>>
>> but I was a bit confused by the actual change. It looked like a
>> workaround to do a conversion on the fly for Impala but not actually store
>> proper timestamps in Kudu. Maybe I misread that. I thought the idea was to
>> add proper support in Kudu so timestamp can be used as a type with other
>> clients, not only Impala. If you can clarify that, it would be great.
>>
>
> What we implemented is "proper" timestamp support in Kudu, but you're
> right that there is some conversion going on under the hood. The reasoning
> is that Impala internally uses a 96-bit timestamp representation which
> supports a very large range of dates at nanosecond precision. This is more
> than is required by the SQL standard and doesn't match the timestamp
> representation used by other ecosystem components. As far as I know, Impala
> is planning on moving to a 64-bit timestamp representation with microsecond
> precision, so that's what Kudu implemented internally. With 64 bits there
> is still enough range to store dates for 584,554 years at microsecond
> precision.
>
> I think https://impala.apache.org/docs/build/html/topics/impala_timestamp.html
> has some info about Kudu compatibility and limitations.
>
> -Todd
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
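
The 584,554-year figure quoted above can be reproduced with a quick calculation. This is only a minimal sketch: it assumes the figure covers the full span of 2^64 distinct microsecond values and that a year is the mean Gregorian year of 365.2425 days. Neither assumption is stated in the thread, and the function name is made up for illustration.

```python
# Rough check of the range of a 64-bit microsecond timestamp.
# Assumptions (not stated in the thread): the span is 2**64 microseconds
# in total, and a year is the mean Gregorian year of 365.2425 days.

MICROS_PER_SECOND = 1_000_000
SECONDS_PER_DAY = 86_400
DAYS_PER_GREGORIAN_YEAR = 365.2425


def span_in_years(bits: int) -> float:
    """Return how many years 2**bits microseconds cover (hypothetical helper)."""
    total_micros = 2 ** bits
    total_seconds = total_micros / MICROS_PER_SECOND
    return total_seconds / (SECONDS_PER_DAY * DAYS_PER_GREGORIAN_YEAR)


years_64 = span_in_years(64)
print(f"{int(years_64):,} years")  # prints "584,554 years"
```

With a signed 64-bit value counted from an epoch, the same total span would be split into roughly 292,000 years on either side of that epoch.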
