kudu-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: new Kudu benchmarks
Date Sat, 06 Jan 2018 02:08:00 GMT
On Fri, Jan 5, 2018 at 5:50 PM, Boris Tyukin <boris@boristyukin.com> wrote:

> Hi Todd,
>
> thanks for your feedback! sure will be happy to update my post with your
> suggestions. I am not sure Apache Parquet will be clear though as some
> might understand it as using parquet files with Hive or Spark. What do you
> think about "Impala on Kudu vs Impala on Parquet"? Realistically, for BI
> users, Impala is the only option now with Kudu. Not many typical users will
> use Kudu API clients or even Spark, and a Hive serde for Kudu does not exist.
>

I think "Impala on Kudu vs Parquet" or "Impala Storage Comparison: Kudu vs
Parquet" or something would be a reasonable title.


>
> As for decimals, this is exciting news. Where can I find info about
> timestamp support? I saw this JIRA
> https://issues.apache.org/jira/browse/IMPALA-5137
>
> but I was a bit confused by the actual change. It looked like a workaround
> to do a conversion on the fly for impala but not actually store proper
> timestamps in Kudu. Maybe I misread that. I thought the idea was to add
> proper support in Kudu so timestamp can be used as a type with other
> clients, not only Impala. If you can clarify that, it would be great.
>

What we implemented is "proper" timestamp support in Kudu, but you're right
that there is some conversion going on under the hood. The reasoning is
that Impala internally uses a 96-bit timestamp representation which
supports a very large range of dates at nanosecond precision. This is more
than is required by the SQL standard and doesn't match the timestamp
representation used by other ecosystem components. As far as I know, Impala
is planning on moving to a 64-bit timestamp representation with microsecond
precision, so that's what Kudu implemented internally. With 64 bits there
is still enough range to store dates for 584,554 years at microsecond
precision.
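
For anyone who wants to check the range figure, here is a minimal sketch of
the arithmetic in Python. It assumes a mean Gregorian year of 365.2425 days
and counts the total span covered by all 2^64 values; the function name and
constants are illustrative only and are not taken from Kudu or Impala.

```python
# Rough range check for an integer timestamp of a given width and precision.
# Assumption: one year is a mean Gregorian year (365.2425 days).
SECONDS_PER_GREGORIAN_YEAR = 31_556_952  # 365.2425 * 86,400
MICROS_PER_SECOND = 1_000_000
NANOS_PER_SECOND = 1_000_000_000


def years_of_range(bits: int, ticks_per_second: int) -> int:
    """Whole years spanned by 2**bits ticks at the given tick rate."""
    total_ticks = 2 ** bits
    ticks_per_year = ticks_per_second * SECONDS_PER_GREGORIAN_YEAR
    return total_ticks // ticks_per_year


years_64_micro = years_of_range(64, MICROS_PER_SECOND)
years_64_nano = years_of_range(64, NANOS_PER_SECOND)

print(f"64-bit, microseconds: {years_64_micro:,} years")
print(f"64-bit, nanoseconds:  {years_64_nano:,} years")
```

With a signed 64-bit value the same total span is split roughly in half on
either side of the epoch. Keeping nanosecond precision in 64 bits would cut
the span by a factor of 1,000, which is one way to see why microseconds are
the more practical choice at that width.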

I think
https://impala.apache.org/docs/build/html/topics/impala_timestamp.html has
some info about Kudu compatibility and limitations.

-Todd
-- 
Todd Lipcon
Software Engineer, Cloudera
