kudu-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: new Kudu benchmarks
Date Mon, 08 Jan 2018 17:53:11 GMT
Thanks for making the updates. I tweeted it from my account and from
@ApacheKudu. Feel free to retweet!


On Sat, Jan 6, 2018 at 1:10 PM, Boris Tyukin <boris@boristyukin.com> wrote:

> Thanks Todd, I updated my post with that info and also changed the title a bit.
> Thanks again for your feedback! Looking forward to the new releases coming up!
> Boris
> On Fri, Jan 5, 2018 at 9:08 PM, Todd Lipcon <todd@cloudera.com> wrote:
>> On Fri, Jan 5, 2018 at 5:50 PM, Boris Tyukin <boris@boristyukin.com>
>> wrote:
>>> Hi Todd,
>>> thanks for your feedback! Sure, I will be happy to update my post with your
>>> suggestions. I am not sure "Apache Parquet" will be clear, though, as some
>>> might understand it as using Parquet files with Hive or Spark. What do you
>>> think about "Impala on Kudu vs Impala on Parquet"? Realistically, for BI
>>> users, Impala is the only option now with Kudu. Not many typical users will
>>> use Kudu API clients or even Spark, and a Hive SerDe for Kudu does not exist.
>> I think "Impala on Kudu vs Parquet" or "Impala Storage Comparison: Kudu
>> vs Parquet" or something would be a reasonable title.
>>> As for decimals, this is exciting news. Where can I find info about
>>> timestamp support? I saw this JIRA
>>> https://issues.apache.org/jira/browse/IMPALA-5137
>>> but I was a bit confused by the actual change. It looked like a
>>> workaround to do a conversion on the fly for Impala but not actually store
>>> proper timestamps in Kudu. Maybe I misread that. I thought the idea was to
>>> add proper support in Kudu so timestamp can be used as a type with other
>>> clients, not only Impala. If you can clarify that, it would be great.
>> What we implemented is "proper" timestamp support in Kudu, but you're
>> right that there is some conversion going on under the hood. The reasoning
>> is that Impala internally uses a 96-bit timestamp representation which
>> supports a very large range of dates at nanosecond precision. This is more
>> than is required by the SQL standard and doesn't match the timestamp
>> representation used by other ecosystem components. As far as I know, Impala
>> is planning on moving to a 64-bit timestamp representation with microsecond
>> precision, so that's what Kudu implemented internally. With 64 bits there
>> is still enough range to store dates for 584,554 years at microsecond
>> precision.
>> I think https://impala.apache.org/docs/build/html/topics/impala_timestamp.html
>> has some info about Kudu compatibility and limitations.
>> -Todd
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera

Todd Lipcon
Software Engineer, Cloudera
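
The range figure quoted in the thread (584,554 years at microsecond precision in 64 bits) can be checked with a little arithmetic. The sketch below is only an illustration: the message does not say how the number was derived, so the average Gregorian year length (365.2425 days) and the helper name `years_covered` are assumptions made here, not anything taken from Kudu or Impala.

```python
# Assumption: one year is an average Gregorian year of 365.2425 days.
MICROS_PER_SECOND = 1_000_000
SECONDS_PER_GREGORIAN_YEAR = 31_556_952  # 365.2425 days * 86,400 seconds


def years_covered(bits: int) -> int:
    """Whole years spanned by 2**bits distinct microsecond ticks."""
    distinct_ticks = 2 ** bits
    micros_per_year = MICROS_PER_SECOND * SECONDS_PER_GREGORIAN_YEAR
    return distinct_ticks // micros_per_year


# Full 64-bit range, matching the figure quoted in the thread.
print(years_covered(64))

# Assumption: if the value is stored as a signed 64-bit offset from an
# epoch, 63 bits of magnitude remain on each side of that epoch.
print(years_covered(63))
```

Under these assumptions the 64-bit total reproduces the quoted 584,554 years, and a signed representation would leave roughly half of that on either side of the epoch.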
