kudu-user mailing list archives

From Boris Tyukin <bo...@boristyukin.com>
Subject Re: new Kudu benchmarks
Date Mon, 08 Jan 2018 18:54:17 GMT
awesome, thanks Todd!

On Mon, Jan 8, 2018 at 12:53 PM, Todd Lipcon <todd@cloudera.com> wrote:

> Thanks for making the updates. I tweeted it from my account and from
> @ApacheKudu. feel free to retweet!
> -Todd
> On Sat, Jan 6, 2018 at 1:10 PM, Boris Tyukin <boris@boristyukin.com>
> wrote:
>> thanks Todd, updated my post with that info and also changed the title a bit.
>> thanks again for your feedback! look forward to new releases coming up!
>> Boris
>> On Fri, Jan 5, 2018 at 9:08 PM, Todd Lipcon <todd@cloudera.com> wrote:
>>> On Fri, Jan 5, 2018 at 5:50 PM, Boris Tyukin <boris@boristyukin.com>
>>> wrote:
>>>> Hi Todd,
>>>> thanks for your feedback! sure will be happy to update my post with
>>>> your suggestions. I am not sure Apache Parquet will be clear though as some
>>>> might understand it as using parquet files with Hive or Spark. What do you
>>>> think about "Impala on Kudu vs Impala on Parquet"? Realistically, for BI
>>>> users, Impala is the only option now with Kudu. Not many typical users will
>>>> use Kudu API clients or even Spark, and a Hive SerDe for Kudu does not exist.
>>> I think "Impala on Kudu vs Parquet" or "Impala Storage Comparison: Kudu
>>> vs Parquet" or something would be a reasonable title.
>>>> As for decimals, this is exciting news. Where can I find info about
>>>> timestamp support? I saw this JIRA
>>>> https://issues.apache.org/jira/browse/IMPALA-5137
>>>> but I was a bit confused by the actual change. It looked like a
>>>> workaround to do a conversion on the fly for Impala but not actually store
>>>> proper timestamps in Kudu. Maybe I misread that. I thought the idea was to
>>>> add proper support in Kudu so timestamp can be used as a type with other
>>>> clients, not only Impala. If you can clarify that, it would be great.
>>> What we implemented is "proper" timestamp support in Kudu, but you're
>>> right that there is some conversion going on under the hood. The reasoning
>>> is that Impala internally uses a 96-bit timestamp representation which
>>> supports a very large range of dates at nanosecond precision. This is more
>>> than is required by the SQL standard and doesn't match the timestamp
>>> representation used by other ecosystem components. As far as I know, Impala
>>> is planning on moving to a 64-bit timestamp representation with microsecond
>>> precision, so that's what Kudu implemented internally. With 64 bits there
>>> is still enough range to store dates for 584,554 years at microsecond
>>> precision.
>>> I think
>>> https://impala.apache.org/docs/build/html/topics/impala_timestamp.html
>>> has some info about Kudu compatibility and limitations.
>>> -Todd
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
> --
> Todd Lipcon
> Software Engineer, Cloudera
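
The 584,554-year figure quoted in the thread can be reproduced with a little arithmetic. The following is only a rough sketch, not anything taken from Kudu or Impala: the function name `years_of_range` is made up for illustration, and it assumes a mean Gregorian year of 365.2425 days (31,556,952 seconds), which appears to be the year length that yields the quoted number.

```python
# Back-of-the-envelope check of the range of a 64-bit microsecond counter.
# Assumption: one year is a mean Gregorian year of 365.2425 days.
MICROS_PER_SECOND = 1_000_000
SECONDS_PER_GREGORIAN_YEAR = 31_556_952  # 365.2425 days * 86,400 s/day


def years_of_range(bits: int = 64) -> int:
    """Whole years spanned by 2**bits distinct microsecond values."""
    total_micros = 2 ** bits
    micros_per_year = MICROS_PER_SECOND * SECONDS_PER_GREGORIAN_YEAR
    # Integer division keeps the result exact (no float rounding).
    return total_micros // micros_per_year


if __name__ == "__main__":
    print(years_of_range(64))  # 584554
```

This is the total span of all 2**64 values. If the value is stored as a signed 64-bit integer counted from an epoch, roughly half of that span falls on each side of the epoch.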
