kudu-user mailing list archives

From Boris Tyukin <bo...@boristyukin.com>
Subject Re: new Kudu benchmarks
Date Mon, 08 Jan 2018 18:54:17 GMT
awesome, thanks Todd!

On Mon, Jan 8, 2018 at 12:53 PM, Todd Lipcon <todd@cloudera.com> wrote:

> Thanks for making the updates. I tweeted it from my account and from
> @ApacheKudu. Feel free to retweet!
>
> -Todd
>
> On Sat, Jan 6, 2018 at 1:10 PM, Boris Tyukin <boris@boristyukin.com>
> wrote:
>
>> thanks Todd, updated my post with that info and also changed the title a bit.
>> thanks again for your feedback! looking forward to the new releases coming up!
>>
>> Boris
>>
>> On Fri, Jan 5, 2018 at 9:08 PM, Todd Lipcon <todd@cloudera.com> wrote:
>>
>>> On Fri, Jan 5, 2018 at 5:50 PM, Boris Tyukin <boris@boristyukin.com>
>>> wrote:
>>>
>>>> Hi Todd,
>>>>
>>>> thanks for your feedback! sure, will be happy to update my post with
>>>> your suggestions. I am not sure "Apache Parquet" will be clear though, as some
>>>> might understand it as using Parquet files with Hive or Spark. What do you
>>>> think about "Impala on Kudu vs Impala on Parquet"? Realistically, for BI
>>>> users, Impala is the only option now with Kudu. Not many typical users will
>>>> use Kudu API clients or even Spark, and a Hive SerDe for Kudu does not exist.
>>>>
>>>
>>> I think "Impala on Kudu vs Parquet" or "Impala Storage Comparison: Kudu
>>> vs Parquet" or something would be a reasonable title.
>>>
>>>
>>>>
>>>> As for decimals, this is exciting news. Where can I find info about
>>>> timestamp support? I saw this JIRA
>>>> https://issues.apache.org/jira/browse/IMPALA-5137
>>>>
>>>> but I was a bit confused by the actual change. It looked like a
>>>> workaround to do a conversion on the fly for Impala but not actually store
>>>> proper timestamps in Kudu. Maybe I misread that. I thought the idea was to
>>>> add proper support in Kudu so timestamp can be used as a type with other
>>>> clients, not only Impala. If you can clarify that, it would be great.
>>>>
>>>
>>> What we implemented is "proper" timestamp support in Kudu, but you're
>>> right that there is some conversion going on under the hood. The reasoning
>>> is that Impala internally uses a 96-bit timestamp representation which
>>> supports a very large range of dates at nanosecond precision. This is more
>>> than is required by the SQL standard and doesn't match the timestamp
>>> representation used by other ecosystem components. As far as I know, Impala
>>> is planning on moving to a 64-bit timestamp representation with microsecond
>>> precision, so that's what Kudu implemented internally. With 64 bits there
>>> is still enough range to store dates for 584,554 years at microsecond
>>> precision.
>>>
>>> I think https://impala.apache.org/docs/build/html/topics/impala_timestamp.html
>>> has some info about Kudu compatibility and limitations.
>>>
>>> -Todd
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>>>
>>
>>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
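
The 584,554-year figure Todd quotes for a 64-bit microsecond timestamp can be checked with a little arithmetic. The sketch below is only an illustration, not anything taken from Kudu or Impala: the helper name range_in_years is made up here, and it assumes an average Gregorian year of 365.2425 days (31,556,952 seconds), which the thread does not specify. It computes the total span covered by 2^64 distinct tick values, regardless of where the epoch sits or whether the value is signed.

```python
# Assumption: average Gregorian year of 365.2425 days = 31,556,952 seconds.
SECONDS_PER_YEAR = 31_556_952


def range_in_years(bits: int, ticks_per_second: int) -> float:
    """Total span, in years, covered by 2**bits ticks at the given resolution.

    Hypothetical helper for illustration only; not part of any Kudu or
    Impala API.
    """
    total_ticks = 2 ** bits
    total_seconds = total_ticks / ticks_per_second
    return total_seconds / SECONDS_PER_YEAR


# 64 bits at microsecond precision (the representation described in the thread).
years_at_micros = range_in_years(64, 1_000_000)

# Same 64 bits at nanosecond precision, for comparison.
years_at_nanos = range_in_years(64, 1_000_000_000)

print(f"64-bit microseconds: about {years_at_micros:,.0f} years")
print(f"64-bit nanoseconds:  about {years_at_nanos:,.0f} years")
```

Under these assumptions, nanosecond precision in 64 bits covers roughly a thousand times less range than microsecond precision, which is consistent with Impala's nanosecond representation needing more than 64 bits to cover a very large range of dates.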
