hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Maciek <mac...@sonra.io>
Subject Re: from_unixtime() and epoch definition
Date Thu, 06 Nov 2014 10:14:33 GMT
@Jason:
re. "Hive (…) just assumes things are in the system's local timezone", just
to clarify - this is not true in case of conversions (from_unixtime()) as
it respects the local system TZ settings hence the problem.
TZ itself is a very hairy subject and would definitely be a big
undertaking. Extending from_unixtime seems like easiest solution for now.
Happy to do ER in JIRA but haven't done this for before...

@Nitin
Would be very grateful if you're able to dig it out! Thanks!

Best Regards


On Thu, Nov 6, 2014 at 7:48 AM, Jason Dere <jdere@hortonworks.com> wrote:

> That would be great!
>
> On Nov 5, 2014, at 10:49 PM, Nitin Pawar <nitinpawar432@gmail.com> wrote:
>
> May be a JIRA ?
>
> I remember having my own UDF for doing this. If possible I will share the
> code
>
> On Thu, Nov 6, 2014 at 6:22 AM, Jason Dere <jdere@hortonworks.com> wrote:
>
>> Hive should probably at least provide a timezone option to
>> from_unixtime().
>> As you mentioned, Hive doesn't really do any timezone handling, just
>> assumes things are in the system's local timezone. It will be a bit of a
>> bigger project to add better time zone handling to Hive timestamps.
>>
>>
>> On Nov 5, 2014, at 7:18 AM, Maciek <maciek@sonra.io> wrote:
>>
>> I see… and confirm, it's consistent with Linux/Unix output I get:
>> date -r 0
>> Thu  1 Jan 1970 01:00:00 IST
>>
>> date
>> Wed  5 Nov 2014 14:49:52 GMT
>> Got some digging and it actually makes sense. Turns out Ireland didn't
>> observe daylight saving time in years 1968-1972 as set permanently to
>> GMT+1=IST.
>>
>> Anyway, back to Hive
>> I'm trying to convert unix_times to UTC (using from_unixtime UDF )but
>> due to the issue it I'm getting different results on different servers (TZ
>> settings)
>> Is there any way influence that behaviour without changing timezone on
>> the server?
>>
>> Oracle for that instance offers a good few options to facilitate timezone
>> conversion, among the others:
>> 'AT TIME ZONE [GMT]' clause
>> ALTER SESSION SET TIME_ZONE [= 'GMT']
>> or
>> to_timestamp_tz() function
>>
>> Currently it seems, the only way to perform this conversion is to detect
>> server settings first (won't work at all for some cases like though JDBC
>> connection I think) and apply the shift during the process.
>>
>> Would be really nice if Hive offers some elegant way to support this.
>> I'm thinking of similar ALTER SESSION statement equivalent, maybe
>> parameter SET in hive or extra parameter for the from_unixtime() Hive
>> function?
>>
>> On Mon, Nov 3, 2014 at 10:33 PM, Jason Dere <jdere@hortonworks.com>
>> wrote:
>>
>>>
>>> As Nitin mentions, the behavior is "to a string representing the
>>> timestamp of that moment in the current system time zone".  What are the
>>> timezone settings on your machine?
>>>
>>> $ TZ="GMT" date -r 0
>>> Thu Jan  1 00:00:00 GMT 1970
>>>
>>> $ TZ="UTC" date -r 0
>>> Thu Jan  1 00:00:00 UTC 1970
>>>
>>> $ TZ="Europe/London" date -r 0
>>> Thu Jan  1 01:00:00 BST 1970
>>>
>>> $ TZ="Europe/Dublin" date -r 0
>>> Thu Jan  1 01:00:00 IST 1970
>>>
>>> On Nov 3, 2014, at 12:50 PM, Maciek <maciek@sonra.io> wrote:
>>>
>>> I'd consider this behaviour as a bug and would like to raise it as such.
>>> Is there anyone to confirm it's the same on Hive 0.14?
>>>
>>> On Fri, Oct 31, 2014 at 3:41 PM, Maciek <maciek@sonra.io> wrote:
>>>
>>>> Actually confirmed! It's down to the timezone settings
>>>> I've moved temporarily server/client settings to 'Atlantic/Reykjavik'
>>>> (no change in time comparing to what I was on (GMT), but it's permanent UTC
>>>> and as such doesn't observe daylight saving.
>>>> I believe this shouldn't matter (see my points from previous mail) but
>>>> apparently there's an issue with it.
>>>> Not sure how to deal with this situation (can't just change TZ settings
>>>> everywhere because of Hive) and don't want to hardcode anything.
>>>> I'm on Hive 0.13.
>>>> Does Hive 0.14 provide better support for TimeZones?
>>>>
>>>>
>>>> On Fri, Oct 31, 2014 at 3:25 PM, Maciek <maciek@sonra.io> wrote:
>>>>
>>>>> Thought about that myself based on my prior (bad) experience when
>>>>> tried to working with timezones in Hive (functionality pretty much doesn't
>>>>> exists)
>>>>> That shouldn't be the case here though, here's why:
>>>>> in Oracle [timestamp with timezone] can be adjusted when
>>>>> sent/displayed on the client based on client's settings. This may be
also
>>>>> relevant if the timestamp in question would fall onto client's daily
saving
>>>>> time period. This behaviour would make sense to me, however:
>>>>>
>>>>> • this is server, not client settings we're talking about here
>>>>> • the server and client do reside in the same timezone anyway, which
>>>>> is currently GMT [UTC]
>>>>>
>>>>> • while we observe the daily saving here [Dublin] the time in question
>>>>> ("1970-01-01 00:00:00") is not in that period, neither the time I'm sending
>>>>> the query (now).
>>>>>
>>>>>
>>>>>
>>>>> Based on all above, I don't see the reason the time gets shifted by
>>>>> one hour, but I realise the issue might be down to the general problems
in
>>>>> Hive' implementation of timezones…
>>>>>
>>>>> On Fri, Oct 31, 2014 at 12:26 PM, Nitin Pawar <nitinpawar432@gmail.com
>>>>> > wrote:
>>>>>
>>>>>> In hive from_unixtime is returned from the timezone which you belong
>>>>>> to
>>>>>> "From document : from_unixtime(bigint unixtime[, string format])
:
>>>>>> Converts the number of seconds from unix epoch (1970-01-01 00:00:00
UTC) to
>>>>>> a string representing the timestamp of that moment in the current
system
>>>>>> time zone in the format of "1970-01-01 00:00:00".
>>>>>>
>>>>>> if possible can you also check by changing the timezone to UTC on
>>>>>> your machine?
>>>>>>
>>>>>>>
>>>>>>> On Fri, Oct 31, 2014 at 12:00 PM, Maciek <maciek@sonra.io>
wrote:
>>>>>>>
>>>>>>>> Any reason why
>>>>>>>>
>>>>>>>> select from_unixtime(0) t0 FROM …
>>>>>>>> gives
>>>>>>>>
>>>>>>>> 1970-01-01 01:00:00
>>>>>>>> ?
>>>>>>>>
>>>>>>>> By all available definitions (epoch, from_unixtime etc..)
I would
>>>>>>>> expect it to be 1970-01-01 00:00:00…?
>>>>>>>>
>>>>>>>

Mime
View raw message