Return-Path: X-Original-To: apmail-hive-user-archive@www.apache.org Delivered-To: apmail-hive-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 9682217204 for ; Thu, 6 Nov 2014 06:49:58 +0000 (UTC) Received: (qmail 86392 invoked by uid 500); 6 Nov 2014 06:49:56 -0000 Delivered-To: apmail-hive-user-archive@hive.apache.org Received: (qmail 86339 invoked by uid 500); 6 Nov 2014 06:49:56 -0000 Mailing-List: contact user-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hive.apache.org Delivered-To: mailing list user@hive.apache.org Received: (qmail 86325 invoked by uid 99); 6 Nov 2014 06:49:56 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 06 Nov 2014 06:49:56 +0000 X-ASF-Spam-Status: No, hits=1.7 required=5.0 tests=FREEMAIL_ENVFROM_END_DIGIT,HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of nitinpawar432@gmail.com designates 209.85.160.175 as permitted sender) Received: from [209.85.160.175] (HELO mail-yk0-f175.google.com) (209.85.160.175) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 06 Nov 2014 06:49:30 +0000 Received: by mail-yk0-f175.google.com with SMTP id q9so1038028ykb.6 for ; Wed, 05 Nov 2014 22:49:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=wjxW8dm6DiCfkxzPv0DNxiWUZ/fLsf6Jkfv+OqcgeCg=; b=EjFrXF0qZ+lRoehIUXxqjESld7ahSUCJiJCbbVy514xTWloJyy++NW4BCZne24O/Q6 eXnsOwW6gnAdkw2zPR2oGNu6KHNZ1QajB5Ht33zkqJp/aqPCun96Gvkg4HE/SefzlJq0 6APpzUf+xRHUgqoEqS7/018UprrNHy6Y2xIr+tkTDvoz5UvwdUz0dqnA5ps43fwIdEql x+MBuBMva7CZRfrQ/jSIlMgoZeb3DeWBlIJNrMcaTvTqhSRWmX2kFPXWg9wvVFNPe9QW 9TzdsAcaVSg1YMqcKocSj1oFPcPAOVJAnDfNPZ6Dq5tRA0RaeuSIKJasvdF6EZFScDVJ LiRQ== MIME-Version: 1.0 X-Received: by 10.236.61.11 with SMTP id v11mr1878899yhc.55.1415256569134; Wed, 05 Nov 2014 22:49:29 -0800 (PST) Received: by 10.170.61.22 with HTTP; Wed, 5 Nov 2014 22:49:29 -0800 (PST) In-Reply-To: <4B345C5B-64B3-4459-B4DB-5F2DBB73285A@hortonworks.com> References: <9905B2DC-0609-46E6-97DA-F10FF43E9942@hortonworks.com> <4B345C5B-64B3-4459-B4DB-5F2DBB73285A@hortonworks.com> Date: Thu, 6 Nov 2014 12:19:29 +0530 Message-ID: Subject: Re: from_unixtime() and epoch definition From: Nitin Pawar To: "user@hive.apache.org" Content-Type: multipart/alternative; boundary=089e0158bb8c11c50a05072b1974 X-Virus-Checked: Checked by ClamAV on apache.org --089e0158bb8c11c50a05072b1974 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable May be a JIRA ? I remember having my own UDF for doing this. If possible I will share the code On Thu, Nov 6, 2014 at 6:22 AM, Jason Dere wrote: > Hive should probably at least provide a timezone option to > from_unixtime(). > As you mentioned, Hive doesn't really do any timezone handling, just > assumes things are in the system's local timezone. It will be a bit of a > bigger project to add better time zone handling to Hive timestamps. > > > On Nov 5, 2014, at 7:18 AM, Maciek wrote: > > I see=E2=80=A6 and confirm, it's consistent with Linux/Unix output I get: > date -r 0 > Thu 1 Jan 1970 01:00:00 IST > > date > Wed 5 Nov 2014 14:49:52 GMT > Got some digging and it actually makes sense. Turns out Ireland didn't > observe daylight saving time in years 1968-1972 as set permanently to > GMT+1=3DIST. > > Anyway, back to Hive > I'm trying to convert unix_times to UTC (using from_unixtime UDF )but due > to the issue it I'm getting different results on different servers (TZ > settings) > Is there any way influence that behaviour without changing timezone on th= e > server? > > Oracle for that instance offers a good few options to facilitate timezone > conversion, among the others: > 'AT TIME ZONE [GMT]' clause > ALTER SESSION SET TIME_ZONE [=3D 'GMT'] > or > to_timestamp_tz() function > > Currently it seems, the only way to perform this conversion is to detect > server settings first (won't work at all for some cases like though JDBC > connection I think) and apply the shift during the process. > > Would be really nice if Hive offers some elegant way to support this. > I'm thinking of similar ALTER SESSION statement equivalent, maybe > parameter SET in hive or extra parameter for the from_unixtime() Hive > function? > > On Mon, Nov 3, 2014 at 10:33 PM, Jason Dere wrote= : > >> >> As Nitin mentions, the behavior is "to a string representing the >> timestamp of that moment in the current system time zone". What are the >> timezone settings on your machine? >> >> $ TZ=3D"GMT" date -r 0 >> Thu Jan 1 00:00:00 GMT 1970 >> >> $ TZ=3D"UTC" date -r 0 >> Thu Jan 1 00:00:00 UTC 1970 >> >> $ TZ=3D"Europe/London" date -r 0 >> Thu Jan 1 01:00:00 BST 1970 >> >> $ TZ=3D"Europe/Dublin" date -r 0 >> Thu Jan 1 01:00:00 IST 1970 >> >> On Nov 3, 2014, at 12:50 PM, Maciek wrote: >> >> I'd consider this behaviour as a bug and would like to raise it as such. >> Is there anyone to confirm it's the same on Hive 0.14? >> >> On Fri, Oct 31, 2014 at 3:41 PM, Maciek wrote: >> >>> Actually confirmed! It's down to the timezone settings >>> I've moved temporarily server/client settings to 'Atlantic/Reykjavik' >>> (no change in time comparing to what I was on (GMT), but it's permanent= UTC >>> and as such doesn't observe daylight saving. >>> I believe this shouldn't matter (see my points from previous mail) but >>> apparently there's an issue with it. >>> Not sure how to deal with this situation (can't just change TZ settings >>> everywhere because of Hive) and don't want to hardcode anything. >>> I'm on Hive 0.13. >>> Does Hive 0.14 provide better support for TimeZones? >>> >>> >>> On Fri, Oct 31, 2014 at 3:25 PM, Maciek wrote: >>> >>>> Thought about that myself based on my prior (bad) experience when trie= d >>>> to working with timezones in Hive (functionality pretty much doesn't e= xists) >>>> That shouldn't be the case here though, here's why: >>>> in Oracle [timestamp with timezone] can be adjusted when sent/displaye= d >>>> on the client based on client's settings. This may be also relevant if= the >>>> timestamp in question would fall onto client's daily saving time perio= d. >>>> This behaviour would make sense to me, however: >>>> >>>> =E2=80=A2 this is server, not client settings we're talking about here >>>> =E2=80=A2 the server and client do reside in the same timezone anyway,= which is >>>> currently GMT [UTC] >>>> >>>> =E2=80=A2 while we observe the daily saving here [Dublin] the time in = question >>>> ("1970-01-01 00:00:00") is not in that period, neither the time I'm se= nding >>>> the query (now). >>>> >>>> >>>> >>>> Based on all above, I don't see the reason the time gets shifted by on= e >>>> hour, but I realise the issue might be down to the general problems in >>>> Hive' implementation of timezones=E2=80=A6 >>>> >>>> On Fri, Oct 31, 2014 at 12:26 PM, Nitin Pawar >>>> wrote: >>>> >>>>> In hive from_unixtime is returned from the timezone which you belong = to >>>>> "From document : from_unixtime(bigint unixtime[, string format]) : >>>>> Converts the number of seconds from unix epoch (1970-01-01 00:00:00 U= TC) to >>>>> a string representing the timestamp of that moment in the current sys= tem >>>>> time zone in the format of "1970-01-01 00:00:00". >>>>> >>>>> if possible can you also check by changing the timezone to UTC on you= r >>>>> machine? >>>>> >>>>>> >>>>>> On Fri, Oct 31, 2014 at 12:00 PM, Maciek wrote: >>>>>> >>>>>>> Any reason why >>>>>>> >>>>>>> select from_unixtime(0) t0 FROM =E2=80=A6 >>>>>>> gives >>>>>>> >>>>>>> 1970-01-01 01:00:00 >>>>>>> ? >>>>>>> >>>>>>> By all available definitions (epoch, from_unixtime etc..) I would >>>>>>> expect it to be 1970-01-01 00:00:00=E2=80=A6? >>>>>>> >>>>>> > > CONFIDENTIALITY NOTICE > NOTICE: This message is intended for the use of the individual or entity > to which it is addressed and may contain information that is confidential= , > privileged and exempt from disclosure under applicable law. If the reader > of this message is not the intended recipient, you are hereby notified th= at > any printing, copying, dissemination, distribution, disclosure or > forwarding of this communication is strictly prohibited. If you have > received this communication in error, please contact the sender immediate= ly > and delete it from your system. Thank You. > --=20 Nitin Pawar --089e0158bb8c11c50a05072b1974 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
May be a JIRA ?

I remember having my ow= n UDF for doing this. If possible I will share the code=C2=A0

On Thu, Nov 6, 2014= at 6:22 AM, Jason Dere <jdere@hortonworks.com> wrote:
H= ive should probably at least provide a timezone option to from_unixtime().= =C2=A0
As you mentioned, Hive doesn't really do any timezone = handling, just assumes things are in the system's local timezone. It wi= ll be a bit of a bigger project to add better time zone handling to Hive ti= mestamps.


On Nov 5= , 2014, at 7:18 AM, Maciek <maciek@sonra.io> wrote:

=
I see=E2=80=A6 and confirm, it's consiste= nt with Linux/Unix output I get:
date -r 0
date
Wed =C2=A05 Nov 2014 14:49:52 GMT
Got some digging and it actually makes sense. Tur= ns out Ireland didn't observe daylight saving time in years 1968-1972 a= s set permanently to GMT+1=3DIST.

Anyway, back to Hive
I'm trying to convert unix_times to UTC (using from_unixtime UDF )but due to the = issue it I'm getting different results on different servers (TZ setting= s)
Is there any way influen= ce that behaviour without changing timezone on the server?

=
Oracle for that instance offers a good = few options to facilitate timezone conversion, among the others:
'AT TIME ZONE [GMT]' clause
ALTER SESSION SET TIME_ZONE [=3D 'GMT']
or
to_timestamp_tz() function

Currently it seems, the only way to perform this conversion is to det= ect server settings first (won't work at all for some cases like though= JDBC connection I think) and apply the shift during the process.

Would be really nice if Hive offers some elegant way to su= pport this.
I'm thinking of similar ALTER SESSION statement equivalen= t, maybe parameter SET in hive or extra parameter for the from_unixtime() H= ive function?

On Mon, Nov 3, 2014 at 10:33 PM, Jason Dere <= span dir=3D"ltr"><jdere@hortonworks.com> wrote:

As Nitin mentions, the be= havior is "to a string representing the timestamp of that moment in th= e current system time zone".=C2=A0 What are the timezone settings on y= our machine?=C2=A0

$ TZ=3D"GMT" dat= e -r 0
Thu Jan =C2=A01 00:00:00 GMT 1970

$ TZ=3D"UTC" date -r 0
Thu Jan =C2=A01 00:00:00 UTC 19= 70

$ TZ=3D"Europe/London" date -r 0
Thu Jan =C2=A01 01:00:00 BST 1970

$ TZ=3D"Europe/Dublin" date -r 0
Thu Jan =C2=A01 01:00= :00 IST 1970

On Nov 3, 2014, at 12:50 PM, Mac= iek <maciek@sonra.i= o> wrote:

I'= d consider this behaviour as a bug and would like to raise it as such.
= Is there anyone to confirm it's the same on Hive 0.14?

On Fri, Oct 31, 2014 a= t 3:41 PM, Maciek <maciek@sonra.io> wrote:
Actually confirmed! It's down to the timezone setting= s
I've moved temporarily server/client settings to 'Atlantic/Re= ykjavik' (no change in time comparing to what I was on (GMT), but it= 9;s permanent UTC and as such doesn't observe daylight saving.
I believe this shouldn't matter (see my points from previous mail= ) but apparently there's an issue with it.
Not sure how to de= al with this situation (can't just change TZ settings everywhere becaus= e of Hive) and don't want to hardcode anything.
I'm on Hi= ve 0.13.
Does Hive 0.14 provide better support for TimeZones?


On Fri, Oct 31, 2014 at 3:25 PM, Maciek <maciek@sonra.io> wrote:
Th= ought about that myself based on my prior (bad) experience when tried to wo= rking with timezones in Hive (functionality pretty much doesn't exists)=
That shouldn't be the case here though= , here's why:
in Oracle [timestamp with= timezone] can be adjusted when sent/displayed on the client based on clien= t's settings. This may be also relevant if the timestamp in question wo= uld fall onto client's daily saving time period. This behaviour would m= ake sense to me, however:
=E2=80=A2 this is serv= er, not client settings we're talking about here
=E2=80=A2 the ser= ver and client do reside in the same timezone anyway, which is currently GM= T [UTC]
=E2=80=A2 while we observe the daily saving here [Dublin] th= e time in question ("1970-01-01 00:00:00") is not in that period,= neither the time I'm sending the query (now).
=C2=A0
Based on all = above, I don't see the reason the time gets shifted by one hour, but I = realise the issue might be down to the general problems in Hive' implem= entation of timezones=E2=80=A6

On Fri, Oct 31, 2014 at 12:26 PM, Nitin Pawar <nitinpawar432@gmail.com> wrote:
In hive from_unixtime is returned from the timezon= e which you belong to
"From document :=C2=A0from_unixtime(bigint u= nixtime[, string format]) : Converts the number of seconds from unix epoch = (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that mo= ment in the current system time zone in the format of "1970-01-01 00:0= 0:00".=C2=A0

if possible can you also check b= y changing the timezone to UTC on your machine?

On= Fri, Oct 31, 2014 at 12:00 PM, Maciek <maciek@sonra.io> wrote= :
Any reason why

select from_unixtime(0) t0 FROM =E2=80=A6

gives

1970-01-01 01:00:00

?

By all available definitions (epoch, from_u= nixtime etc..) I would expect it to be 1970-01-01 00:00:00=E2=80=A6?


CONFIDENTIALITY NOTICE
NOTICE: This message is = intended for the use of the individual or entity to which it is addressed a= nd may contain information that is confidential, privileged and exempt from= disclosure under applicable law. If the reader of this message is not the = intended recipient, you are hereby notified that any printing, copying, dis= semination, distribution, disclosure or forwarding of this communication is= strictly prohibited. If you have received this communication in error, ple= ase contact the sender immediately and delete it from your system. Thank Yo= u.



--
Nitin Pawar
--089e0158bb8c11c50a05072b1974--