oozie-user mailing list archives

From "David Morel" <david.mo...@amakuru.net>
Subject Re: Questions about oozie timezone
Date Mon, 24 Aug 2015 13:03:48 GMT
On 16 Jun 2015, at 2:00, Laurent H wrote:

> I've got the same issue, Jian. It would be great to have an answer from
> the Oozie experts! ;)
>
> --
> Laurent HATIER - Consultant Big Data & Business Intelligence at CapGemini
> fr.linkedin.com/pub/laurent-hatier/25/36b/a86/
> <http://fr.linkedin.com/pub/laurent-h/25/36b/a86/>
>
> 2015-06-15 12:51 GMT+02:00 朱健 <zhujian@jd.com>:
>
>> Hi,
>>
>> Thanks for reading this email.
>>
>> I have used Oozie for about 2 years, and I have now run into a problem
>> with time zones.
>>
>> Because we are located in the GMT+08:00 timezone, our Hadoop system
>> follows the convention that all data paths on HDFS are named using
>> GMT+08:00 time. That means:
>> At UTC 2015-01-01T00:00Z, the hourly output data is located under
>> $root/2015010108, not $root/2015010100
>> At UTC 2015-01-01T01:00Z, the hourly output data is located under
>> $root/2015010109, not $root/2015010101
>>
>> So if I set the timezone in the coord to UTC, the Oozie job will read
>> the data for hour 00, but I want it to read hour 08. For me in Beijing,
>> China, it is natural to expect that the Oozie job will read the 08
>> data at local 08:00.
>>
>> I also tried setting the timezone to GMT+08:00, but it didn’t work.
>> It seems the timezone only affects “Daylight Saving Time” handling.
>>
>> Currently I add 8 to my instance number in the coord as a temporary
>> fix: changing <instance>0</instance> to <instance>8</instance>.
>> This may be acceptable for hourly jobs, but it is really ugly for
>> minute-based or daily jobs, and almost unreadable for humans.
>>
>> So how can I solve this problem?
>>
>> Thanks,
>> Jian
>>

Hi,

The timezone attribute on the coordinator node only serves to figure out
whether there are 23, 24 or 25 hours on a given day (DST switches).
Timezone calculations and anything related to time offsets are done in
the datasets section. Try something like:

<coordinator-app xmlns="uri:oozie:coordinator:0.1" timezone="UTC"
     name="${appName}"
     frequency="${coord:hours(1)}"
     start="${startTime}"
     end="${endTime}"
    >
...
     <datasets>
         <dataset
             name="hourly-partition"
             frequency="${coord:hours(1)}"
             initial-instance="${startTime}"
             timezone="Asia/Shanghai">
             <uri-template><!-- whatever path -->/yyyymmddhh=${YEAR}${MONTH}${DAY}${HOUR}</uri-template>
         </dataset>
     </datasets>

     <input-events>
         <data-in name="in" dataset="hourly-partition">
             <instance>${coord:current(coord:tzOffset()/60)}</instance>
         </data-in>
     </input-events>
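
To see why this lands on the 08 folder, here is a rough sketch of the
arithmetic in Python. It is not Oozie code: the function names are
made up for illustration, and it assumes that coord:tzOffset() returns
the dataset timezone's offset from the coordinator timezone in minutes
and that the uri-template fields are rendered from the UTC instance
time. It only covers whole-hour offsets.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # Python 3.9+, needs system tz data


def tz_offset_minutes(dataset_tz: str, coord_tz: str, at: datetime) -> int:
    """Hypothetical stand-in for coord:tzOffset(): minutes by which
    dataset_tz is ahead of coord_tz at the instant `at`."""
    dataset_offset = at.astimezone(ZoneInfo(dataset_tz)).utcoffset()
    coord_offset = at.astimezone(ZoneInfo(coord_tz)).utcoffset()
    return int((dataset_offset - coord_offset).total_seconds() // 60)


def resolved_path(nominal_utc: datetime, dataset_tz: str,
                  coord_tz: str = "UTC") -> str:
    """Mimic ${coord:current(coord:tzOffset()/60)} for an hourly dataset:
    shift the nominal time by the offset in hours, then format the
    result the way ${YEAR}${MONTH}${DAY}${HOUR} would."""
    hours = tz_offset_minutes(dataset_tz, coord_tz, nominal_utc) // 60
    instance = nominal_utc + timedelta(hours=hours)
    return instance.strftime("%Y%m%d%H")


if __name__ == "__main__":
    nominal = datetime(2015, 1, 1, 0, 0, tzinfo=timezone.utc)
    print(tz_offset_minutes("Asia/Shanghai", "UTC", nominal))  # 480
    print(resolved_path(nominal, "Asia/Shanghai"))             # 2015010108
```

So with the dataset in Asia/Shanghai the offset is 480 minutes, the
instance becomes current(8), and the run at 2015-01-01T00:00Z reads
2015010108 without hardcoding the 8.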

David
