airflow-dev mailing list archives

From Bolke de Bruin <bdbr...@gmail.com>
Subject Re: [VOTE]: Using fractional seconds
Date Sun, 13 Nov 2016 22:08:50 GMT
I have merged the change. Please note that in case you are adding DateTime fields to the database
you will need to do the following for MySQL:

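from alembic import context, op
import sqlalchemy as sa
from sqlalchemy.dialects import mysql

# Only MySQL needs an explicit fractional-seconds precision (fsp).
if context.config.get_main_option('sqlalchemy.url').startswith('mysql'):
    op.add_column('dag', sa.Column('last_scheduler_run', mysql.DATETIME(fsp=6)))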

The key is the type and setting fsp=6.
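
To illustrate why the precision matters, here is a minimal standard-library sketch of the two problems raised in the thread quoted below: the exact-match comparison that stops working, and runs launched within the same second colliding. It does not talk to a database. The helpers store_without_fsp and store_with_fsp6 are hypothetical stand-ins for a column with and without fsp=6, and dropping the microseconds via replace(microsecond=0) is an assumption about how a column without fsp behaves (a real server may round instead of truncate).

```python
from datetime import datetime, timedelta


def store_without_fsp(value: datetime) -> datetime:
    """Hypothetical stand-in for a DATETIME column with no fractional part."""
    # Assumption: the fractional part is simply dropped on storage.
    return value.replace(microsecond=0)


def store_with_fsp6(value: datetime) -> datetime:
    """Hypothetical stand-in for DATETIME(fsp=6): microseconds survive."""
    return value


# 2016-01-01 00:00:00.000001, the value used in the quoted example.
execution_date = datetime(2016, 1, 1, 0, 0, 0, 1)

# An exact-match lookup with the in-memory value only succeeds
# when the stored value kept its microseconds.
found_without_fsp = store_without_fsp(execution_date) == execution_date
found_with_fsp6 = store_with_fsp6(execution_date) == execution_date

# Two runs launched 1 ms apart collapse to one key at second precision.
first = datetime(2016, 11, 8, 6, 39, 0, 250000)
second = first + timedelta(milliseconds=1)
keys_without_fsp = {store_without_fsp(first), store_without_fsp(second)}
keys_with_fsp6 = {store_with_fsp6(first), store_with_fsp6(second)}

print(found_without_fsp, found_with_fsp6)          # False True
print(len(keys_without_fsp), len(keys_with_fsp6))  # 1 2
```

With second precision the lookup misses and the two runs share one key; with six fractional digits both cases behave the way the other databases already do.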

- Bolke

> On 13 Nov 2016, at 20:44, siddharth anand <sanand@apache.org> wrote:
> 
> SGTM
> 
> On Sun, Nov 13, 2016 at 12:02 PM, Bolke de Bruin <bdbruin@gmail.com> wrote:
> 
>> Hi All,
>> 
>> I count 3 positive votes, 0 negative ones. Therefore, I will finalize
>> https://github.com/apache/incubator-airflow/pull/1794 which implements
>> Option 1.
>> 
>> Thanks!
>> Bolke
>> 
>>> On 9 Nov 2016, at 22:48, Arthur Wiedmer <arthur.wiedmer@gmail.com> wrote:
>>> 
>>> Hi all,
>>> 
>>> I was the main proponent of option 2, mostly because I could not see a
>>> specific situation where sub-second precision was needed for this.
>>> 
>>> However, I feel that we have heard from the community that there are use
>>> cases out there. I agree with Bolke's analysis of the increased operational
>>> cost of maintaining option 2.
>>> 
>>> I vote for option 1.
>>> 
>>> Best regards,
>>> Arthur
>>> 
>>> On Tue, Nov 8, 2016 at 10:40 AM, Maxime Beauchemin <
>>> maximebeauchemin@gmail.com> wrote:
>>> 
>>>> I vote for option 1.
>>>> 
>>>> We may want to alter the previous database migration script to include a
>>>> MySQL-specific `try` block so that fresh installs get it right.
>>>> 
>>>> We may also want a new, MySQL-specific database migration that ALTERs
>>>> the columns properly. It seems to me, though, that this might require
>>>> high-level locks and take some time to execute on large tables (I'm thinking
>>>> of `task_instance`). No one likes to see a database migration script hang for
>>>> minutes... An alternative approach might be for someone in the community to
>>>> share a script that does this, which people can review and then decide whether
>>>> and when to run it, perhaps after archiving some of the large tables in their
>>>> environment.
>>>> 
>>>> Max
>>>> 
>>>> On Tue, Nov 8, 2016 at 6:39 AM, Vishal Doshi <vishal@celect.com> wrote:
>>>> 
>>>>> We have an (atypical) use case where one DAG launches multiple runs of
>>>>> another DAG (but with different parameters). Without the precision, we have
>>>>> to add a second between each launch to avoid the database issues. Moving
>>>>> towards allowing fractional seconds would be great for us.
>>>>> 
>>>>> Thanks,
>>>>> Vishal
>>>>> 
>>>>> On 11/8/16, 04:29, "Bolke de Bruin" <bdbruin@gmail.com> wrote:
>>>>> 
>>>>>   Dear All,
>>>>> 
>>>>>   I’m trying to move the testing infrastructure over to the new
>>>>> infrastructure based on Ubuntu 14.04 (we are on 12.04 now). 12.04 uses
>>>>> MySQL 5.5 and 14.04 allows the use of MySQL 5.6, which we say we are
>>>>> compatible with. MySQL does not store fractional seconds by default. Until
>>>>> version 5.6.4 (https://dev.mysql.com/doc/refman/5.6/en/fractional-seconds.html)
>>>>> it cut off fractional seconds at comparison time, e.g. comparing
>>>>> “2016-01-01 00:00:00.000001” against what is stored in MySQL, “2016-01-01
>>>>> 00:00:00”, would return a tuple before 5.6.4 but fails from 5.6.4 onward. The
>>>>> issue presents itself if you use the “@once” schedule interval.
>>>>> 
>>>>>   Other databases (Postgres, SQLite, etc.) store fractional seconds by
>>>>> default and so do not exhibit this error. Since 5.6.4, MySQL can also store
>>>>> fractional seconds, but for backwards compatibility this needs to be
>>>>> specified in the schema. Also note that the MySQL behavior (not storing
>>>>> fractional seconds) goes against the SQL standard, as MySQL itself notes
>>>>> (http://dev.mysql.com/doc/refman/5.7/en/fractional-seconds.html).
>>>>> 
>>>>>   There are two solutions to this issue:
>>>>> 
>>>>>   1. Update the schema for MySQL to include fractional seconds.
>>>>>   PRO:
>>>>>   - no coding changes
>>>>>   - makes MySQL conform to the SQL standard
>>>>>   - easier to maintain
>>>>>   - future proof
>>>>> 
>>>>>   CON:
>>>>>   - the schema needs to be maintained
>>>>>   - requires an update to the schema of running mysql instances
>>>>> 
>>>>>   2. Change the code to remove fractional seconds (particularly .now()
>>>>> invocations)
>>>>>   PRO:
>>>>>   - No impact on running MySQL instances
>>>>> 
>>>>>   CON:
>>>>>   - Impact on other databases that now lose precision, and might for a
>>>>> brief time show different behavior
>>>>>   - Code to maintain, cannot use .now() directly
>>>>>   - Need to be very careful when using datetimes and accessing the DB
>>>>> 
>>>>> 
>>>>>   There was some back-and-forth discussion on Gitter about this, but we
>>>>> don’t seem to reach a conclusion. Hence I would like to call for a vote,
>>>>> on this election day :). Of course with arguments if needed. If there is a
>>>>> better way, I’m of course open to that.
>>>>> 
>>>>> 
>>>>>   I vote for OPTION 1.
>>>>> 
>>>>>   Bolke
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>> 
>> 

