aurora-issues mailing list archives

From "Zameer Manji (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AURORA-1494) Scheduler fails to start due to thrift/SQL schema data type mismatch
Date Tue, 10 Nov 2015 18:00:16 GMT

     [ https://issues.apache.org/jira/browse/AURORA-1494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zameer Manji updated AURORA-1494:
---------------------------------
    Fix Version/s: 0.10.0

> Scheduler fails to start due to thrift/SQL schema data type mismatch
> --------------------------------------------------------------------
>
>                 Key: AURORA-1494
>                 URL: https://issues.apache.org/jira/browse/AURORA-1494
>             Project: Aurora
>          Issue Type: Bug
>          Components: Scheduler
>            Reporter: Maxim Khutornenko
>            Assignee: Maxim Khutornenko
>            Priority: Blocker
>             Fix For: 0.10.0
>
>
> After https://reviews.apache.org/r/38288 we are unable to upgrade the scheduler in one of our clusters due to the following failure on restart:
> {noformat}
> ### Cause: org.h2.jdbc.JdbcSQLException: Numeric value out of range: "3174400000031744"; SQL statement:
> INSERT INTO task_configs (
>       job_key_id,
>       creator_user,
>       service,
>       num_cpus,
>       ram_mb,
>       disk_mb,
>       priority,
>       max_task_failures,
>       production,
>       contact_email,
>       executor_name,
>       executor_data,
>       tier
>     ) VALUES (
>       (
>         SELECT ID
>         FROM job_keys
>         WHERE role = ?
>           AND environment = ?
>           AND name = ?
>       ),
>       ?,
>       ?,
>       ?,
>       ?,
>       ?,
>       ?,
>       ?,
>       ?,
>       ?,
> {noformat}
> This appears to be due to a type mismatch between TaskConfig.diskMb (i64) and task_configs.disk_mb (INT): the value in the error, 3174400000031744, exceeds the 32-bit INT maximum of 2147483647.
> A possible real-life scenario:
> - A user creates a job with an oversized resource requirement, and the job fails to schedule.
> - The user realizes the mistake and attempts to correct it by running {{aurora update start}}.
> - The scheduler creates a JobUpdate instance with the oversized TaskConfig as its initial state and persists it in the log.
> - The scheduler restarts on a new version (with the patch above) and attempts to reload job updates from the log. Instead of storing TaskConfigs as binary blobs, it now attempts to insert them into the task_configs table, where the resource columns have a narrower type.
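
The range problem described above can be illustrated with a minimal sketch. It is not Aurora code: the helper names fits_sql_int and oversized_fields and the sample config dictionary are hypothetical, and the only values taken from the report are the field name diskMb and the number from the error message. It assumes the column is a signed 32-bit INT and the thrift field is a signed 64-bit i64.

```python
# Bounds of a signed 32-bit SQL INT column (assumed width of task_configs.disk_mb).
INT_MIN = -2**31
INT_MAX = 2**31 - 1


def fits_sql_int(value: int) -> bool:
    """Return True if value can be stored in a signed 32-bit INT column."""
    return INT_MIN <= value <= INT_MAX


def oversized_fields(task_config: dict) -> list:
    """Return the sorted names of fields whose values overflow an INT column."""
    return sorted(
        name for name, value in task_config.items() if not fits_sql_int(value)
    )


# Hypothetical TaskConfig-like mapping; diskMb carries the value from the error.
config = {"ramMb": 1024, "diskMb": 3174400000031744}
print(oversized_fields(config))  # prints ['diskMb']
```

A check of this kind, run before the INSERT, would flag the offending field by name instead of failing inside the SQL layer. Whether to reject such values or widen the columns is a separate design decision that this sketch does not address.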



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
