aurora-issues mailing list archives

From "Maxim Khutornenko (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AURORA-1494) Scheduler fails to start due to thrift/SQL schema data type mismatch
Date Fri, 18 Sep 2015 00:24:04 GMT
Maxim Khutornenko created AURORA-1494:
-----------------------------------------

             Summary: Scheduler fails to start due to thrift/SQL schema data type mismatch
                 Key: AURORA-1494
                 URL: https://issues.apache.org/jira/browse/AURORA-1494
             Project: Aurora
          Issue Type: Bug
          Components: Scheduler
            Reporter: Maxim Khutornenko
            Assignee: Maxim Khutornenko
            Priority: Blocker


After https://reviews.apache.org/r/38288, we are unable to upgrade the scheduler in one of our
clusters due to the following failure on restart:
{noformat}
### Cause: org.h2.jdbc.JdbcSQLException: Numeric value out of range: "3174400000031744"; SQL
statement:
INSERT INTO task_configs (
      job_key_id,
      creator_user,
      service,
      num_cpus,
      ram_mb,
      disk_mb,
      priority,
      max_task_failures,
      production,
      contact_email,
      executor_name,
      executor_data,
      tier
    ) VALUES (
      (
        SELECT ID
        FROM job_keys
        WHERE role = ?
          AND environment = ?
          AND name = ?
      ),
      ?,
      ?,
      ?,
      ?,
      ?,
      ?,
      ?,
      ?,
      ?,
{noformat}

This appears to be due to a type mismatch between TaskConfig.diskMb (i64) and task_configs.disk_mb
(INT): the value 3174400000031744 fits in an i64 but exceeds the range of INT.
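
A quick arithmetic check illustrates the mismatch. This is only a sketch in Python, not scheduler code; it assumes the thrift i64 is a 64-bit signed integer and the H2 INT column is a 32-bit signed integer, and the helper name fits is hypothetical.

```python
# Value taken from the exception message in this report.
DISK_MB = 3174400000031744


def fits(value, bits):
    """Return True if value fits in a signed two's-complement integer of `bits` width."""
    low = -(1 << (bits - 1))
    high = (1 << (bits - 1)) - 1
    return low <= value <= high


print(fits(DISK_MB, 64))  # width assumed for the thrift i64 field
print(fits(DISK_MB, 32))  # width assumed for the SQL INT column
```

The first check passes and the second does not, which matches the "Numeric value out of range" error raised on insert.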

A possible real-life scenario:
- a user creates a job with an oversized resource requirement and the job fails to schedule
- the user realizes the mistake and attempts to correct it by running {{aurora update start}}
- the scheduler creates an instance of the JobUpdate with the oversized TaskConfig as its initial
state and persists it in the log
- the scheduler restarts on a new version (with the patch above) and attempts to reload job updates
from the log, but now, instead of storing TaskConfigs as binary blobs, it attempts to insert them
into the task_configs table, where the resource columns have a narrower type.
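
The scenario above can be modelled with a small simulation. This is a hedged sketch in Python, not the scheduler's storage code: the function names are hypothetical, 64-bit packing stands in for the old binary-blob storage, and 32-bit packing stands in for the narrower INT column.

```python
import struct


def store_as_blob(disk_mb):
    # Assumed model of the old behaviour: the value is serialized as a
    # 64-bit signed integer, so oversized values are accepted.
    return struct.pack(">q", disk_mb)


def insert_into_int_column(blob):
    # Assumed model of the new behaviour: the value must fit a 32-bit
    # signed column. struct.pack raises struct.error when it does not.
    (disk_mb,) = struct.unpack(">q", blob)
    return struct.pack(">i", disk_mb)


def replay(log):
    """Return the indexes of log entries that fail when replayed."""
    failed = []
    for index, blob in enumerate(log):
        try:
            insert_into_int_column(blob)
        except struct.error:
            failed.append(index)
    return failed


# One ordinary entry and the oversized value from the exception message.
log = [store_as_blob(1024), store_as_blob(3174400000031744)]
print(replay(log))
```

Both entries are written to the log without complaint; only the replay into the narrower column rejects the oversized one, which is why the failure shows up on restart rather than when the update was created.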



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
