Mailing-List: contact issues-help@aurora.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@aurora.apache.org
Date: Fri, 2 Dec 2016 23:53:58 +0000 (UTC)
From: "Santhosh Kumar Shanmugham (JIRA)" <jira@apache.org>
To: issues@aurora.apache.org
Message-ID: <JIRA.13025256.1480722832000.428625.1480722838498@Atlassian.JIRA>
In-Reply-To: <JIRA.13025256.1480722832000@Atlassian.JIRA>
References: <JIRA.13025256.1480722832000@Atlassian.JIRA> <JIRA.13025256.1480722832487@arcas>
Subject: [jira] [Created] (AURORA-1844) Force a snapshot at the end of
 startup.
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Fri, 02 Dec 2016 23:54:00 -0000

Santhosh Kumar Shanmugham created AURORA-1844:
-------------------------------------------------

             Summary: Force a snapshot at the end of startup.
                 Key: AURORA-1844
                 URL: https://issues.apache.org/jira/browse/AURORA-1844
             Project: Aurora
          Issue Type: Task
            Reporter: Santhosh Kumar Shanmugham
            Priority: Minor


When the scheduler starts up, it replays the entries in the replicated log to catch up with the current state before announcing itself as the leader to the outside world. If the scheduler dies for any reason after this replay, having appended more log entries in the meantime, the next startup has to replay all of that work again. This becomes a problem when the amount of additional work is not trivial, and it can send the scheduler into a death spiral. One example is the TaskHistoryPruner, which cleans up the DB but adds log entries in the process. To avoid the repeated work, the scheduler should force a snapshot after the initial replay.
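
To illustrate the effect, here is a toy model, not Aurora code. ToyLog, start_scheduler, replay_costs and all the numbers are hypothetical names and values made up for this sketch. It assumes that each startup replays every entry written since the last snapshot, that a pruner-like task then appends a fixed number of entries, and that the process dies before any periodic snapshot runs.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLog:
    """Hypothetical stand-in for the replicated log: a snapshot plus a tail."""
    snapshot_size: int = 0                     # entries folded into the last snapshot
    tail: list = field(default_factory=list)   # entries written since that snapshot

    def append(self, entry):
        self.tail.append(entry)

    def snapshot(self):
        # Fold the tail into the snapshot so later replays can skip it.
        self.snapshot_size += len(self.tail)
        self.tail.clear()


def start_scheduler(log, force_snapshot, pruner_entries):
    """Simulate one startup that crashes after the pruner has written to the log.

    Returns the number of tail entries that had to be replayed.
    """
    replayed = len(log.tail)            # replay everything since the last snapshot
    if force_snapshot:
        log.snapshot()                  # proposed change: snapshot right after replay
    for i in range(pruner_entries):     # e.g. a history pruner recording its deletions
        log.append(("prune", i))
    return replayed                     # the process "dies" here


def replay_costs(force_snapshot, restarts=4, initial_tail=1000, pruner_entries=500):
    """Replay cost of each startup across a series of crash/restart cycles."""
    log = ToyLog(tail=[("op", i) for i in range(initial_tail)])
    return [start_scheduler(log, force_snapshot, pruner_entries)
            for _ in range(restarts)]


if __name__ == "__main__":
    print("without forced snapshot:", replay_costs(False))  # [1000, 1500, 2000, 2500]
    print("with forced snapshot:   ", replay_costs(True))   # [1000, 500, 500, 500]
```

In this model the replay cost grows on every restart without the forced snapshot, and stays bounded by the work of a single run with it. The real scheduler's behaviour depends on details not modelled here.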


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)