Date: Thu, 14 May 2015 00:18:06 +0000 (UTC)
From: "Adam B (JIRA)"
To: issues@mesos.apache.org
Reply-To: dev@mesos.apache.org
Subject: [jira] [Commented] (MESOS-1554) Persistent resources support for storage-like services

    [ https://issues.apache.org/jira/browse/MESOS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542968#comment-14542968 ]

Adam B commented on MESOS-1554:
-------------------------------

This Epic/feature is critical for stateful frameworks in Mesos 0.23 and beyond. Upgraded Priority to Critical.

> Persistent resources support for storage-like services
> ------------------------------------------------------
>
>                 Key: MESOS-1554
>                 URL: https://issues.apache.org/jira/browse/MESOS-1554
>             Project: Mesos
>          Issue Type: Epic
>          Components: general, hadoop
>            Reporter: Nikita Vetoshkin
>            Priority: Critical
>              Labels: twitter
>
> This question came up in [dev mailing list|http://mail-archives.apache.org/mod_mbox/mesos-dev/201406.mbox/%3CCAK8jAgNDs9Fe011Sq1jeNr0h%3DE-tDD9rak6hAsap3PqHx1y%3DKQ%40mail.gmail.com%3E].
> It seems reasonable for storage-like services (e.g. HDFS or Cassandra) to use Mesos to manage their instances. But right now, if we'd like to restart an instance (e.g. to spin up a new version), all sandbox filesystem resources of the previous instance version will be recycled by the slave's garbage collector.
> At the moment filesystem resources can be managed out of band, i.e. instances can save their data in some database-specific place that various instances can share (e.g. {{/var/lib/cassandra}}).
> [~benjaminhindman] suggested an idea in the mailing list (though it still needs some fleshing out):
> {quote}
> The idea originally came about because, even today, if we allocate some
> file system space to a task/executor, and then that task/executor
> terminates, we haven't officially "freed" those file system resources until
> after we garbage collect the task/executor sandbox! (We keep the sandbox
> around so a user/operator can get the stdout/stderr or anything else left
> around from their task/executor.)
> To solve this problem we wanted to be able to let a task/executor terminate
> but not *give up* all of its resources, hence: persistent resources.
> Pushing this concept even further you could imagine always reallocating
> resources to a framework that had already been allocated those resources
> for a previous task/executor. Looked at from another perspective, these are
> "late-binding", or "lazy", resource reservations.
> At one point in time we had considered just doing 'right-of-first-refusal'
> for allocations after a task/executor terminates. But this is really
> insufficient for supporting storage-like frameworks well (and likely even
> harder to reliably implement than 'persistent resources' IMHO).
> There are a ton of things that need to get worked out in this model,
> including (but not limited to): how should a file system (or disk) be
> exposed in order to be made persistent? How should persistent resources be
> returned to a master? How many persistent resources can a framework get
> allocated?
> {quote}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)