Mailing-List: contact core-dev-help@hadoop.apache.org; run by ezmlm
Reply-To: core-dev@hadoop.apache.org
Message-ID: <1118012280.1216623451871.JavaMail.jira@brutus>
Date: Sun, 20 Jul 2008 23:57:31 -0700 (PDT)
From: "dhruba borthakur (JIRA)"
To: core-dev@hadoop.apache.org
Subject: [jira] Commented: (HADOOP-3245) Provide ability to persist running jobs (extend HADOOP-1876)
In-Reply-To: <1443497782.1208162704973.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/HADOOP-3245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615158#action_12615158 ]

dhruba borthakur commented on HADOOP-3245:
------------------------------------------

+1 to Owen's comment. A pressing need of our cluster is to not interrupt running jobs if the jobtracker has to be restarted. This means that job states have to be persisted in the form of a transaction log. This capability is all the more valuable to sites that have long-running job trackers (instead of HOD).

However, wouldn't it be better to be able to store the state in HDFS? It is true that HDFS stores its transaction log in local files, but with the current focus on improving HDFS read/write latencies, HDFS itself is considering whether to store one copy of the transaction log in HDFS blocks (instead of NFS). In fact, if the JobTracker stores its information in an org.apache.hadoop.fs.FileSystem, then a typical customer install could plug in various forms of storage to support the JobTracker transaction log.

> Provide ability to persist running jobs (extend HADOOP-1876)
> ------------------------------------------------------------
>
>                 Key: HADOOP-3245
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3245
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Devaraj Das
>            Assignee: Amar Kamat
>         Attachments: HADOOP-3245-v2.5.patch, HADOOP-3245-v2.6.5.patch, HADOOP-3245-v2.6.9.patch, HADOOP-3245-v4.1.patch
>
>
> This could probably extend the work done in HADOOP-1876. This feature can be applied for things like jobs being able to survive jobtracker restarts.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
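
To make the suggestion in the comment concrete, here is a minimal sketch of a job-state transaction log written against a pluggable storage interface. It is not taken from any of the attached patches and does not use the Hadoop API: LogStore, InMemoryStore and JobStateLog are hypothetical names, with LogStore standing in for the role the comment proposes for org.apache.hadoop.fs.FileSystem. The record format (job id, tab, state) is also an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class JobLogSketch {

    /** Hypothetical storage abstraction; stands in for a pluggable backend. */
    interface LogStore {
        void append(String record);   // add one record to the end of the log
        List<String> readAll();       // return all records in write order
    }

    /** In-memory store for demonstration only; it is not durable. */
    static final class InMemoryStore implements LogStore {
        private final List<String> records = new ArrayList<>();

        public void append(String record) {
            records.add(record);
        }

        public List<String> readAll() {
            return new ArrayList<>(records);
        }
    }

    /** Appends job state transitions to the log and rebuilds state by replay. */
    static final class JobStateLog {
        private final LogStore store;

        JobStateLog(LogStore store) {
            this.store = store;
        }

        /** Assumed record format: job id, a tab, then the new state. */
        void record(String jobId, String state) {
            store.append(jobId + "\t" + state);
        }

        /** Replays the log in order; the last record for a job wins. */
        Map<String, String> recover() {
            Map<String, String> latest = new LinkedHashMap<>();
            for (String line : store.readAll()) {
                String[] parts = line.split("\t", 2);
                latest.put(parts[0], parts[1]);
            }
            return latest;
        }
    }

    public static void main(String[] args) {
        LogStore store = new InMemoryStore();
        JobStateLog log = new JobStateLog(store);
        log.record("job_0001", "RUNNING");
        log.record("job_0002", "RUNNING");
        log.record("job_0001", "SUCCEEDED");

        // Simulate a restart: a fresh JobStateLog over the same store
        // rebuilds the per-job state from the persisted records.
        Map<String, String> recovered = new JobStateLog(store).recover();
        System.out.println(recovered);
    }
}
```

Because JobStateLog only sees the LogStore interface, swapping the in-memory store for one backed by local files, NFS or HDFS would not change the recording or recovery logic, which is the flexibility the comment argues for.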