From: "Sameer Paranjpye (JIRA)"
To: hadoop-dev@lucene.apache.org
Date: Fri, 29 Sep 2006 14:27:20 -0700 (PDT)
Subject: [jira] Commented: (HADOOP-563) DFS client should try to re-new lease if it gets a lease expiration exception when it adds a block to a file
Message-ID: <15956846.1159565240811.JavaMail.root@brutus>
In-Reply-To: <28125744.1159383470163.JavaMail.jira@brutus>

    [ http://issues.apache.org/jira/browse/HADOOP-563?page=comments#action_12438813 ]

Sameer Paranjpye commented on
HADOOP-563:
-----------------------------------------

+1 for the losable (maybe we should call them stale?) leases proposal

> DFS client should try to re-new lease if it gets a lease expiration exception when it adds a block to a file
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-563
>                 URL: http://issues.apache.org/jira/browse/HADOOP-563
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Runping Qi
>
> In the current DFS client implementation, there is one thread responsible for renewing leases. If, for whatever reason, that thread falls behind, the lease may expire. The client then gets a lease expiration exception when writing a block. The consequence is severe: the client can no longer write to the file, and all the partial results up to that point are gone! This is especially costly for some map reduce jobs where a reducer may take hours or even days to sort the intermediate results before the actual reducing work can start.
> The problem will be solved if the flush method of the DFS client can renew the lease on demand. That is, it should try to renew the lease when it catches a lease expiration exception. That way, even under heavy load when the lease renewing thread falls behind, the reducer task (or whatever task uses that client) can proceed. That will be a huge saving in some cases (where sorting intermediate results takes a long time to finish). We can set a limit on the number of retries, and may even make it configurable (or changeable at runtime).
> The namenode can use a different expiration time, much higher than the current 1 minute lease expiration time, for cleaning up abandoned unclosed files.
>
--
This message is automatically generated by JIRA.
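The renew-on-demand idea in the quoted issue description could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual DFS client code: the class name, the NameNodeStub interface, the LeaseExpiredException type and the maxRetries parameter are all hypothetical, and the sketch assumes the namenode still accepts a renewal for a lease that has passed the short expiration time but has not yet been cleaned up.

```java
import java.io.IOException;

/** Hypothetical sketch of renewing a lease on demand; not the real DFS client. */
class LeaseRenewingWriter {

    /** Hypothetical stand-in for the lease expiration error described in the issue. */
    public static class LeaseExpiredException extends IOException {
        public LeaseExpiredException(String message) {
            super(message);
        }
    }

    /** Hypothetical view of the two namenode calls this sketch needs. */
    public interface NameNodeStub {
        void addBlock(String path) throws IOException;
        void renewLease(String clientName) throws IOException;
    }

    private final NameNodeStub namenode;
    private final String clientName;
    private final int maxRetries; // assumed to come from configuration

    public LeaseRenewingWriter(NameNodeStub namenode, String clientName, int maxRetries) {
        this.namenode = namenode;
        this.clientName = clientName;
        this.maxRetries = maxRetries;
    }

    /**
     * Tries to add a block. On a lease expiration it renews the lease and
     * retries, giving up after maxRetries renewals.
     */
    public void addBlockWithRenewal(String path) throws IOException {
        int renewals = 0;
        while (true) {
            try {
                namenode.addBlock(path);
                return;
            } catch (LeaseExpiredException e) {
                if (renewals >= maxRetries) {
                    throw e; // retry budget exhausted, surface the original error
                }
                renewals++;
                namenode.renewLease(clientName);
            }
        }
    }
}
```

Bounding the number of renewals keeps a client from looping forever when the lease really has been reclaimed, which matches the retry limit suggested in the description.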