Date: Thu, 25 Oct 2012 19:23:03 +0100
Subject: Re: HDFS HA IO Fencing
From: Steve Loughran
To: user@hadoop.apache.org

On 25 October 2012 14:08, Todd Lipcon <todd@cloudera.com> wrote:

> Hi Liu,
>
> Locks are not sufficient, because there is no way to enforce a lock in a
> distributed system without unbounded blocking. What you might be referring
> to is a lease, but leases are still problematic unless you can put bounds
> on the speed with which clocks progress on different machines, _and_ have
> strict guarantees on the way each node's scheduler works. With Linux and
> Java, the latter is tough.


On any OS running in any virtual environment, including EC2, time is entirely unpredictable, just to make things worse.


On a single machine you can use file locking, because the OS knows when the process is dead and closes the file. Other programs can attempt to open the same file with exclusive locking and, by getting the right failures, know that something else has the file, and hence that the other process is live. Shared NFS storage needs to be mounted with softlock set, precisely to stop file locks lasting until some lease has expired, because the on-host liveness probes detect failure faster and want to react to it.
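
To make the single-machine case concrete, here is a minimal Python sketch (Unix only, since it relies on `fcntl.flock`). The function name `try_exclusive_lock` and the file name `service.lock` are made up for illustration, and none of this is taken from Hadoop's code. A contender that fails to get the lock can infer that some live handle still has the file; closing the handle stands in for what the kernel does when the owning process exits.

```python
import fcntl
import os
import tempfile


def try_exclusive_lock(path):
    """Try to take an exclusive, non-blocking lock on `path`.

    Returns the open file object (which holds the lock) on success,
    or None if another open handle already holds the lock.
    Hypothetical helper, for illustration only.
    """
    handle = open(path, "a+")
    try:
        # LOCK_NB makes the call fail immediately instead of blocking.
        fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # "The right failure": someone else has the file, so they are live.
        handle.close()
        return None
    return handle


def demo():
    # Hypothetical lock file in a fresh temporary directory.
    lock_path = os.path.join(tempfile.mkdtemp(), "service.lock")

    owner = try_exclusive_lock(lock_path)      # first holder gets the lock
    contender = try_exclusive_lock(lock_path)  # second attempt is refused

    # Closing the handle releases the lock, much as the kernel does
    # when the owning process dies and its files are closed.
    owner.close()
    successor = try_exclusive_lock(lock_path)  # now the lock is free again

    result = (owner is not None, contender is None, successor is not None)
    if successor is not None:
        successor.close()
    return result
```

This only covers a local filesystem on one host; it says nothing about how locks behave over NFS, which depends on the mount options discussed above.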


-Steve