From: Harsh J
Date: Mon, 1 Apr 2013 10:42:27 +0530
Subject: Re: Exception with QJM HDFS HA
To: hdfs-dev@hadoop.apache.org

A JIRA was posted by Azuryy for this at
https://issues.apache.org/jira/browse/HDFS-4654.

On Mon, Apr 1, 2013 at 10:40 AM, Todd Lipcon wrote:
> This looks like a bug with the new inode ID code in trunk, rather than a
> bug with QJM or HA.
>
> Suresh/Brandon, any thoughts?
>
> -Todd
>
> On Sun, Mar 31, 2013 at 6:43 PM, Azuryy Yu wrote:
>
>> Hi All,
>>
>> I configured HDFS HA using source code from trunk r1463074.
>>
>> I got the following exception when I put a file to HDFS:
>>
>> 13/04/01 09:33:45 WARN retry.RetryInvocationHandler: Exception while
>> invoking addBlock of class ClientNamenodeProtocolTranslatorPB. Trying to
>> fail over immediately.
>> 13/04/01 09:33:45 WARN hdfs.DFSClient: DataStreamer Exception
>> java.io.FileNotFoundException: ID mismatch.
>> Request id and saved id: 1073, 1050
>>   at org.apache.hadoop.hdfs.server.namenode.INodeId.checkId(INodeId.java:51)
>>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2501)
>>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2298)
>>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2212)
>>   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:498)
>>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:356)
>>   at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:40979)
>>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:526)
>>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1018)
>>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1818)
>>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1814)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at javax.security.auth.Subject.doAs(Subject.java:415)
>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1489)
>>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1812)
>>
>> To reproduce:
>>
>> hdfs dfs -put test.data /user/data/test.data
>>
>> After this command starts running, kill the active NameNode process.
>>
>> I have only three nodes (A, B, C) for this test:
>> A and B are NameNodes.
>> B and C are DataNodes.
>> ZK is deployed on A, B and C.
>> A, B and C are all JournalNodes.
>>
>> Thanks.
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera

--
Harsh J
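[Editor's note] The stack trace in the thread above bottoms out in INodeId.checkId: the NameNode compares the inode ID carried by the client's retried addBlock request against the ID it has saved for the file, and rejects the request on mismatch. A minimal, self-contained sketch of that kind of check (the method shape and class name here are illustrative assumptions, not the actual HDFS source):

```java
import java.io.FileNotFoundException;

// Simplified illustration of an inode-ID consistency check, assumed to
// resemble what INodeId.checkId does in the trace above: if the ID the
// client sent does not match the ID the server has saved, the request
// is rejected with a FileNotFoundException.
public class IdCheckSketch {

    static void checkId(long requestId, long savedId) throws FileNotFoundException {
        if (requestId != savedId) {
            throw new FileNotFoundException(
                "ID mismatch. Request id and saved id: " + requestId + ", " + savedId);
        }
    }

    public static void main(String[] args) throws FileNotFoundException {
        // Matching IDs pass silently.
        checkId(1050L, 1050L);

        // The mismatch reported in the thread (request 1073 vs saved 1050).
        try {
            checkId(1073L, 1050L);
        } catch (FileNotFoundException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the reported scenario the mismatch appears after a forced failover mid-write, which is consistent with Todd's diagnosis that the new inode ID code in trunk, not QJM or HA itself, is at fault (tracked as HDFS-4654).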