Date: Wed, 20 Jul 2016 06:05:20 +0000 (UTC)
From: "Balu Vellanki (JIRA)"
To: dev@falcon.incubator.apache.org
Reply-To: dev@falcon.apache.org
Subject: [jira] [Updated] (FALCON-2090) HDFS Snapshot failed with UnknownHostException when scheduling in HA Mode

     [ https://issues.apache.org/jira/browse/FALCON-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Balu Vellanki updated FALCON-2090:
----------------------------------
    Fix Version/s: 0.10

> HDFS Snapshot failed with UnknownHostException when scheduling in HA Mode
> -------------------------------------------------------------------------
>
>                 Key: FALCON-2090
>                 URL: https://issues.apache.org/jira/browse/FALCON-2090
>             Project: Falcon
>          Issue Type: Bug
>          Components: replication
>    Affects Versions: trunk
>            Reporter: Murali Ramasami
>            Assignee: Balu Vellanki
>            Priority: Critical
>             Fix For: trunk
>
>
> In NN HA, when I schedule a hdfs snapshot replication, it is failing with "java.net.UnknownHostException: mycluster1". In the error message primary is the source cluster Nameservice. Please see the complete stack trace.
> Stack Trace:
> {noformat}
> Log Contents:
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/grid/0/hadoop/yarn/local/filecache/371/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/grid/0/hadoop/yarn/local/filecache/213/mapreduce.tar.gz/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> Error: java.lang.IllegalArgumentException: java.net.UnknownHostException: mycluster1
>     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:411)
>     at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:429)
>     at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initialize(WebHdfsFileSystem.java:207)
>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2730)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
>     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2764)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2746)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:385)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:178)
>     at org.apache.falcon.hive.util.EventUtils.initializeFS(EventUtils.java:145)
>     at org.apache.falcon.hive.mapreduce.CopyMapper.setup(CopyMapper.java:47)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> Caused by: java.net.UnknownHostException: mycluster1
>     ... 19 more
> {noformat}
> Steps to Reproduce:
> primaryCluster:
> ============
> {noformat}
>
> {noformat}
> falcon entity -submit -type cluster -file primaryCluster.xml --> primaryCluster
> backupCluster :
> ============
> {noformat}
>
> {noformat}
> falcon entity -submit -type cluster -file backupCluster.xml --> backupCluster
> HDFS Snapshot Replication:
> =========================
> Source:
> ======
> hdfs dfs -mkdir -p /tmp/falcon-regression/HDFSSnapshotTest/source
> hdfs dfs -put /grid/0/hadoopqe/tests/ha/falcon/combinedActions/mr_input/2015/01/02/NYSE-2000-2001.tsv /tmp/falcon-regression/HDFSSnapshotTest/source
> Create Snapshot :
> ===============
> hdfs dfsadmin -allowSnapshot /tmp/falcon-regression/HDFSSnapshotTest/source [ hdfs]
> hdfs dfs -createSnapshot /tmp/falcon-regression/HDFSSnapshotTest/source [ hrt_qa]
> hdfs lsSnapshottableDir [ hrt_qa]
> hdfs dfs -ls /tmp/falcon-regression/HDFSSnapshotTest/source/.snapshot
> Target:
> ======
> hdfs dfs -mkdir -p /tmp/falcon-regression/HDFSSnapshotTest/target
> hdfs dfsadmin -allowSnapshot /tmp/falcon-regression/HDFSSnapshotTest/target
> hdfs dfs -ls /tmp/falcon-regression/HDFSSnapshotTest/target/.snapshot
> hdfs-snapshot.properties
> ==========================
> {noformat}
> jobName=HDFSSnapshotTest
> jobClusterName=primaryCluster
> jobValidityStart=2016-05-09T06:25Z
> jobValidityEnd=2017-05-09T08:00Z
> jobFrequency=days(1)
> sourceCluster=primaryCluster
> sourceSnapshotDir=/tmp/falcon-regression/HDFSSnapshotTest/source
> sourceSnapshotRetentionAgeLimit=days(1)
> sourceSnapshotRetentionNumber=3
> targetCluster=backupCluster
> targetSnapshotDir=/tmp/falcon-regression/HDFSSnapshotTest/target
> targetSnapshotRetentionAgeLimit=days(1)
> targetSnapshotRetentionNumber=3
> jobAclOwner=hrt_qa
> jobAclGroup=users
> jobAclPermission="0x755"
> {noformat}
> falcon extension -extensionName hdfs-snapshot-mirroring -submitAndSchedule -file hdfs-snapshot.properties

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
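
Note on the trace above: an UnknownHostException on a logical nameservice such as mycluster1 commonly means the client-side Configuration seen by the task does not carry the HA settings for that nameservice, so the name is treated as a plain hostname. That reading is an assumption here; the report does not confirm the root cause. The sketch below is a hypothetical helper (missing_ha_keys is not part of Falcon or Hadoop) that lists which of the standard HDFS HA client keys are absent from a configuration, given as a plain dict, for the nameservice named in a URI.

```python
from urllib.parse import urlparse


def missing_ha_keys(conf, uri):
    """Return the HDFS HA client keys absent from conf for the nameservice in uri.

    conf is a plain dict of Hadoop property names to values (an assumption made
    for illustration; a real job would read these from hdfs-site.xml).
    """
    ns = urlparse(uri).hostname  # logical nameservice, e.g. "mycluster1"
    missing = []

    # The nameservice must be listed in dfs.nameservices.
    declared = [s.strip() for s in conf.get("dfs.nameservices", "").split(",") if s.strip()]
    if ns not in declared:
        missing.append("dfs.nameservices")

    # A failover proxy provider marks the URI as logical rather than a hostname.
    provider_key = f"dfs.client.failover.proxy.provider.{ns}"
    if provider_key not in conf:
        missing.append(provider_key)

    # Each NameNode ID of the nameservice needs an RPC address.
    # (WebHDFS access additionally relies on dfs.namenode.http-address.<ns>.<nn>,
    # which this sketch does not check.)
    ids_key = f"dfs.ha.namenodes.{ns}"
    nn_ids = [s.strip() for s in conf.get(ids_key, "").split(",") if s.strip()]
    if not nn_ids:
        missing.append(ids_key)
    for nn in nn_ids:
        addr_key = f"dfs.namenode.rpc-address.{ns}.{nn}"
        if addr_key not in conf:
            missing.append(addr_key)

    return missing
```

Running it against the configuration visible to the replication job, with a URI such as webhdfs://mycluster1/, would show whether the source nameservice is defined on the cluster where the mapper runs; an empty list means all checked keys are present.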