Date: Fri, 30 May 2014 16:07:02 +0000 (UTC)
From: "Venkatesh Seetharam (JIRA)"
To: dev@falcon.incubator.apache.org
Subject: [jira] [Commented] (FALCON-455) Replication of output feed of an HCatalog process not working

[ 
https://issues.apache.org/jira/browse/FALCON-455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013867#comment-14013867 ]

Venkatesh Seetharam commented on FALCON-455:
--------------------------------------------

[~satish.mittal], which version of Hive are you folks using?

> Replication of output feed of an HCatalog process not working
> -------------------------------------------------------------
>
>                 Key: FALCON-455
>                 URL: https://issues.apache.org/jira/browse/FALCON-455
>             Project: Falcon
>          Issue Type: Bug
>    Affects Versions: 0.5
>            Reporter: Satish Mittal
>         Attachments: hcat-in-feed.xml, hcat-out-feed.xml, hcat-process.xml, workflow.xml
>
>
> Suppose there is an HCatalog process (java type) that takes an HCat input feed and outputs another HCat feed. Further, this output feed is configured for replication across 2 clusters.
> The replication of the output feed fails during the Hive import step. The reason is that the HCat process job output on HDFS contains a '_logs' directory if the process writes to a static partition (or an empty '_temporary' directory if the process writes to a dynamic partition).
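To make the failure mode described above concrete, here is a minimal Python sketch. It is not Hive's implementation: it only mimics the kind of check suggested by the "has nested directory" error reported in this issue. The names `nested_directories`, `check_paths`, and `build_sample_partition` are hypothetical, and a local temporary directory stands in for the HDFS partition directory.

```python
import tempfile
from pathlib import Path
from typing import List


def nested_directories(partition_dir: Path) -> List[str]:
    """Return the sorted names of sub-directories directly under partition_dir."""
    return sorted(p.name for p in partition_dir.iterdir() if p.is_dir())


def check_paths(partition_dir: Path) -> None:
    """Hypothetical stand-in for the reported check: reject any nested directory."""
    nested = nested_directories(partition_dir)
    if nested:
        raise ValueError(f"{partition_dir} has nested directory {nested[0]}")


def build_sample_partition(root: Path, with_logs: bool) -> Path:
    """Create a partition directory shaped like the job output in this report."""
    part = root / "minute=25"
    part.mkdir(parents=True)
    (part / "_SUCCESS").touch()                   # plain marker file
    (part / "part-r-00000").write_text("row\n")   # plain data file
    if with_logs:
        (part / "_logs").mkdir()                  # the nested directory at issue
    return part


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        part = build_sample_partition(Path(tmp), with_logs=True)
        try:
            check_paths(part)
        except ValueError as exc:
            print(exc)
```

In this sketch, plain files such as `_SUCCESS` and `part-r-00000` pass the check, while a `_logs` sub-directory triggers the error. An empty `_temporary` directory would be rejected the same way.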
> The Hive import job logs contain the following error:
> {noformat}
> 9036 [main] INFO  org.apache.hadoop.hive.ql.Driver  - Starting command:
> import table table5 partition (minute='25',month='05',year='2014',hour='12',day='29') from 'hdfs://databusdev2.mkhoj.com:9000//projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-25/hcat-cluster2/data'
> 9036 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  -
> 9036 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  -
> 9036 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  -
> 9036 [main] INFO  org.apache.hadoop.hive.ql.exec.Task  - Copying data from hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-25/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=25 to hdfs://databusdev2.mkhoj.com:9000/tmp/hive-mapred/hive_2014-05-29_12-37-37_244_6437156794758917899-1/-ext-10000
> 9069 [main] INFO  org.apache.hadoop.hive.ql.exec.Task  - Copying file: hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-25/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=25/_SUCCESS
> 9096 [main] INFO  org.apache.hadoop.hive.ql.exec.Task  - Copying file: hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-25/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=25/_logs
> 9190 [main] INFO  org.apache.hadoop.hive.ql.exec.Task  - Copying file: hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-25/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=25/part-r-00000
> 9222 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  -
> 9580 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  -
> 9580 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  -
> 9581 [main] INFO  org.apache.hadoop.hive.ql.exec.Task  - Loading data to table default.table5 partition (day=29, hour=12, minute=25, month=05, year=2014) from hdfs://databusdev2.mkhoj.com:9000/tmp/hive-mapred/hive_2014-05-29_12-37-37_244_6437156794758917899-1/-ext-10000
> 9598 [main] INFO  org.apache.hadoop.hive.ql.exec.MoveTask  - Partition is: {day=29, hour=12, minute=25, month=05, year=2014}
> 9668 [main] ERROR org.apache.hadoop.hive.ql.exec.Task  - Failed with exception checkPaths: hdfs://databusdev2.mkhoj.com:9000/tmp/hive-mapred/hive_2014-05-29_12-37-37_244_6437156794758917899-1/-ext-10000 has nested directoryhdfs://databusdev2.mkhoj.com:9000/tmp/hive-mapred/hive_2014-05-29_12-37-37_244_6437156794758917899-1/-ext-10000/_logs
> org.apache.hadoop.hive.ql.metadata.HiveException: checkPaths: hdfs://databusdev2.mkhoj.com:9000/tmp/hive-mapred/hive_2014-05-29_12-37-37_244_6437156794758917899-1/-ext-10000 has nested directoryhdfs://databusdev2.mkhoj.com:9000/tmp/hive-mapred/hive_2014-05-29_12-37-37_244_6437156794758917899-1/-ext-10000/_logs
>     at org.apache.hadoop.hive.ql.metadata.Hive.checkPaths(Hive.java:2108)
>     at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2298)
>     at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1230)
>     at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:408)
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
>     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1532)
>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1305)
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1136)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:976)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:966)
>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:359)
>     at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:457)
>     at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:467)
>     at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:748)
>     at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
>     at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:318)
>     at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:279)
>     at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
>     at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:226)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
>     at org.apache.hadoop.mapred.Child.main(Child.java:260)
> 9668 [main] INFO  org.apache.hadoop.hive.ql.log.PerfLogger  -
> 9672 [main] ERROR org.apache.hadoop.hive.ql.Driver  - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
> {noformat}
> Apparently, Hive import rejects any nested directory in the import path. The same behavior can be reproduced from the Hive CLI:
> {noformat}
> hive> import table table5 partition (minute='32',month='05',year='2014',hour='12',day='29') from 'hdfs://databusdev2.mkhoj.com:9000//projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-32/hcat-cluster2/data'
>     > ;
> Copying data from hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-32/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=32
> Copying file: hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-32/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=32/_SUCCESS
> Copying file: hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-32/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=32/_logs
> Copying file: hdfs://databusdev2.mkhoj.com:9000/projects/falcon/hcolo2/staging/FALCON_FEED_REPLICATION_hcat-out6_hcat-cluster2/default/table5/year=2014/2014-05-29-12-32/hcat-cluster2/data/year=2014/month=05/day=29/hour=12/minute=32/part-r-00000
> Loading data to table default.table5 partition (day=29, hour=12, minute=32, month=05, year=2014)
> Failed with exception checkPaths: hdfs://databusdev2.mkhoj.com:9000/tmp/hive-hive/hive_2014-05-29_13-13-43_867_8757094482694632648-1/-ext-10000 has nested directoryhdfs://databusdev2.mkhoj.com:9000/tmp/hive-hive/hive_2014-05-29_13-13-43_867_8757094482694632648-1/-ext-10000/_logs
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
> hive>
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.2#6252)