falcon-dev mailing list archives

From "Venkatesh Seetharam (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-389) Submit Hcat export workflow to oozie on source cluster rather than to oozie on destination cluster
Date Wed, 30 Apr 2014 17:24:17 GMT

    [ https://issues.apache.org/jira/browse/FALCON-389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985774#comment-13985774 ]

Venkatesh Seetharam commented on FALCON-389:
--------------------------------------------

[~shwethags], if you look at the replication workflow template, the flow is quite straightforward.
Table replication involves the following steps:

* Export data and metadata from the source Hive to source HDFS staging. This is a Hive action
in Oozie, but it launches an MR job in the source cluster, which is where this issue crops
up. Had it not been an MR job, things would have been simpler.
* Replicate the staged data from source to target HDFS using distcp.
* Import data and metadata into the target Hive from target HDFS staging. This is again an MR
job.
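
To make the problem concrete, here is a rough Python sketch of the three steps above and of where each launched job runs relative to the Oozie server that receives the workflow. It is not Falcon or Oozie code: `Stage`, `table_replication_plan`, `cross_cluster_stages`, and the cluster names are hypothetical, and two details are assumptions not stated above, namely that the distcp job runs on the target cluster and that the import step is also a Hive action.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    name: str      # hypothetical step label
    action: str    # kind of Oozie action (illustrative only)
    runs_on: str   # cluster where the launched job executes


def table_replication_plan(source: str, target: str) -> List[Stage]:
    """Model the three table replication steps as plain data."""
    return [
        # Hive action that launches an MR job in the source cluster.
        Stage("export", "hive", source),
        # Assumption: the distcp job itself runs on the target cluster.
        Stage("replicate", "distcp", target),
        # Again an MR job; assumed here to be a Hive action on the target.
        Stage("import", "hive", target),
    ]


def cross_cluster_stages(plan: List[Stage], oozie_cluster: str) -> List[str]:
    """Names of steps whose job runs on a cluster other than Oozie's own."""
    return [stage.name for stage in plan if stage.runs_on != oozie_cluster]


plan = table_replication_plan("source-cluster", "target-cluster")
# Whole workflow submitted to Oozie on the target cluster, as in this issue.
print(cross_cluster_stages(plan, "target-cluster"))  # ['export']
```

Under these assumptions, only the export step has to launch a job on a cluster other than the one hosting Oozie, which matches the step called out above.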

Makes sense?

> Submit Hcat export workflow to oozie on source cluster rather than to oozie on destination cluster
> --------------------------------------------------------------------------------------------------
>
>                 Key: FALCON-389
>                 URL: https://issues.apache.org/jira/browse/FALCON-389
>             Project: Falcon
>          Issue Type: Improvement
>    Affects Versions: 0.4
>            Reporter: Arpit Gupta
>
> Noticed this on hadoop-2 with oozie 4.x: when you run an hcat replication job where the
> source and destination clusters are different, all jobs are submitted to oozie on the destination
> cluster. Oozie then runs a table export job that it submits to the RM on cluster 1.
> Now if the oozie server on the target cluster is not running with all hadoop configs,
> it will not know all the appropriate hadoop configs and the yarn job will fail. We saw jobs fail
> with errors like
> org.apache.hadoop.security.token.SecretManager$InvalidToken: Password not found for ApplicationAttempt appattempt_1395965672651_0010_000002
> on an unsecure cluster as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
