falcon-dev mailing list archives

From "Karishma Gulati (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (FALCON-623) HCat replication fails on table-export
Date Mon, 25 Aug 2014 07:57:58 GMT

    [ https://issues.apache.org/jira/browse/FALCON-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108863#comment-14108863 ]

Karishma Gulati edited comment on FALCON-623 at 8/25/14 7:57 AM:
-----------------------------------------------------------------

I don't have the original XMLs. I re-ran the same test and got the same error logs. Attaching
the source/target cluster XMLs and the feed XML for this test.

Source cluster:
{code:xml}
<?xml version="1.0" encoding="UTF-8"?><cluster xmlns="uri:falcon:cluster:0.1" name="corp-6d9f1878"
description="" colo="ua1">
    <interfaces>
        <interface type="readonly" endpoint="hdfs://192.168.138.137:8020" version="0.20.2"/>
        <interface type="write" endpoint="hdfs://192.168.138.137:8020" version="0.20.2"/>
        <interface type="execute" endpoint="192.168.138.137:8021" version="0.20.2"/>
        <interface type="workflow" endpoint="http://192.168.138.137:11000/oozie/" version="3.1"/>
        <interface type="messaging" endpoint="tcp://localhost:61617?daemon=true" version="5.1.6"/>
        <interface type="registry" endpoint="thrift://192.168.138.137:14000" version="0.11.0"/>
    </interfaces>
    <locations>
        <location name="staging" path="/projects/ivory/staging"/>
        <location name="temp" path="/tmp"/>
        <location name="working" path="/projectsTest/ivory/working"/>
    </locations>
    <properties>
        <property name="hive.metastore.client.socket.timeout" value="120"/>
        <property name="field1" value="value1"/>
        <property name="field2" value="value2"/>
    </properties>
</cluster>
{code}

Target cluster:
{code:xml}
<?xml version="1.0" encoding="UTF-8"?><cluster xmlns="uri:falcon:cluster:0.1" name="corp-0cd10609"
description="" colo="ua2">
    <interfaces>
        <interface type="readonly" endpoint="hdfs://192.168.138.139:8020" version="0.20.2"/>
        <interface type="write" endpoint="hdfs://192.168.138.139:8020" version="0.20.2"/>
        <interface type="execute" endpoint="192.168.138.139:8021" version="0.20.2"/>
        <interface type="workflow" endpoint="http://192.168.138.139:11000/oozie/" version="3.1"/>
        <interface type="messaging" endpoint="tcp://localhost:61617?daemon=true" version="5.1.6"/>
        <interface type="registry" endpoint="thrift://192.168.138.139:14000" version="0.11.0"/>
    </interfaces>
    <locations>
        <location name="staging" path="/projects/ivory/staging"/>
        <location name="temp" path="/tmp"/>
        <location name="working" path="/projectsTest/ivory/working"/>
    </locations>
    <properties>
        <property name="hive.metastore.client.socket.timeout" value="120"/>
        <property name="field1" value="value1"/>
        <property name="field2" value="value2"/>
    </properties>
</cluster>
{code}

Feed XML:
{code:xml}
<?xml version="1.0" encoding="UTF-8"?><feed xmlns="uri:falcon:feed:0.1" name="raaw-logs16-69d5f138"
description="clicks log">
    <frequency>hours(1)</frequency>
    <timezone>UTC</timezone>
    <late-arrival cut-off="hours(6)"/>
    <clusters>
        <cluster name="corp-6d9f1878" type="source">
            <validity start="2010-01-01T20:00Z" end="2099-01-01T00:00Z"/>
            <retention limit="months(9000)" action="delete"/>
        </cluster>
        <cluster name="corp-0cd10609" type="target">
            <validity start="2010-01-01T20:00Z" end="2099-01-01T00:00Z"/>
            <retention limit="months(9000)" action="delete"/>
            <table uri="catalog:default:HCatReplication_oneSourceOneTarget_hyphen#dt=${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
        </cluster>
    </clusters>
    <table uri="catalog:default:HCatReplication_oneSourceOneTarget_hyphen#dt=${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
    <ACL owner="karishma" group="default" permission="0x755"/>
    <schema location="hcat" provider="hcat"/>
    <properties>
        <property name="field1" value="value1"/>
        <property name="field2" value="value2"/>
    </properties>
</feed>
{code}
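
The table URI above carries the HCatalog partition expression dt=${YEAR}-${MONTH}-${DAY}-${HOUR}, which is resolved per feed instance. Purely as an illustration of those semantics (this is not Falcon's own resolution code), a minimal sketch:
{code:java}
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Illustration only: shows how the dt=${YEAR}-${MONTH}-${DAY}-${HOUR} partition
// expression in the feed's table URI maps to a concrete HCatalog partition for a
// given feed instance time. This is NOT Falcon's own resolution code.
public class PartitionExprExample {

    static String resolve(String template, ZonedDateTime instance) {
        return template
                .replace("${YEAR}", String.format("%04d", instance.getYear()))
                .replace("${MONTH}", String.format("%02d", instance.getMonthValue()))
                .replace("${DAY}", String.format("%02d", instance.getDayOfMonth()))
                .replace("${HOUR}", String.format("%02d", instance.getHour()));
    }

    public static void main(String[] args) {
        String partition = "dt=${YEAR}-${MONTH}-${DAY}-${HOUR}";
        // Hypothetical instance time matching the feed's hourly frequency (UTC).
        ZonedDateTime instance = ZonedDateTime.of(2014, 8, 20, 11, 0, 0, 0, ZoneOffset.UTC);
        // Prints: dt=2014-08-20-11
        System.out.println(resolve(partition, instance));
    }
}
{code}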

oozie-site.xml -- the property was changed as requested:
{code:xml}
<property>
        <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
        <value>*=hadoop-conf,192.168.138.137:8021=/home/users/dataqa/srcconf,192.168.138.139:8021=/etc/hadoop/conf,192.168.138.137:8020=/home/users/dataqa/srcconf,192.168.138.139:8020=/etc/hadoop/conf</value>
        <description>
            Comma separated AUTHORITY=HADOOP_CONF_DIR, where AUTHORITY is the HOST:PORT of
            the Hadoop service (JobTracker, HDFS). The wildcard '*' configuration is
            used when there is no exact match for an authority. The HADOOP_CONF_DIR contains
            the relevant Hadoop *-site.xml files. If the path is relative is looked within
            the Oozie configuration directory; though the path can be absolute (i.e. to point
            to Hadoop client conf/ directories in the local filesystem).
        </description>
    </property>
{code}

where 8021 is the JobTracker (JT) port and 8020 is the NameNode (NN) port.
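
To make the lookup semantics concrete, here is a simplified sketch of how such a comma-separated AUTHORITY=HADOOP_CONF_DIR value is matched against a JobTracker/NameNode authority, with '*' as the fallback. This is an illustration only, not Oozie's actual HadoopAccessorService implementation:
{code:java}
import java.util.HashMap;
import java.util.Map;

// Simplified sketch of the documented semantics of
// oozie.service.HadoopAccessorService.hadoop.configurations: each HOST:PORT
// authority maps to a Hadoop conf dir, and '*' is used when there is no exact
// match. This is NOT Oozie's actual HadoopAccessorService code.
public class HadoopConfLookupExample {

    static Map<String, String> parse(String value) {
        Map<String, String> mapping = new HashMap<>();
        for (String entry : value.split(",")) {
            String[] kv = entry.trim().split("=", 2);
            mapping.put(kv[0], kv[1]);
        }
        return mapping;
    }

    static String confDirFor(Map<String, String> mapping, String authority) {
        String dir = mapping.get(authority);
        return dir != null ? dir : mapping.get("*");
    }

    public static void main(String[] args) {
        String value = "*=hadoop-conf,"
                + "192.168.138.137:8021=/home/users/dataqa/srcconf,"
                + "192.168.138.139:8021=/etc/hadoop/conf,"
                + "192.168.138.137:8020=/home/users/dataqa/srcconf,"
                + "192.168.138.139:8020=/etc/hadoop/conf";
        Map<String, String> mapping = parse(value);
        System.out.println(confDirFor(mapping, "192.168.138.137:8020")); // /home/users/dataqa/srcconf
        System.out.println(confDirFor(mapping, "192.168.138.140:8020")); // hadoop-conf (wildcard fallback)
    }
}
{code}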



> HCat replication fails on table-export
> --------------------------------------
>
>                 Key: FALCON-623
>                 URL: https://issues.apache.org/jira/browse/FALCON-623
>             Project: Falcon
>          Issue Type: Bug
>          Components: replication
>         Environment: QA
>            Reporter: Karishma Gulati
>
> On scheduling a one-source, one-target HCat Replication job, table export fails with the error message:
> {code}
> JA008: File does not exist: /projects/ivory/staging/falcon/workflows/feed/raaw-logs16-73741e09/1373320570ef25b7d7c1ee474f1f0428_1408529998170/lib/falcon-client-0.6-incubating-SNAPSHOT.jar
> {code}
> Oozie stack trace: 
> {code}
> 2014-08-20 11:13:01,477 ERROR pool-2-thread-9 UserGroupInformation - SERVER[ip-192-168-138-139] PriviledgedActionException as:karishma (auth:PROXY) via oozie (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /projects/ivory/staging/falcon/workflows/feed/raaw-logs16-105f5895/bfed9c56081276857ce86136475fc7da_1408530730861/lib/falcon-client-0.6-incubating-SNAPSHOT.jar
> 2014-08-20 11:13:01,585  WARN pool-2-thread-9 ActionStartXCommand - SERVER[ip-192-168-138-139] USER[karishma] GROUP[-] TOKEN[] APP[FALCON_FEED_REPLICATION_raaw-logs16-105f5895] JOB[0000078-140813072435213-oozie-oozi-W] ACTION[0000078-140813072435213-oozie-oozi-W@table-export] Error starting action [table-export]. ErrorType [ERROR], ErrorCode [JA008], Message [JA008: File does not exist: /projects/ivory/staging/falcon/workflows/feed/raaw-logs16-105f5895/bfed9c56081276857ce86136475fc7da_1408530730861/lib/falcon-client-0.6-incubating-SNAPSHOT.jar]
> org.apache.oozie.action.ActionExecutorException: JA008: File does not exist: /projects/ivory/staging/falcon/workflows/feed/raaw-logs16-105f5895/bfed9c56081276857ce86136475fc7da_1408530730861/lib/falcon-client-0.6-incubating-SNAPSHOT.jar
>         at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:412)
>         at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:396)
>         at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:930)
>         at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1085)
>         at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:228)
>         at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63)
>         at org.apache.oozie.command.XCommand.call(XCommand.java:283)
>         at org.apache.oozie.command.XCommand.call(XCommand.java:352)
>         at org.apache.oozie.command.wf.SignalXCommand.execute(SignalXCommand.java:395)
>         at org.apache.oozie.command.wf.SignalXCommand.execute(SignalXCommand.java:73)
>         at org.apache.oozie.command.XCommand.call(XCommand.java:283)
>         at org.apache.oozie.command.XCommand.call(XCommand.java:352)
>         at org.apache.oozie.command.wf.ActionEndXCommand.execute(ActionEndXCommand.java:273)
>         at org.apache.oozie.command.wf.ActionEndXCommand.execute(ActionEndXCommand.java:60)
>         at org.apache.oozie.command.XCommand.call(XCommand.java:283)
>         at org.apache.oozie.command.XCommand.call(XCommand.java:352)
>         at org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:241)
>         at org.apache.oozie.command.wf.ActionCheckXCommand.execute(ActionCheckXCommand.java:55)
>         at org.apache.oozie.command.XCommand.call(XCommand.java:283)
>         at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:701)
> Caused by: java.io.FileNotFoundException: File does not exist: /projects/ivory/staging/falcon/workflows/feed/raaw-logs16-105f5895/bfed9c56081276857ce86136475fc7da_1408530730861/lib/falcon-client-0.6-incubating-SNAPSHOT.jar
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:824)
>         at org.apache.hadoop.filecache.DistributedCache.getFileStatus(DistributedCache.java:185)
>         at org.apache.hadoop.filecache.TrackerDistributedCacheManager.determineTimestamps(TrackerDistributedCacheManager.java:821)
>         at org.apache.hadoop.filecache.TrackerDistributedCacheManager.determineTimestampsAndCacheVisibilities(TrackerDistributedCacheManager.java:778)
>         at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:852)
>         at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:743)
>         at org.apache.hadoop.mapred.JobClient.access$400(JobClient.java:174)
>         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:960)
>         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:416)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:919)
>         at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:915)
>         ... 20 more
> {code}
> I set up Falcon in distributed mode, using different clusters for source and target.




--
This message was sent by Atlassian JIRA
(v6.2#6252)
