falcon-dev mailing list archives

From "Murali Ramasami (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FALCON-2049) Feed Replication with Empty Directories are failing
Date Fri, 24 Jun 2016 05:49:16 GMT
Murali Ramasami created FALCON-2049:
---------------------------------------

             Summary: Feed Replication with Empty Directories are failing
                 Key: FALCON-2049
                 URL: https://issues.apache.org/jira/browse/FALCON-2049
             Project: Falcon
          Issue Type: Bug
          Components: feed
            Reporter: Murali Ramasami
            Priority: Critical


Feed replication with empty directories fails with the following error in the application
log:

{noformat}
2016-06-23 08:35:21,475 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler:
Moved tmp to done: hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/mr-history/tmp/hrt_qa/job_1466658266370_0059_conf.xml_tmp
to hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/mr-history/tmp/hrt_qa/job_1466658266370_0059_conf.xml
2016-06-23 08:35:21,476 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler:
Moved tmp to done: hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/mr-history/tmp/hrt_qa/job_1466658266370_0059-1466670911340-hrt_qa-distcp%3A+oozie%3Aaction%3AT%3Djava%3AW%3DFALCON_FEED_REPLICAT-1466670921283-0-0-FAILED-default-1466670920070.jhist_tmp
to hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/mr-history/tmp/hrt_qa/job_1466658266370_0059-1466670911340-hrt_qa-distcp%3A+oozie%3Aaction%3AT%3Djava%3AW%3DFALCON_FEED_REPLICAT-1466670921283-0-0-FAILED-default-1466670920070.jhist
2016-06-23 08:35:21,477 INFO [Thread-66] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler:
Stopped JobHistoryEventHandler. super.stop()
2016-06-23 08:35:21,479 INFO [Thread-66] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator:
Setting job diagnostics to No of maps and reduces are 0 job_1466658266370_0059
Job commit failed: org.apache.hadoop.tools.CopyListing$InvalidInputException: hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/tmp/falcon-regression/FeedReplicationTest/target/2016/06/23/08/32
doesn't exist
        at org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:84)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
        at org.apache.hadoop.tools.mapred.CopyCommitter.deleteMissing(CopyCommitter.java:241)
        at org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:94)
        at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:285)
        at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:237)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)


{noformat}

Feed submitted:
{noformat}
<?xml version="1.0" encoding="UTF-8"?><feed xmlns="uri:falcon:feed:0.1" name="A7769e4e0-49663d60"
description="Input File">
    <partitions>
        <partition name="colo"/>
        <partition name="eventTime"/>
        <partition name="impressionHour"/>
        <partition name="pricingModel"/>
    </partitions>
    <availabilityFlag>availabilityFlag.txt</availabilityFlag>
    <frequency>minutes(5)</frequency>
    <late-arrival cut-off="days(100000)"/>
    <clusters>
        <cluster name="A7769e4e0-0af6c74b" type="source">
            <validity start="2016-06-23T08:32Z" end="2016-06-23T08:37Z"/>
            <retention limit="days(1000000)" action="delete"/>
        </cluster>
        <cluster name="A7769e4e0-25f87f0e" type="target">
            <validity start="2016-06-23T08:32Z" end="2016-06-23T08:37Z"/>
            <retention limit="days(1000000)" action="delete"/>
            <locations>
                <location type="data" path="/tmp/falcon-regression/FeedReplicationTest/target/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}"/>
            </locations>
        </cluster>
    </clusters>
    <locations>
        <location type="data" path="/tmp/falcon-regression/FeedReplicationTest/source/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}"/>
        <location type="stats" path="/data/regression/fetlrc/billing/stats"/>
        <location type="meta" path="/data/regression/fetlrc/billing/metadata"/>
    </locations>
    <ACL owner="hrt_qa" group="users" permission="0x755"/>
    <schema location="/databus/streams_local/click_rr/schema/" provider="protobuf"/>
    <properties>
        <property name="field1" value="value1"/>
        <property name="field2" value="value2"/>
        <property name="job.counter" value="true"/>
    </properties>
</feed>
{noformat}

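The job fails because the target directory does not exist at replication time. The stack
trace shows the exception being raised from CopyCommitter.deleteMissing, which builds a
copy listing of the target path during job commit.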
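
To illustrate the missing-path condition, here is a minimal Java sketch. It is not the fix for this issue. It expands the feed's target location template for the instance time 2016-06-23T08:32Z, which yields the path reported in the error above, and then makes sure that directory exists. The class and method names (TargetDirGuard, resolveInstancePath, ensureDirectory) are hypothetical, and java.nio.file on a local temp directory stands in for HDFS. On a real cluster the corresponding calls would presumably be FileSystem.exists and FileSystem.mkdirs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of Falcon or Hadoop.
public class TargetDirGuard {

    // Expands ${YEAR}, ${MONTH}, ${DAY}, ${HOUR}, ${MINUTE} in a feed location
    // template for one feed instance time, interpreted in UTC.
    public static String resolveInstancePath(String template, Instant instanceTime) {
        ZonedDateTime t = instanceTime.atZone(ZoneOffset.UTC);
        return template
                .replace("${YEAR}", String.format(Locale.ROOT, "%04d", t.getYear()))
                .replace("${MONTH}", String.format(Locale.ROOT, "%02d", t.getMonthValue()))
                .replace("${DAY}", String.format(Locale.ROOT, "%02d", t.getDayOfMonth()))
                .replace("${HOUR}", String.format(Locale.ROOT, "%02d", t.getHour()))
                .replace("${MINUTE}", String.format(Locale.ROOT, "%02d", t.getMinute()));
    }

    // Creates the directory (and any missing parents) if it is absent.
    // Returns true when the directory had to be created, false when it already existed.
    public static boolean ensureDirectory(Path dir) throws IOException {
        if (Files.isDirectory(dir)) {
            return false;
        }
        Files.createDirectories(dir);
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Target location template taken from the submitted feed.
        String template = "/tmp/falcon-regression/FeedReplicationTest/target/"
                + "${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}";
        String resolved = resolveInstancePath(template, Instant.parse("2016-06-23T08:32:00Z"));
        System.out.println(resolved);

        // Local sandbox standing in for the target cluster's file system (assumption).
        Path sandbox = Files.createTempDirectory("falcon-2049");
        Path target = sandbox.resolve(resolved.substring(1));
        System.out.println("created: " + ensureDirectory(target));
        System.out.println("created: " + ensureDirectory(target));
    }
}
```

Run as is, the first line printed is the same target path that appears in the InvalidInputException message. The first ensureDirectory call creates the directory and the second finds it already present. Whether the target directory should be pre-created or the commit step should tolerate its absence is a design question for the actual fix.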

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
