falcon-dev mailing list archives

From Sunil N Kumar <suniln.ku...@impetus.co.in>
Subject RE: Hive Tables as Falcon Feed Input Error
Date Wed, 16 Apr 2014 12:25:48 GMT
Hi Shwetha,
I created an Oozie workflow for Hive using the Hive action, and it works fine. Sample Oozie
workflow:
<workflow-app xmlns="uri:oozie:workflow:0.1" name="hivesample">
    <start to="hivesample"/>

    <action name="hivesample">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <job-xml>${nameNode}/falcon/sample/hive-site.xml</job-xml>
            <script>${nameNode}/falcon/sample/script.hql</script>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

I am not sure why it stays in the WAITING state in Oozie when I run it as a Falcon process.
The Falcon entity details are below.
Cluster Entity:

<cluster colo="impetus" description="Standalone cluster" name="demoClsHive" xmlns="uri:falcon:cluster:0.1">
    <interfaces>
        <interface type="readonly" endpoint="hftp://Impetus-942.impetus.co.in:50070" version="1.2.1"/>

        <interface type="write" endpoint="hdfs://Impetus-942.impetus.co.in:9000" version="1.2.1"/>

        <interface type="execute" endpoint="Impetus-942.impetus.co.in:9001" version="1.2.1"/>

        <interface type="registry" endpoint="thrift://Impetus-942.impetus.co.in:9083" version="0.11.0"/>

        <interface type="workflow" endpoint="http://Impetus-942.impetus.co.in:11000/oozie/" version="4.0.0"/>

        <interface type="messaging" endpoint="tcp://Impetus-942.impetus.co.in:61616?daemon=true" version="5.4.3"/>
    </interfaces>
    <locations>
        <location name="staging" path="/falcon/staging"/>
        <location name="temp" path="/falcon/temp"/>
        <location name="working" path="/falcon/working"/>
    </locations>
    <properties>
    </properties>
</cluster>

Input Feed:

<feed description="Sample table " name="reservation" xmlns="uri:falcon:feed:0.1">
    <groups>sample,ac</groups>
    <frequency>hours(1)</frequency>
    <timezone>UTC</timezone>
    <clusters>
        <cluster name="demoClsHive">
            <validity start="2014-04-01T00:00Z" end="2014-05-21T00:00Z"/>
            <retention limit="hours(24)" action="delete"/>
        </cluster>
    </clusters>

    <table uri="catalog:default:reservation#gender='Female';gender='Male'" />

    <ACL owner="cloud" group="cloud" permission="0x755"/>
    <schema location="hcat" provider="hcat"/>
</feed>
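
For anyone checking a table URI before submitting a feed, here is a minimal sketch in Python. It is not Falcon's own parser. It assumes the form catalog:<database>:<table>#<key>=<value>;<key>=<value>, which is inferred from the "database:table" error message and the feeds in this thread. The function names are hypothetical.

```python
from collections import Counter


def parse_table_uri(uri):
    """Split a Falcon-style table URI into (database, table, partitions).

    Assumed format (inferred from this thread, not taken from Falcon's code):
        catalog:<database>:<table>#<key>=<value>;<key>=<value>
    """
    prefix = "catalog:"
    if not uri.startswith(prefix):
        raise ValueError("expected URI to start with 'catalog:'")

    # Everything before '#' must be exactly database:table.
    path, _, partition_spec = uri[len(prefix):].partition("#")
    parts = path.split(":")
    if len(parts) != 2 or not all(parts):
        raise ValueError("expected database:table before '#', got: " + path)
    database, table = parts

    # Partition pairs are separated by ';' and written as key=value.
    partitions = []
    for pair in filter(None, partition_spec.split(";")):
        key, eq, value = pair.partition("=")
        if not eq or not key:
            raise ValueError("expected key=value, got: " + pair)
        partitions.append((key, value))
    return database, table, partitions


def duplicate_keys(partitions):
    """Return partition keys that appear more than once, sorted."""
    counts = Counter(key for key, _ in partitions)
    return sorted(key for key, n in counts.items() if n > 1)


if __name__ == "__main__":
    uri = "catalog:default:reservation#gender='Female';gender='Male'"
    database, table, partitions = parse_table_uri(uri)
    print(database, table, partitions)
    print("duplicate keys:", duplicate_keys(partitions))
```

Run against the input feed above, the sketch reports the gender key twice and keeps the single quotes as part of each value. The Oozie log quoted later in this thread shows the dependency as gender='Female', quotes included, so it may be worth comparing that against how the partition is actually stored in HCat. The URI from the first message in the thread (':' instead of '#' before the partition) is rejected by the same check.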

Output Feed:
<?xml version="1.0" encoding="UTF-8"?>
<feed description="clicks log identity table" name="reservationmale" xmlns="uri:falcon:feed:0.1">
    <groups>sample,ac</groups>
    <frequency>hours(1)</frequency>
    <timezone>UTC</timezone>
    <clusters>
        <cluster name="demoClsHive">
            <validity start="2014-04-01T00:00Z" end="2014-05-21T00:00Z"/>
            <retention limit="hours(24)" action="delete"/>
        </cluster>
    </clusters>

    <table uri="catalog:default:reservationmale#gender='Male'" />


    <ACL owner="cloud" group="cloud" permission="0x755"/>
    <schema location="hcat" provider="hcat"/>
</feed>

Process Definition:
<?xml version="1.0" encoding="UTF-8"?>
<process name="hiveprocess" xmlns="uri:falcon:process:0.1">
    <clusters>
        <cluster name="demoClsHive">
            <validity end="2014-05-22T00:00Z" start="2014-04-15T00:00Z"/>
        </cluster>
    </clusters>

    <parallel>1</parallel>
    <order>FIFO</order>
    <frequency>days(1)</frequency>
    <timezone>UTC</timezone>

    <inputs>
        <input end="today(0,0)" start="today(0,0)" feed="reservation" name="input"/>
    </inputs>

    <outputs>
        <output instance="now(0,0)" feed="reservationmale" name="output"/>
    </outputs>

    <properties>
        <property name="oozie.hive.defaults" value="/falcon/sample/hive-site.xml"/>
        <property name="oozie.libpath" value="/falcon/oozie/share/lib/hive/"/>
        <property name="queueName" value="default"/>
    </properties>

    <workflow engine="hive" path="/falcon/hive_run/script.hql"/>

    <retry policy="periodic" delay="minutes(10)" attempts="3"/>

</process>

Please let me know if anything looks suspicious there.

Thanks and regards,
Sunil Kumar



-----Original Message-----
From: Shwetha GS [mailto:shwetha.gs@inmobi.com]
Sent: Wednesday, April 16, 2014 2:56 PM
To: dev@falcon.incubator.apache.org
Subject: Re: Hive Tables as Falcon Feed Input Error

The warning about config in
/falcon/staging/falcon/workflows/process/hiveprocess/c3257ea288706fe84d843f6067799d8d/DEFAULT/
is not an issue. It's just a warning from Oozie.

Check whether the partition
(hcat://Impetus-942.impetus.co.in:9083/default/reservation/gender='Female')
exists in HCat, or whether there is an issue with Oozie connecting to HCat (look for an
exception stack trace in the Oozie logs).

There are some examples in https://issues.apache.org/jira/browse/FALCON-392.
You can apply the patch and use them
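
To see which partitions Oozie is still waiting for, a small Python sketch like the one below can pull them out of oozie.log. It is only an illustration: the regular expression is based on the "First Push missing dependency is [...]" lines quoted further down in this thread, and the function names are hypothetical.

```python
import re

# Pattern taken from the log lines quoted in this thread; other Oozie
# versions may word the message differently.
MISSING_DEP = re.compile(r"First Push missing\s+dependency is\s+\[([^\]]+)\]")


def missing_dependencies(log_text):
    """Return the unique missing-dependency URIs, in first-seen order."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    seen = []
    for match in MISSING_DEP.finditer(flat):
        # Assumption: the URI itself has no spaces, so any space inside the
        # brackets comes from line wrapping and can be dropped.
        uri = match.group(1).replace(" ", "")
        if uri not in seen:
            seen.append(uri)
    return seen


def split_hcat_uri(uri):
    """Split hcat://server/db/table/spec into (server, db, table, spec)."""
    scheme = "hcat://"
    if not uri.startswith(scheme):
        raise ValueError("not an hcat URI: " + uri)
    server, database, table, spec = uri[len(scheme):].split("/", 3)
    return server, database, table, spec


if __name__ == "__main__":
    sample = """2014-04-15 23:35:48,554  INFO CoordPushDependencyCheckXCommand:539 -
USER[-] GROUP[-] TOKEN[-] APP[-]
JOB[0000001-140415003049552-oozie-clou-C]
ACTION[0000001-140415003049552-oozie-clou-C@1] First Push missing
dependency is [hcat://
Impetus-942.impetus.co.in:9083/default/reservation/gender='Female']"""
    for uri in missing_dependencies(sample):
        print(split_hcat_uri(uri))
```

The database, table and partition spec it prints can then be compared with what Hive reports for the table, for example with SHOW PARTITIONS reservation; in the Hive CLI.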


On Tue, Apr 15, 2014 at 7:14 PM, Sunil N Kumar
<suniln.kumar@impetus.co.in> wrote:

> Hi Srikanth,
>
> There is no coord-config-default.xml in the HDFS path
> /falcon/staging/falcon/workflows/process/hiveprocess/c3257ea288706fe84d843f6067799d8d/DEFAULT/.
> There is a
> /falcon/staging/falcon/workflows/process/hiveprocess/coordinator.xml
> file in the path. How does the coord-config-default.xml file get
> created, and why does the workflow refer to this file?
>
> Bundle Job LOG Error:
>
> 2014-04-16 00:04:25,018  INFO CoordSubmitXCommand:539 - USER[-]
> GROUP[-] TOKEN[-] APP[-] JOB[0000000-140416000225157-oozie-clou-B]
> ACTION[-] configDefault Doesn't exist
> hdfs://Impetus-942.impetus.co.in:9000/falcon/staging/falcon/workflows/process/hiveprocess/c3257ea288706fe84d843f6067799d8d/DEFAULT/coord-config-default.xml
>
>
>
> -----Original Message-----
> From: Sunil N Kumar
> Sent: Tuesday, April 15, 2014 6:23 PM
> To: dev@falcon.incubator.apache.org
> Subject: RE: Hive Tables as Falcon Feed Input Error
>
> There is no error. The following warning messages appear in
> oozie.log:
>
> 2014-04-15 23:35:48,554  INFO CoordPushDependencyCheckXCommand:539 -
> USER[-] GROUP[-] TOKEN[-] APP[-]
> JOB[0000001-140415003049552-oozie-clou-C]
> ACTION[0000001-140415003049552-oozie-clou-C@1] First Push missing
> dependency is
> [hcat://Impetus-942.impetus.co.in:9083/default/reservation/gender='Female']
> 2014-04-15 23:35:48,573  INFO HCatURIHandler:539 - USER[-] GROUP[-]
> TOKEN[-] APP[-] JOB[0000001-140415003049552-oozie-clou-C]
> ACTION[0000001-140415003049552-oozie-clou-C@1] Creating HCatClient for
> user [cloud] login_user [cloud] and server
> [thrift://Impetus-942.impetus.co.in:9083]
> 2014-04-15 23:35:48,597  WARN HiveConf:1003 - DEPRECATED:
> Configuration property hive.metastore.local no longer has any effect.
> Make sure to provide a valid value for hive.metastore.uris if you are
> connecting to a remote metastore.
> 2014-04-15 23:35:51,410  INFO CoordPushDependencyCheckXCommand:539 -
> USER[-] GROUP[-] TOKEN[-] APP[-]
> JOB[0000003-140415003049552-oozie-clou-C]
> ACTION[0000003-140415003049552-oozie-clou-C@12] First Push missing
> dependency is
> [hcat://Impetus-942.impetus.co.in:9083/default/reservation/gender='Female']
>
>
>
> -----Original Message-----
> From: Srikanth Sundarrajan [mailto:sriksun@hotmail.com]
> Sent: Tuesday, April 15, 2014 6:05 PM
> To: dev@falcon.incubator.apache.org
> Subject: RE: Hive Tables as Falcon Feed Input Error
>
> I would suspect that Oozie has issues connecting to the hive metastore
> while determining availability of instance. Some logs might help nail
> down the issue.
> Regards,
> Srikanth Sundarrajan
>
> > From: suniln.kumar@impetus.co.in
> > To: dev@falcon.incubator.apache.org
> > Subject: RE: Hive Tables as Falcon Feed Input Error
> > Date: Tue, 15 Apr 2014 12:29:29 +0000
> >
> > Hi Srikanth,
> > It is still in the WAITING state in Oozie.
> >
> >
> > Thanks and regards,
> > Sunil Kumar
> >
> >
> > -----Original Message-----
> > From: Srikanth Sundarrajan [mailto:sriksun@hotmail.com]
> > Sent: Tuesday, April 15, 2014 5:53 PM
> > To: dev@falcon.incubator.apache.org
> > Subject: RE: Hive Tables as Falcon Feed Input Error
> >
> > Hi Sunil,
> > Can you confirm whether the process has kicked off in Hadoop, or is it
> > still in the PREP / WAITING state in Oozie?
> > Regards,
> > Srikanth Sundarrajan
> >
> > > From: suniln.kumar@impetus.co.in
> > > To: dev@falcon.incubator.apache.org
> > > Subject: RE: Hive Tables as Falcon Feed Input Error
> > > Date: Tue, 15 Apr 2014 06:37:01 +0000
> > >
> > > Hi,
> > > I was able to resolve the feed definition issue for the Hive tables.
> > > The Hive tables must be partitioned, and Oozie must be built with the
> > > Hive and HCatalog jars. Please refer to the blog post below for
> > > building Oozie with the Hive dependencies:
> > > http://blog.nanthrax.net/2014/03/hadoop-cdc-and-processes-notification-with-apache-falcon-apache-activemq-and-apache-camel/
> > >
> > >
> > > But now I am facing an issue with the execution of the Falcon process
> > > for the Hive tables. It has remained in the running state for the last
> > > 18 hours. I am not sure what the issue is. Can somebody help me
> > > resolve this?
> > >
> > > The following is the definition of the Hive process I have created:
> > > <?xml version="1.0" encoding="UTF-8"?>
> > > <process name="hiveprocess2" xmlns="uri:falcon:process:0.1">
> > >     <clusters>
> > >         <cluster name="demoClsHive1">
> > >             <validity end="2014-04-22T00:00Z" start="2014-04-01T00:00Z"/>
> > >         </cluster>
> > >     </clusters>
> > >
> > >     <parallel>1</parallel>
> > >     <order>FIFO</order>
> > >     <frequency>days(1)</frequency>
> > >     <timezone>UTC</timezone>
> > >
> > >     <inputs>
> > >         <input end="today(0,0)" start="today(0,0)" feed="reservation2" name="input"/>
> > >     </inputs>
> > >
> > >     <outputs>
> > >         <output instance="now(0,0)" feed="reservationmale2" name="output"/>
> > >     </outputs>
> > >
> > >     <properties>
> > >         <property name="queueName" value="default"/>
> > >     </properties>
> > >
> > >     <workflow engine="hive" path="/falcon/hive_run/script.hql"/>
> > >
> > >     <retry policy="periodic" delay="minutes(10)" attempts="3"/>
> > >
> > >     <late-process policy="exp-backoff" delay="hours(2)">
> > >         <late-input input="input" workflow-path="/falcon/hive_run/"/>
> > >     </late-process>
> > > </process>
> > >
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: Sunil N Kumar
> > > Sent: Monday, April 14, 2014 12:58 PM
> > > To: dev@falcon.incubator.apache.org
> > > Subject: Hive Tables as Falcon Feed Input Error
> > >
> > > Hi,
> > >
> > > I am trying to build a sample that executes a Hive script as part of a
> > > Falcon Oozie process. I have created two tables in Hive and want to
> > > join data from these tables and put the result into a third Hive table
> > > using a Falcon process. But I am not able to submit the input feed to
> > > the Falcon server. I am using the Falcon 0.4 release.
> > >
> > >
> > >
> > > Input Feed XML:
> > >
> > > <?xml version="1.0" encoding="UTF-8"?>
> > > <feed description="clicks log table " name="input-table" xmlns="uri:falcon:feed:0.1">
> > >     <frequency>hours(1)</frequency>
> > >     <timezone>UTC</timezone>
> > >     <late-arrival cut-off="hours(3)"/>
> > >
> > >     <clusters>
> > >         <cluster name="demoClsHive">
> > >             <validity start="2010-01-01T00:00Z" end="2012-04-21T00:00Z"/>
> > >             <retention limit="hours(24)" action="delete"/>
> > >         </cluster>
> > >     </clusters>
> > >
> > >     <table uri="catalog:default:reservationnew:gender='Male'" />
> > >
> > >     <ACL owner="cloud" group="cloud" permission="0x755"/>
> > >     <schema location="hcat" provider="hcat"/>
> > > </feed>
> > >
> > >
> > >
> > > Error:
> > >
> > > [cloud@Impetus-942 falcon-distributed-0.4-incubating]$ bin/falcon entity -submit -type feed -file examples/entity/in-feed.xml
> > >
> > > Error: java.net.URISyntaxException: URI path is not in expected format: database:table: catalog:default:reservationnew:gender='Male'
> > >
> > > ________________________________
> > >
> > >
> > >
> > >
> > >
> > >
> > > NOTE: This message may contain information that is confidential,
> proprietary, privileged or otherwise protected by law. The message is
> intended solely for the named addressee. If received in error, please
> destroy and notify the sender. Any use of this email is prohibited
> when received in error. Impetus does not represent, warrant and/or
> guarantee, that the integrity of this communication has been
> maintained nor that the communication is free of errors, virus, interception or interference.
>

--
_____________________________________________________________
The information contained in this communication is intended solely for the use of the individual
or entity to whom it is addressed and others authorized to receive it. It may contain confidential
or legally privileged information. If you are not the intended recipient you are hereby notified
that any disclosure, copying, distribution or taking any action in reliance on the contents
of this information is strictly prohibited and may be unlawful. If you have received this
communication in error, please notify us immediately by responding to this email and then
delete it from your system. The firm is neither liable for the proper and complete transmission
of the information contained in this communication nor for any delay in its receipt.
