falcon-dev mailing list archives

From Venkatesan Ramachandran <vramachand...@hortonworks.com>
Subject Re: MySQL datasource not importing
Date Sat, 12 Mar 2016 01:40:56 GMT
Hi Andrew,

Please try changing the HDFS path template in your feed to include the nominal time (i.e. the data date), as below.
Go down to the minute, since your feed frequency has minute granularity:
/users/falcon/movielens/genres/${YEAR}-${MONTH}-${DAY}-${HOUR}-${MINUTE}
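
As a rough sketch, assuming the rest of the feed definition stays as posted, the locations element would then read something like this (only the path attribute changes):

```xml
<locations>
    <!-- Falcon fills in ${YEAR}, ${MONTH}, etc. from each instance's nominal time -->
    <location type="data"
              path="/users/falcon/movielens/genres/${YEAR}-${MONTH}-${DAY}-${HOUR}-${MINUTE}"/>
</locations>
```

With a 5-minute frequency, each feed instance should then resolve to its own directory under /users/falcon/movielens/genres.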

Also, double-check that the mysql-connector-java-5.1.31 connector is in the Oozie share lib.
It is usually at the HDFS path /user/oozie/share/lib/lib_<timestamp>/sqoop/mysql-connector-java-5.1.31.

If you still have issues, open the Oozie workflow instance, click through to the YARN application,
and check the map task log. You should see the Sqoop output there.


Thanks
Venky



On 3/11/16, 7:44 AM, "Andrew O'Brien" <obrien.andrew@gmail.com> wrote:

>Hi everyone,
>
>I tried out Falcon 0.6 a while ago and it didn't quite suit my needs. But
>when I saw the new datasource functionality, I decided to give it another
>go. Things looked fairly promising, but I'm not able to actually get it to
>kick off the import from MySQL.
>
>For reference, I'm running HDP Sandbox 2.3 with Falcon 0.9 built from the
>last release. I have the movielens dataset loaded into the MySQL database
>running inside the sandbox.
>
>I started with my cluster definition:
>
><?xml version="1.0" encoding="UTF-8" standalone="yes"?>
><cluster name="Sandbox" description="Sandbox running on my local machine"
>colo="local" xmlns="uri:falcon:cluster:0.1">
>    <interfaces>
>        <interface type="readonly" endpoint="hftp://sandbox.hortonworks.com:50070" version="2.2.0"/>
>        <interface type="write" endpoint="hdfs://sandbox.hortonworks.com:8020" version="2.2.0"/>
>        <interface type="execute" endpoint="sandbox.hortonworks.com:8050" version="2.2.0"/>
>        <interface type="workflow" endpoint="http://sandbox.hortonworks.com:11000/oozie/" version="4.0.0"/>
>        <interface type="messaging" endpoint="tcp://sandbox.hortonworks.com:61616?daemon=true" version="5.1.6"/>
>    </interfaces>
>    <locations>
>        <location name="staging" path="/user/falcon/staging"/>
>        <location name="temp" path="/user/falcon/temp"/>
>        <location name="working" path="/user/falcon/working"/>
>    </locations>
>    <ACL owner="falcon" group="users" permission="0x755"/>
></cluster>
>
>And got it to accept this datasource:
>
><?xml version="1.0" encoding="UTF-8" standalone="yes"?>
><datasource name="movielens-sandbox-mysql" colo="sandbox"
>description="Movielens on sandbox" type="mysql"
>xmlns="uri:falcon:datasource:0.1">
>    <interfaces>
>        <interface type="readonly"
>endpoint="jdbc:mysql://localhost:3306/movielens"/>
>        <credential type="password-text">
>            <userName>root</userName>
>            <passwordText></passwordText>
>        </credential>
>    </interfaces>
>    <driver>
>        <clazz>com.mysql.jdbc.Driver</clazz>
>        <jar>/user/oozie/share/lib/sqoop/mysql-connector-java.jar</jar>
>    </driver>
>    <ACL owner="falcon" group="users" permission="0755"/>
></datasource>
>
>(I confirmed the JDBC url by using it with `sqoop eval`.)
>
>And then declared this feed:
>
><?xml version="1.0" encoding="UTF-8" standalone="yes"?>
><feed name="movielens-genres" description="Movielens genres"
>xmlns="uri:falcon:feed:0.1">
>    <frequency>minutes(5)</frequency>
>    <timezone>UTC</timezone>
>    <clusters>
>        <cluster name="Sandbox">
>            <validity start="2012-07-20T03:00Z" end="2099-07-16T00:00Z"/>
>            <retention limit="months(3)" action="delete"/>
>            <import>
>                <source name="movielens-sandbox-mysql" tableName="genres">
>                    <extract type="full">
>                        <mergepolicy>snapshot</mergepolicy>
>                    </extract>
>                </source>
>            </import>
>        </cluster>
>    </clusters>
>    <locations>
>        <location type="data" path="/users/falcon/movielens/genres"/>
>    </locations>
>    <ACL owner="falcon" group="users" permission="0755"/>
>    <schema location="/user/falcon/schemas/genre.avsc" provider="avro"/>
></feed>
>
>I scheduled it following the instructions here:
>http://falcon.apache.org/site/0.9/ImportExport.html. Since then, I've tried
>to rerun it with `falcon entity -touch -type feed -name movielens-genres`.
>
>This should be enough to cause a file to appear in
>/user/falcon/movielens/genres, right?
>
>I see jobs in the Oozie console and I see applications in the YARN web UI.
>I've searched the log output for that path or any errors or warnings. I
>turned on the MySQL general-log and didn't see any queries hitting the
>`movielens.genres` table. Anything else I can try?
>
>Thanks,
>Andrew