falcon-dev mailing list archives

From "Venkatesan Ramachandran" <me.venk...@gmail.com>
Subject Re: Review Request 38465: FALCON-1459 : Ability to import from database
Date Sat, 19 Sep 2015 00:24:58 GMT


> On Sept. 18, 2015, 2:55 p.m., Peeyush Bishnoi wrote:
> > client/src/main/java/org/apache/falcon/entity/v0/EntityType.java, line 100
> > <https://reviews.apache.org/r/38465/diff/1/?file=1076033#file1076033line100>
> >
> >     Can you please clarify the statement. Feed and process entity can be scheduled
> >     but cluster can't be scheduled.

Originally, isSchedulable() returned false for CLUSTER and true for PROCESS, FEED, etc.
Along the same lines, isSchedulable() now also returns false for DATASOURCE, because a
DATASOURCE, just like a CLUSTER, is not schedulable.
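
To make the rule concrete, here is a minimal sketch of the behavior described above. The enum
name EntityTypeSketch is hypothetical and is only a simplified stand-in for EntityType; the
actual enum in the patch carries more state than shown here.

```java
// Simplified, hypothetical stand-in for org.apache.falcon.entity.v0.EntityType.
// Only the scheduling rule from this thread is modeled.
enum EntityTypeSketch {
    FEED, PROCESS, CLUSTER, DATASOURCE;

    // CLUSTER and DATASOURCE are definitions that other entities refer to;
    // only FEED and PROCESS are run on a schedule.
    boolean isSchedulable() {
        return this != CLUSTER && this != DATASOURCE;
    }

    public static void main(String[] args) {
        for (EntityTypeSketch type : values()) {
            System.out.println(type + " schedulable: " + type.isSchedulable());
        }
    }
}
```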


> On Sept. 18, 2015, 2:55 p.m., Peeyush Bishnoi wrote:
> > common/src/main/java/org/apache/falcon/util/HdfsClassLoader.java, line 42
> > <https://reviews.apache.org/r/38465/diff/1/?file=1076049#file1076049line42>
> >
> >     In this class, you are trying to fetch the Sqoop jar files and then loading
> >     into JVM to get set to classpath. Can you try any one of the following options:
> >     1. Package required sqoop jar files with Falcon and then use it locally. 
> >     2. Try to use distributed cache for jar files.
> >     
> >     For the above you can explore the class SharedLibraryHostingService in Falcon.

The Oozie Sqoop action requires the JDBC jars to be placed in the Oozie Sqoop sharelib
directory on HDFS.

The same jar is needed in the Falcon JVM to validate the connection endpoints of a DATASOURCE
entity submitted via the REST API.
So this class fetches the jar from HDFS and loads it into the Falcon JVM during entity submission.

* Packaging JDBC jars with Falcon would require a new build whenever a user adds a new
datasource.
* Distributed cache is useful when running MapReduce jobs, but in this case the jar is used
inside the Falcon process itself.
* I believe SharedLibraryHostingService is used to copy local filesystem files into HDFS.
This class goes the other way around.
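
A rough sketch of the loading step, not the HdfsClassLoader code from the patch: it assumes the
jar has already been copied from HDFS to a local path (that copy step is left out so the sketch
needs only the JDK), and the class and method names are hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.sql.Driver;
import java.util.List;

// Hypothetical helper: loads JDBC driver jars at runtime instead of
// requiring them on the server's classpath at build time.
final class JarLoadingSketch {

    // Builds a class loader over jars that were already fetched to local disk.
    static URLClassLoader loaderFor(List<Path> localJars, ClassLoader parent) {
        URL[] urls = new URL[localJars.size()];
        for (int i = 0; i < urls.length; i++) {
            try {
                urls[i] = localJars.get(i).toUri().toURL();
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException("Bad jar path: " + localJars.get(i), e);
            }
        }
        return new URLClassLoader(urls, parent);
    }

    // Instantiates the explicitly named driver class from the given loader.
    static Driver instantiateDriver(ClassLoader loader, String driverClassName)
            throws ReflectiveOperationException {
        Class<?> clazz = Class.forName(driverClassName, true, loader);
        return (Driver) clazz.getDeclaredConstructor().newInstance();
    }
}
```

With a driver instance in hand, the caller could validate the endpoint by calling
Driver.connect() on it directly, since DriverManager generally does not hand out drivers that
were loaded by a class loader other than the caller's.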

Does that make sense? Please reopen if this is still an issue.


> On Sept. 18, 2015, 2:55 p.m., Peeyush Bishnoi wrote:
> > common/src/test/resources/config/datasource/datasource-0.1.xml, line 44
> > <https://reviews.apache.org/r/38465/diff/1/?file=1076057#file1076057line44>
> >
> >     Should not the driver classname and required jar file get handled internally
> >     instead of specifying explicitly. To get the required jar file, you can think of using
> >     distributed cache or package required jar file with Falcon to be available locally.

When connecting to multiple relational datasources, Sqoop can get confused about which jar
to load. Suppose multiple JDBC jars, such as MySQL and Oracle, are present and we are
connecting to Oracle: Sqoop may well load the MySQL driver first and try to connect to the
Oracle database with it, which will fail. So we need to specify the driver jar and the driver
class name explicitly.
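
As an illustration of pinning the driver, the sketch below builds a Sqoop import argument list
that always passes the driver class through Sqoop's --driver option. The class name, method
name, and sample values are hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical builder: every import names its JDBC driver explicitly, so
// the choice does not depend on which driver jar happens to be found first.
final class SqoopImportArgsSketch {

    static List<String> buildArgs(String jdbcUrl, String driverClass,
                                  String table, String targetDir) {
        if (driverClass == null || driverClass.trim().isEmpty()) {
            throw new IllegalArgumentException("driver class must be specified explicitly");
        }
        List<String> args = new ArrayList<>();
        args.add("import");
        args.add("--connect");
        args.add(jdbcUrl);
        args.add("--driver");      // standard Sqoop option naming the JDBC driver class
        args.add(driverClass);
        args.add("--table");
        args.add(table);
        args.add("--target-dir");
        args.add(targetDir);
        return args;
    }

    public static void main(String[] unused) {
        // Sample values for illustration only.
        System.out.println(buildArgs("jdbc:oracle:thin:@//db.example.com:1521/ORCL",
                "oracle.jdbc.OracleDriver", "EMPLOYEES", "/data/import/employees"));
    }
}
```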


> On Sept. 18, 2015, 2:55 p.m., Peeyush Bishnoi wrote:
> > common/src/test/resources/config/datasource/datasource-invalid-0.1.xml, line 43
> > <https://reviews.apache.org/r/38465/diff/1/?file=1076058#file1076058line43>
> >
> >     Similar comment as above.

See the above comment.


> On Sept. 18, 2015, 2:55 p.m., Peeyush Bishnoi wrote:
> > oozie/src/main/java/org/apache/falcon/oozie/DatabaseImportWorkflowBuilder.java, line 86
> > <https://reviews.apache.org/r/38465/diff/1/?file=1076062#file1076062line86>
> >
> >     Sqoop command can be built using command element or arg elements. Is there any
> >     downside of using arg elements instead of command element.

Using arg elements is more focused and more restrictive than the free-form command element,
which makes validation easier for this version. Also, we have deferred the 'filter' feature to
a later release since it has an impact on other data sources, so we'll revisit the free form
when tackling filter.
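
To show why discrete args are easier to validate, here is a hedged sketch that checks each
option against an allowlist and then renders one arg element per argument. The allowlist and
the class name are assumptions made for illustration; they are not the checks in
DatabaseImportWorkflowBuilder.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical renderer for the <arg> form of an Oozie Sqoop action.
final class SqoopArgElementsSketch {

    // Illustrative allowlist only; the real set of supported options may differ.
    private static final Set<String> ALLOWED_OPTIONS = new HashSet<>(Arrays.asList(
            "--connect", "--driver", "--table", "--target-dir",
            "--username", "--password-file"));

    // Each argument is a separate token, so options can be checked one by one.
    static void validate(List<String> args) {
        for (String arg : args) {
            if (arg.startsWith("--") && !ALLOWED_OPTIONS.contains(arg)) {
                throw new IllegalArgumentException("Unsupported Sqoop option: " + arg);
            }
        }
    }

    // Renders one <arg> element per argument; values containing spaces stay intact.
    static String toArgElements(List<String> args) {
        validate(args);
        StringBuilder xml = new StringBuilder();
        for (String arg : args) {
            xml.append("<arg>").append(escape(arg)).append("</arg>\n");
        }
        return xml.toString();
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

With the command element, Oozie splits the whole string on spaces, so a value that itself
contains a space cannot be passed as a single argument; with arg elements each value is passed
through as given.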


> On Sept. 18, 2015, 2:55 p.m., Peeyush Bishnoi wrote:
> > webapp/src/test/resources/datasource-template.xml, line 44
> > <https://reviews.apache.org/r/38465/diff/1/?file=1076076#file1076076line44>
> >
> >     See the above comments about handling the driver class name and required jar
> >     file in datasource xml file

Same response as above.


- Venkatesan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38465/#review99524
-----------------------------------------------------------


On Sept. 17, 2015, 7:40 p.m., Venkatesan Ramachandran wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38465/
> -----------------------------------------------------------
> 
> (Updated Sept. 17, 2015, 7:40 p.m.)
> 
> 
> Review request for Falcon, Ajay Yadava, Balu Vellanki, Peeyush Bishnoi, Sowmya Ramesh,
> and Venkat Ranganathan.
> 
> 
> Repository: falcon-git
> 
> 
> Description
> -------
> 
> FALCON-1459 : Ability to import from database
> 
> 
> Diffs
> -----
> 
>   client/src/main/java/org/apache/falcon/LifeCycle.java 58a2a6c 
>   client/src/main/java/org/apache/falcon/Tag.java beeb812 
>   client/src/main/java/org/apache/falcon/entity/v0/EntityType.java 0657124 
>   client/src/main/java/org/apache/falcon/metadata/RelationshipType.java f034772 
>   client/src/main/resources/datasource-0.1.xsd PRE-CREATION 
>   client/src/main/resources/feed-0.1.xsd 4ff8baa 
>   client/src/main/resources/jaxb-binding.xjb f644f40 
>   client/src/main/resources/mysql_database.xml PRE-CREATION 
>   common/src/main/java/org/apache/falcon/entity/DatasourceHelper.java PRE-CREATION 
>   common/src/main/java/org/apache/falcon/entity/EntityUtil.java 2f05b1f 
>   common/src/main/java/org/apache/falcon/entity/FeedHelper.java 572923b 
>   common/src/main/java/org/apache/falcon/entity/parser/DatasourceEntityParser.java PRE-CREATION
>   common/src/main/java/org/apache/falcon/entity/parser/EntityParserFactory.java 5a33201
>   common/src/main/java/org/apache/falcon/entity/parser/FeedEntityParser.java 992fc51
>   common/src/main/java/org/apache/falcon/entity/store/ConfigurationStore.java e27187b
>   common/src/main/java/org/apache/falcon/entity/v0/EntityGraph.java bd4c6cf 
>   common/src/main/java/org/apache/falcon/entity/v0/EntityIntegrityChecker.java bd32852
>   common/src/main/java/org/apache/falcon/metadata/EntityRelationshipGraphBuilder.java 8c3876c
>   common/src/main/java/org/apache/falcon/util/HdfsClassLoader.java PRE-CREATION
>   common/src/main/java/org/apache/falcon/workflow/WorkflowExecutionContext.java 4454239
>   common/src/test/java/org/apache/falcon/entity/AbstractTestBase.java 6179855 
>   common/src/test/java/org/apache/falcon/entity/EntityTypeTest.java 640e87d 
>   common/src/test/java/org/apache/falcon/entity/FeedHelperTest.java c70cfcc 
>   common/src/test/java/org/apache/falcon/entity/parser/DatasourceEntityParserTest.java PRE-CREATION
>   common/src/test/java/org/apache/falcon/entity/parser/FeedEntityParserTest.java d203b7c
>   common/src/test/java/org/apache/falcon/entity/v0/EntityGraphTest.java 3863b11 
>   common/src/test/resources/config/datasource/datasource-0.1.xml PRE-CREATION 
>   common/src/test/resources/config/datasource/datasource-invalid-0.1.xml PRE-CREATION
>   common/src/test/resources/config/feed/feed-import-0.1.xml PRE-CREATION 
>   common/src/test/resources/config/feed/feed-import-invalid-0.1.xml PRE-CREATION 
>   falcon-regression/merlin-core/src/main/java/org/apache/falcon/regression/core/util/HiveAssert.java 2a934b5
>   oozie/src/main/java/org/apache/falcon/oozie/DatabaseImportWorkflowBuilder.java PRE-CREATION
>   oozie/src/main/java/org/apache/falcon/oozie/FeedImportCoordinatorBuilder.java PRE-CREATION
>   oozie/src/main/java/org/apache/falcon/oozie/ImportWorkflowBuilder.java PRE-CREATION
>   oozie/src/main/java/org/apache/falcon/oozie/OozieCoordinatorBuilder.java 85f5330 
>   oozie/src/main/java/org/apache/falcon/oozie/OozieOrchestrationWorkflowBuilder.java 3213a70
>   oozie/src/main/java/org/apache/falcon/oozie/feed/FeedBundleBuilder.java b819dee 
>   oozie/src/main/resources/action/feed/import-sqoop-database-action.xml PRE-CREATION
>   pom.xml 646de69 
>   retention/src/test/java/org/apache/falcon/retention/FeedEvictorTest.java 72447da 
>   webapp/pom.xml 828f7f5 
>   webapp/src/test/java/org/apache/falcon/lifecycle/FeedImportIT.java PRE-CREATION 
>   webapp/src/test/java/org/apache/falcon/resource/EntityManagerJerseyIT.java 220e5a7
>   webapp/src/test/java/org/apache/falcon/resource/TestContext.java f031137 
>   webapp/src/test/java/org/apache/falcon/util/HsqldbTestUtils.java PRE-CREATION 
>   webapp/src/test/resources/datasource-template.xml PRE-CREATION 
>   webapp/src/test/resources/feed-template3.xml PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/38465/diff/
> 
> 
> Testing
> -------
> 
> * Unit tests
> * Integration tests
> * Manual tests
>   * Set up MySQL, created a table and populated it
>   * Created datasource and feed entities with an import policy in Falcon
>   * Verified that the data landed in HDFS
> 
> 
> Thanks,
> 
> Venkatesan Ramachandran
> 
>

