falcon-dev mailing list archives

From "Ajay Yadava (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-36) Ability to ingest data from databases
Date Wed, 22 Jul 2015 17:45:05 GMT

    [ https://issues.apache.org/jira/browse/FALCON-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637270#comment-14637270 ]

Ajay Yadava commented on FALCON-36:
-----------------------------------

[~me.venkatr]

Can you please update the review board with the patch? It adds context to the review comments.
This is great stuff and well thought out. Some nits and comments:

* Field include and field exclude are of the same type, so the type can be reused.
* Can there be an option to select all columns without typing each one of them? Make includes
optional; its absence should be treated as *.
* Consider a metadata feed which needs to be available in all clusters. Will the user need
to write it in all clusters?
* Should we rename database.xsd to datasource.xsd? (The target namespace is datasource:0.1.)
* description can be made a tag instead of an attribute; this will allow users to put detailed
comments.
* The documentation in the tags column is incorrect.
* database.xml doesn't provide an example of a driver.
* Can you please add more details on how the drivers value will be used by Falcon?
* mysql_database.xml is not valid XML as per the XSD. You have mixed database and datasource
in tag names. Please use datasource consistently.
* Type is required but its values are not enforced. Why do we need it? Can we leave it out
until we add a new type of datasource?
* What is the purpose of version and how will it be used?
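
To make the suggestion about optional includes concrete, here is a minimal sketch of the
behaviour I have in mind. It is only an illustration: the element names (fields, includes,
excludes, field) and the helper selected_columns are hypothetical and are not taken from the
patch or from Falcon itself.

```python
import xml.etree.ElementTree as ET


def selected_columns(fields_xml, table_columns):
    """Resolve the columns to ingest from a hypothetical <fields> element.

    Assumed layout (not from the patch):
      <fields>
        <includes><field>id</field></includes>
        <excludes><field>email</field></excludes>
      </fields>
    """
    root = ET.fromstring(fields_xml)
    includes = [(f.text or "").strip() for f in root.findall("./includes/field")]
    excludes = {(f.text or "").strip() for f in root.findall("./excludes/field")}
    # A missing (or empty) <includes> block is treated as "*": every column.
    candidates = includes if includes else list(table_columns)
    return [c for c in candidates if c not in excludes]


# Hypothetical table columns, used only for this illustration.
COLUMNS = ["id", "name", "email", "created_at"]

all_cols = selected_columns("<fields/>", COLUMNS)
without_email = selected_columns(
    "<fields><excludes><field>email</field></excludes></fields>", COLUMNS
)
only_two = selected_columns(
    "<fields><includes><field>id</field><field>name</field></includes></fields>",
    COLUMNS,
)
```

With this reading, a user who wants every column writes nothing, and a user who wants all but
a few columns only lists the excludes.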





> Ability to ingest data from databases
> -------------------------------------
>
>                 Key: FALCON-36
>                 URL: https://issues.apache.org/jira/browse/FALCON-36
>             Project: Falcon
>          Issue Type: Improvement
>          Components: acquisition
>    Affects Versions: 0.3
>            Reporter: Venkatesh Seetharam
>            Assignee: Venkat Ramachandran
>         Attachments: FALCON-36.patch, FALCON-36.rebase.patch, FALCON-36.review.patch,
Falcon Data Ingestion - Proposal.docx, falcon-36.xsd.patch.1
>
>
> Attempt to address data import from RDBMS into Hadoop and export of data from Hadoop
into RDBMS. The plan is to use Sqoop 1.x to materialize data motion from/to RDBMS to/from
HDFS. Hive will not be integrated in the first pass until Falcon has a first-class integration
with HCatalog.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
