falcon-dev mailing list archives

From "Venkatesh Seetharam (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-94) Retention to handle hive table eviction
Date Thu, 10 Oct 2013 20:11:43 GMT

    [ https://issues.apache.org/jira/browse/FALCON-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791939#comment-13791939 ]

Venkatesh Seetharam commented on FALCON-94:
-------------------------------------------

Thanks [~sriksun] for taking the time to review. My comments are below.

bq. Generic CatalogService (AbstractCatalogService) seems to return HCatPartition, which would
mean that the CatalogService is tied to HCatalog.
You are correct; I was too lazy to roll my own. Fixed by introducing a CatalogPartition object.
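
To illustrate the decoupling, here is a minimal sketch of what a catalog-neutral partition object could look like. Only the name CatalogPartition comes from this thread; the fields and accessors are assumptions for illustration and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: a partition descriptor whose API exposes no HCatalog types,
// so callers of the catalog service do not depend on HCatalog classes.
final class CatalogPartition {
    private final String databaseName;  // assumed field
    private final String tableName;     // assumed field
    private final String location;      // assumed field: storage location of the partition
    private final List<String> values;  // assumed field: partition key values

    CatalogPartition(String databaseName, String tableName,
                     String location, List<String> values) {
        this.databaseName = databaseName;
        this.tableName = tableName;
        this.location = location;
        // Defensive copy so later changes to the caller's list do not leak in.
        this.values = Collections.unmodifiableList(new ArrayList<>(values));
    }

    String getDatabaseName() { return databaseName; }
    String getTableName() { return tableName; }
    String getLocation() { return location; }
    List<String> getValues() { return values; }
}
```

An HCatalog-backed service would then copy the relevant fields out of each HCatPartition into this object before returning it.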

bq. We are requiring some hcatalog related jars to be copied to oozie shared lib directory.

Good question. Yes, the HCatalog and Pig-hcat adaptors need to be made available.

bq. I am assuming this will be covered in documentation. 
Yes, it's mentioned in the docs, but it is covered in detail in the Oozie documentation. :-)

bq. Also this would mean that Off the shelf Oozie patched with falcon config wont work any
more. 
The sharelib tar file is created as part of the Oozie bundle and is available for users to
upload to HDFS.

bq. We further require the shared lib dirs to be setup and the contents copied. 
Oozie needs to be set up in any case: for the DB, and with the Hadoop and HCatalog jars
in libext. Setting up the sharelibs will be one additional step.

bq. Are there any challenges in making these jars available in the retention lib path by default
and not requiring shared libs ?
The only challenge I faced was the tens of jar files needed for HCatalog and its dependencies;
the same goes for Pig and Hive:
HCatalog needs 47 jar files
Pig needs 24 jar files
Hive needs 57 jar files
I was not sure whether I should create an uber jar and distribute it using Falcon, so I decided
against it and used the Oozie sharelib instead.
Does that make sense?

bq. Is "feedStorageType" a new property added to oozie coordinator for retention. If so, can
this be prefixed with "falcon.".
Yes. I'll address this for all added properties in a separate jira: FALCON-144

bq. Can ${ & ?{ be defined as constants?
Done.
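
For readers following along, a minimal sketch of pulling the two markers into named constants. The class name, constant names, and helper method are hypothetical, and the sketch assumes the markers are swapped inside path templates, which this thread does not confirm.

```java
// Hypothetical holder for the two expression markers discussed above.
final class ExpressionMarkers {
    static final String DOLLAR_EXPR_START = "${";
    static final String QUESTION_EXPR_START = "?{";

    private ExpressionMarkers() {
        // constants holder, not meant to be instantiated
    }

    // Assumed usage: turn "?{VAR}" placeholders back into "${VAR}".
    // String.replace does literal (non-regex) replacement, so no escaping is needed.
    static String toDollarForm(String pathTemplate) {
        return pathTemplate.replace(QUESTION_EXPR_START, DOLLAR_EXPR_START);
    }
}
```

Using the literal `String.replace` rather than `replaceAll` avoids having to escape `$`, `?`, and `{` as regular-expression characters.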

bq. behaviors for listing partitions and drop partitions are listed there, why do we need
to hardcode the eviction behavior and instance deletion discovery for filesystem need to happen
in FeedEvictor ? 
Table eviction is NOT implemented in CatalogStorage but in FeedEvictor. 

bq. Why can't eviction be implemented in appropriate Storage implementation. That way FeedEvictor
would simpler and lot cleaner. Thoughts ?
Very good thought. Behavior on the storage makes sense. This will apply to replication as
well as to import and export, each of which can be a behavior on the storage and will be
portable across workflow engine implementations. Opened FALCON-145 to track this.
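
As a rough illustration of the idea tracked in FALCON-145, eviction as a storage behavior could be sketched as below. All interface, class, and method names are hypothetical and are not Falcon's actual API; the in-memory storage is a toy stand-in for a filesystem or catalog storage.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface: each storage type decides how to find and drop old instances.
interface EvictableStorage {
    /** Removes instances created before the cutoff and returns their names. */
    List<String> evict(long cutoffMillis);
}

// Toy implementation that tracks instance name -> creation time in memory.
final class InMemoryStorage implements EvictableStorage {
    private final Map<String, Long> instances = new LinkedHashMap<>();

    void add(String name, long createdMillis) {
        instances.put(name, createdMillis);
    }

    int size() {
        return instances.size();
    }

    @Override
    public List<String> evict(long cutoffMillis) {
        List<String> evicted = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = instances.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() < cutoffMillis) {
                evicted.add(entry.getKey());
                it.remove(); // safe removal while iterating
            }
        }
        return evicted;
    }
}

// The evictor depends only on the interface, so it needs no per-storage branching.
final class SimpleEvictor {
    static List<String> run(EvictableStorage storage, long nowMillis, long retentionMillis) {
        return storage.evict(nowMillis - retentionMillis);
    }
}
```

With this shape, a table-backed storage would drop partitions and a filesystem-backed storage would delete paths, while the evictor itself stays the same.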

> Retention to handle hive table eviction
> ---------------------------------------
>
>                 Key: FALCON-94
>                 URL: https://issues.apache.org/jira/browse/FALCON-94
>             Project: Falcon
>          Issue Type: Sub-task
>    Affects Versions: 0.3
>            Reporter: Venkatesh Seetharam
>            Assignee: Venkatesh Seetharam
>         Attachments: FALCON-94.patch, FALCON-94-r1.patch, FALCON-94-r2.patch
>
>
> Must handle both hive managed and external tables.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
