falcon-dev mailing list archives

From "Venkatesh Seetharam (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-310) Allow existing processes to work out-of-box when existing HDFS feeds are configured in HCatalog
Date Mon, 11 Aug 2014 18:54:12 GMT

    [ https://issues.apache.org/jira/browse/FALCON-310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093159#comment-14093159 ]

Venkatesh Seetharam commented on FALCON-310:
--------------------------------------------

It should be quite straightforward to create external tables in Hive that point to the data
on HDFS. It should work out of the box (OOTB).
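
As a rough illustration of the external-table approach, the sketch below builds a HiveQL CREATE EXTERNAL TABLE statement for data that already lives on HDFS. The helper name external_table_ddl, the table name, the columns, and the HDFS path are all hypothetical and not taken from this issue; a real table would usually also need ROW FORMAT / STORED AS clauses matching the existing files.

```python
def external_table_ddl(table, columns, partition_keys, location):
    """Build a HiveQL CREATE EXTERNAL TABLE statement.

    Hypothetical helper: `columns` and `partition_keys` are lists of
    (name, hive_type) pairs; `location` is the existing HDFS directory.
    """
    cols = ", ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    ddl = f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} ({cols})"
    if partition_keys:
        parts = ", ".join(f"`{name}` {hive_type}" for name, hive_type in partition_keys)
        ddl += f" PARTITIONED BY ({parts})"
    # EXTERNAL + LOCATION: Hive only records metadata; the files stay where they are.
    ddl += f" LOCATION '{location}'"
    return ddl


# Example with made-up names (assumptions, not from the issue).
print(external_table_ddl(
    "clicks",
    [("user_id", "STRING"), ("url", "STRING")],
    [("ds", "STRING")],
    "hdfs://namenode:8020/data/clicks",
))
```

Because the table is external, dropping it later removes only the metadata and leaves the HDFS data in place, which is what lets existing HDFS-based processes keep reading the same paths.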

> Allow existing processes to work out-of-box when existing HDFS feeds are configured in HCatalog
> -----------------------------------------------------------------------------------------------
>
>                 Key: FALCON-310
>                 URL: https://issues.apache.org/jira/browse/FALCON-310
>             Project: Falcon
>          Issue Type: Improvement
>            Reporter: Satish Mittal
>            Assignee: Shwetha G S
>
> After HCatalog integration, one can configure new Falcon feeds based on HCatalog tables
> and then write processes that read/write HCat-based feeds. However, the expectation is that
> these processes will be implemented using HCatalog interfaces (HCatInputFormat/HCatOutputFormat
> in the case of M/R jobs, or HCatLoader/HCatStorer in the case of Pig scripts). This is easy for new
> processes.
> However, there would be existing processes running in production that are based on HDFS
> feeds and may not get rewritten using HCat interfaces. For such processes, one might
> just want to configure HCatalog tables around their HDFS feeds and provide a way to allow
> existing processes to continue to run as if they were still working with HDFS feeds.
> Behind the scenes, Falcon should be able to find new partitions to read/write, get their
> corresponding locations, populate the corresponding workflow variables, register/drop partitions,
> etc. as part of a pre/post-processing step.
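
The pre/post-processing idea in the description could look roughly like the sketch below: resolve a partition's HDFS location from a path template, then build the HiveQL to register or drop that partition. This is only a sketch under stated assumptions, not Falcon's implementation. The function names, the ${YEAR}/${MONTH}/${DAY} template, and the table and partition key names are hypothetical.

```python
from string import Template


def resolve_location(path_template, **parts):
    """Fill a ${NAME}-style path template (hypothetical layout) with date parts."""
    return Template(path_template).substitute(parts)


def _partition_spec(spec):
    # spec is a dict of partition key -> value, e.g. {"ds": "2014-08-11"}
    return ", ".join(f"{key}='{value}'" for key, value in spec.items())


def add_partition_stmt(table, spec, location):
    """HiveQL to register an existing HDFS directory as a partition (post-processing)."""
    return (f"ALTER TABLE {table} ADD IF NOT EXISTS "
            f"PARTITION ({_partition_spec(spec)}) LOCATION '{location}'")


def drop_partition_stmt(table, spec):
    """HiveQL to drop a partition's metadata, e.g. when an instance is evicted."""
    return f"ALTER TABLE {table} DROP IF EXISTS PARTITION ({_partition_spec(spec)})"


# Example with made-up names: the resolved location is what would be handed
# to the existing HDFS-based process as a workflow variable.
location = resolve_location("/data/clicks/${YEAR}-${MONTH}-${DAY}",
                            YEAR="2014", MONTH="08", DAY="11")
print(location)
print(add_partition_stmt("clicks", {"ds": "2014-08-11"}, location))
print(drop_partition_stmt("clicks", {"ds": "2014-08-11"}))
```

With this split, the process itself keeps reading and writing plain HDFS paths, and only the surrounding steps talk to the metastore.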



--
This message was sent by Atlassian JIRA
(v6.2#6252)
