falcon-dev mailing list archives

From "Satish Mittal (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FALCON-310) Allow existing processes to work out-of-box when existing HDFS feeds are configured in HCatalog
Date Fri, 07 Feb 2014 08:23:19 GMT
Satish Mittal created FALCON-310:
------------------------------------

             Summary: Allow existing processes to work out-of-box when existing HDFS feeds are configured in HCatalog
                 Key: FALCON-310
                 URL: https://issues.apache.org/jira/browse/FALCON-310
             Project: Falcon
          Issue Type: Improvement
            Reporter: Satish Mittal


After HCatalog integration, one can configure new Falcon feeds based on HCatalog tables and
then write processes that read/write those HCat-based feeds. However, the expectation is that
these processes will be implemented using HCatalog interfaces (HCatInputFormat/HCatOutputFormat
in the case of M/R jobs, or HCatLoader/HCatStorer in the case of Pig scripts). This is easy
for new processes.

However, there may be existing processes running in production that are based on HDFS-based
feeds and may never get rewritten against the HCat interfaces. For such processes, one might
simply want to configure HCatalog tables around their HDFS feeds and provide a way for the
existing processes to continue running as if they were still working with HDFS feeds.
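For illustration, such a feed might be declared with a catalog-backed storage element rather than HDFS locations. The following sketch uses Falcon's feed entity format; the table name, cluster name, and partition template are hypothetical:

```xml
<!-- Hypothetical feed: an existing HDFS feed re-declared as an HCatalog
     table partitioned by date. All names here are illustrative. -->
<feed name="clicks" xmlns="uri:falcon:feed:0.1">
    <frequency>days(1)</frequency>
    <clusters>
        <cluster name="primary-cluster" type="source">
            <validity start="2014-02-01T00:00Z" end="2099-12-31T00:00Z"/>
            <retention limit="days(90)" action="delete"/>
        </cluster>
    </clusters>
    <!-- catalog-backed storage instead of an HDFS <locations> element -->
    <table uri="catalog:default:clicks#ds=${YEAR}-${MONTH}-${DAY}"/>
    <ACL owner="falcon" group="users" permission="0755"/>
    <schema location="/none" provider="none"/>
</feed>
```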

Behind the scenes, Falcon should be able to find new partitions to read/write, get their
corresponding locations, populate the corresponding workflow variables, and register/drop
partitions as part of its pre/post-processing steps.
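As a minimal sketch of the pre-processing idea described above: resolve each HCatalog partition an instance needs to its HDFS location, then expose those locations as a workflow variable so an HDFS-oriented process can run unchanged. The function and variable names below are illustrative, not actual Falcon APIs:

```python
# Hypothetical pre-processing step: map HCatalog partitions to HDFS paths
# and build the path variable an HDFS-based process would expect.

def resolve_partition_locations(table_location, partition_specs):
    """Map each partition spec (e.g. {'ds': '2014-02-07'}) to an HDFS path,
    assuming the common Hive layout of key=value subdirectories."""
    locations = []
    for spec in partition_specs:
        subdirs = "/".join("%s=%s" % (k, v) for k, v in sorted(spec.items()))
        locations.append("%s/%s" % (table_location, subdirs))
    return locations

def populate_workflow_vars(feed_name, locations):
    """Return a dict of workflow variables; the variable name is illustrative."""
    return {"falcon_%s_paths" % feed_name: ",".join(locations)}
```

Usage, assuming a table rooted at `hdfs://nn/data/clicks`:

```python
locs = resolve_partition_locations("hdfs://nn/data/clicks",
                                   [{"ds": "2014-02-07"}])
vars = populate_workflow_vars("clicks", locs)
```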



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
