falcon-dev mailing list archives

From "Sowmya Ramesh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-634) Add recipes in Falcon
Date Wed, 14 Jan 2015 03:26:34 GMT

    https://issues.apache.org/jira/browse/FALCON-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276441#comment-14276441

Sowmya Ramesh commented on FALCON-634:

[~sriksun]: Thanks for the suggestions.

I have a few queries regarding recipe creation and deployment.

*Recipe deployment*
"List of recipes can be maintained in the server in a shared location (preferably HDFS) and
the prism server can be configured to point to the correct recipe repository folder."

About storing recipes on HDFS: this might not be possible if Falcon is running in a setup
where Hadoop is not installed, in which case the recipes can't be deployed on HDFS. I agree
that recipes should be maintained in a shared location local to the machine where the Falcon
server is running, and not on HDFS.

Recipe deployment is a one-time job. If recipes are packaged as part of Falcon, then they can
be deployed as part of the Falcon install. Let me know if there is a better solution.

Currently, if the workflow and user libs are on the local FS, RecipeTool copies them onto
HDFS. If the workflow is stored on the server, and the client and server are running on
different machines, then this mandates that the user copy the WF and user libs to HDFS,
since the recipe runs on the client.

For every recipe, the user has to pass the path of the properties file after updating it with
the relevant values, or copy it to the location on the server where the recipe is deployed.
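As an illustration, such a recipe properties file might look like the following. All key
names here are hypothetical, chosen for illustration only, and are not Falcon's actual
recipe property keys:

```properties
# Hypothetical recipe properties file -- key names are illustrative only
recipe.name=hdfs-dr-replication
recipe.frequency=days(1)
sourceCluster=primaryCluster
targetCluster=backupCluster
sourceDir=/apps/data/input
targetDir=/apps/data/replica
```

The user would fill in these values and pass the file's path on the command line when the
recipe is cooked on the client.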

Please let me know your thoughts.

> Add recipes in Falcon
> ---------------------
>                 Key: FALCON-634
>                 URL: https://issues.apache.org/jira/browse/FALCON-634
>             Project: Falcon
>          Issue Type: Improvement
>    Affects Versions: 0.6
>            Reporter: Venkatesh Seetharam
>              Labels: recipes
> Falcon offers many services OOTB and caters to a wide array of use cases. However, there
have been many asks that do not fit the functionality offered by Falcon. I'm proposing that
we add recipes to Falcon, similar to recipes in Whirr and other management solutions
such as Puppet and Chef.
> Overview:
> A recipe is essentially a static process template with a parameterized workflow to realize
a specific use case. For example:
> * replicating directories from one HDFS cluster to another (not timed partitions)
> * replicating hive metadata (database, table, views, etc.)
> * replicating between HDFS and Hive - either way
> * anonymization of data based on schema
> * data masking
> * etc.
> Proposal:
> Falcon provides a Process abstraction that encapsulates the configuration 
> for a user workflow with scheduling controls. All recipes can be modeled 
> as a Process within Falcon which executes the user workflow 
> periodically. The process and its associated workflow are parameterized. The user will
provide a properties file with name-value pairs that are substituted by Falcon before scheduling.
> This is a client side concept. The server does not know about a recipe but only accepts
the cooked recipe as a process entity. 
> The CLI would look something like this:
> falcon -recipe $recipe_name -properties $properties_file
> Recipes will reside inside addons (contrib) with source code and will have an option
to package them.
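The cooking step described above (client-side substitution of property values into the
process template before submission) can be sketched as follows. The `##name##` placeholder
convention, the `cook_recipe` helper, and the template fragment are illustrative
assumptions, not Falcon's actual recipe format:

```python
import re

def cook_recipe(template: str, props: dict) -> str:
    """Substitute ##name## placeholders in a recipe template with values
    from a properties dict. The placeholder syntax is an assumption made
    for illustration, not Falcon's actual convention."""
    def lookup(match):
        key = match.group(1)
        if key not in props:
            raise KeyError("missing property: " + key)
        return props[key]
    return re.sub(r"##([\w.]+)##", lookup, template)

# Hypothetical process template fragment and properties:
template = ("<process name='##recipe.name##'>"
            "<frequency>##recipe.frequency##</frequency></process>")
props = {"recipe.name": "hdfs-replication", "recipe.frequency": "days(1)"}
print(cook_recipe(template, props))
# -> <process name='hdfs-replication'><frequency>days(1)</frequency></process>
```

The cooked output is an ordinary process entity, which matches the point above that the
server only ever sees a process, never the recipe itself.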

This message was sent by Atlassian JIRA
