falcon-dev mailing list archives

From "Srikanth Sundarrajan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-634) Add recipes in Falcon
Date Fri, 12 Sep 2014 06:53:34 GMT

    [ https://issues.apache.org/jira/browse/FALCON-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131186#comment-14131186 ]

Srikanth Sundarrajan commented on FALCON-634:

The objective of a recipe is to solve a standard data management function. To this end, we
should allow a recipe to have a RecipeBuilder that generates one or more processes, plus
any intermediate feeds needed to implement the functionality. A standard vanilla recipe
builder may choose to use a template and substitute parameters into it to build the
process definitions. Each recipe is then a directory archive (the directory name could trivially
be the name of the recipe) containing the RecipeBuilder implementation and all required support
libraries. Support libraries fall into two categories: those required to
build the recipe and those required at run time to realize the recipe's objective.
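
To make the "vanilla" builder concrete, here is a minimal Python sketch of template substitution. It is only an illustration of the idea, not a proposed RecipeBuilder API: the template text, the placeholder names and the `build_process` function are all hypothetical, and the XML is not taken from the Falcon entity schema.

```python
from string import Template

# Hypothetical process template. The element and attribute names are
# illustrative only and are not taken from the Falcon entity schema.
PROCESS_TEMPLATE = Template(
    '<process name="$name">\n'
    '  <frequency>$frequency</frequency>\n'
    '  <workflow path="$workflow_path"/>\n'
    '</process>\n'
)


def build_process(params):
    """Substitute the given parameters into the template.

    Template.substitute raises KeyError when a placeholder has no value,
    so an incomplete parameter set fails at build time.
    """
    return PROCESS_TEMPLATE.substitute(params)


definition = build_process({
    "name": "dr-replication",
    "frequency": "hours(1)",
    "workflow_path": "/apps/recipes/dr-replication/workflow.xml",
})
print(definition)
```

A builder that needs more than substitution (for example, generating intermediate feeds) would replace `build_process` with its own logic while keeping the same input and output shape.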

+*Recipe deployment*+
The list of recipes can be maintained on the server in a shared location (preferably HDFS), and
the Prism server can be configured to point to the correct recipe repository folder.

+*Recipe creation*+
The author of a recipe can build the recipe archive and drop it in the location that Prism
points to; the recipe should then be available for use immediately. A recipe may be structured
as follows:
RecipeRoot =====> Prism server is pointing to this
|-- Recipe1
    |-- README
    |-- META
    |-- libs
        |-- build
        |-- runtime
    |-- resources
        |-- build
        |-- runtime
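
As a rough illustration of how an author might check a recipe directory against this layout before dropping it into the repository, the Python sketch below reports which expected entries are absent. The layout is only suggested above, so treating every entry as required is an assumption, and `missing_entries` is a hypothetical helper that works on a local directory rather than HDFS.

```python
import os
import tempfile

# Entries expected under each recipe directory, per the layout above.
EXPECTED_FILES = ["README", "META"]
EXPECTED_DIRS = [
    os.path.join("libs", "build"),
    os.path.join("libs", "runtime"),
    os.path.join("resources", "build"),
    os.path.join("resources", "runtime"),
]


def missing_entries(recipe_dir):
    """Return the expected files and directories absent from recipe_dir."""
    missing = [name for name in EXPECTED_FILES
               if not os.path.isfile(os.path.join(recipe_dir, name))]
    missing += [name for name in EXPECTED_DIRS
                if not os.path.isdir(os.path.join(recipe_dir, name))]
    return missing


# Demonstrate against a throwaway directory that has only part of the layout.
recipe_dir = os.path.join(tempfile.mkdtemp(), "Recipe1")
os.makedirs(os.path.join(recipe_dir, "libs", "build"))
os.makedirs(os.path.join(recipe_dir, "libs", "runtime"))
with open(os.path.join(recipe_dir, "README"), "w") as handle:
    handle.write("placeholder\n")

gaps = missing_entries(recipe_dir)
print(gaps)
```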

+*Recipe Listing*+
A GET method may be added to the Prism server on a new Jersey resource to list recipes and
their corresponding root locations in the recipe repository. A corresponding CLI command would be provided:

falcon recipe -list

+*Recipe Description*+
A GET method may be added to the Prism server to echo the README as documentation for
users. The README may contain a brief description of the functionality offered by the recipe and
any important operability notes.
falcon recipe -name dr-replication -describe
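
The listing and description operations could be backed by logic along the following lines. This is a Python sketch under stated assumptions, not the proposed Jersey resource: a local directory stands in for the HDFS repository, `list_recipes` and `describe_recipe` are hypothetical names, and the README text is a placeholder.

```python
import os
import tempfile


def list_recipes(recipe_root):
    """Map each recipe name to its root location under recipe_root."""
    return {
        name: os.path.join(recipe_root, name)
        for name in sorted(os.listdir(recipe_root))
        if os.path.isdir(os.path.join(recipe_root, name))
    }


def describe_recipe(recipe_root, name):
    """Return the README text of the named recipe."""
    with open(os.path.join(recipe_root, name, "README")) as handle:
        return handle.read()


# Build a throwaway repository with one recipe to exercise both functions.
recipe_root = tempfile.mkdtemp()
os.makedirs(os.path.join(recipe_root, "dr-replication"))
with open(os.path.join(recipe_root, "dr-replication", "README"), "w") as handle:
    handle.write("Placeholder README for dr-replication.\n")

recipes = list_recipes(recipe_root)
readme = describe_recipe(recipe_root, "dr-replication")
print(recipes)
print(readme)
```

Because each recipe is a directory named after the recipe, listing reduces to enumerating the subdirectories of the repository root.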

+*Cooking a recipe*+
falcon recipe -name dr-replication -prepare[AndSchedule] <<location>> <<Variable
arguments as accepted by the RecipeBuilder, which is documented in the respective README>>
This should essentially download all the build and runtime libraries hosted at the path provided
by the listing API, pass the arguments to the recipe builder, and build the Falcon process
and feed artefacts in the output location, or optionally schedule them in the live system.
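
The prepare step can be sketched in Python as follows. Downloading the libraries and scheduling are deliberately left out; the sketch covers only passing arguments to a builder and writing the generated artefacts to the output location. `template_builder`, `prepare`, the file naming and the XML are hypothetical and stand in for whatever a real RecipeBuilder would produce.

```python
import os
import tempfile


def template_builder(args):
    """Hypothetical builder: returns {file name: entity definition}."""
    name = args["name"]
    return {name + "-process.xml": '<process name="%s"/>\n' % name}


def prepare(builder, args, output_dir):
    """Run the builder and write each generated artefact to output_dir."""
    os.makedirs(output_dir, exist_ok=True)
    written = []
    for file_name, definition in sorted(builder(args).items()):
        path = os.path.join(output_dir, file_name)
        with open(path, "w") as handle:
            handle.write(definition)
        written.append(path)
    return written


# Cook a recipe into a throwaway output location.
output_dir = os.path.join(tempfile.mkdtemp(), "cooked")
artefacts = prepare(template_builder, {"name": "dr-replication"}, output_dir)
print(artefacts)
```

Returning the written paths keeps the prepare step separate from scheduling: a prepareAndSchedule variant could submit the same files to the live system afterwards.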

> Add recipes in Falcon
> ---------------------
>                 Key: FALCON-634
>                 URL: https://issues.apache.org/jira/browse/FALCON-634
>             Project: Falcon
>          Issue Type: Improvement
>    Affects Versions: 0.6
>            Reporter: Venkatesh Seetharam
>              Labels: recipes
> Falcon offers many services OOTB and caters to a wide array of use cases. However, there
> have been many asks that do not fit the functionality offered by Falcon. I'm proposing that
> we add recipes to Falcon, similar to recipes in Whirr and other management solutions
> such as Puppet and Chef.
> Overview:
> A recipe is essentially a static process template with a parameterized workflow to realize
> a specific use case. For example:
> * replicating directories from one HDFS cluster to another (not timed partitions)
> * replicating hive metadata (database, table, views, etc.)
> * replicating between HDFS and Hive - either way
> * anonymization of data based on schema
> * data masking
> * etc.
> Proposal:
> Falcon provides a Process abstraction that encapsulates the configuration
> for a user workflow with scheduling controls. All recipes can be modeled
> as a Process within Falcon, which executes the user workflow
> periodically. The process and its associated workflow are parameterized. The user will
> provide a properties file with name-value pairs that Falcon substitutes before scheduling.
> This is a client-side concept. The server does not know about a recipe and only accepts
> the cooked recipe as a process entity.
> The CLI would look something like this:
> falcon -recipe $recipe_name -properties $properties_file
> Recipes will reside inside addons (contrib) with source code and will have an option
> to package 'em.
