falcon-commits mailing list archives

From sowmya...@apache.org
Subject falcon git commit: FALCON-1106 Documentation for extensions
Date Tue, 03 May 2016 21:45:46 GMT
Repository: falcon
Updated Branches:
  refs/heads/master fc34d42cb -> 85345ad7e


FALCON-1106 Documentation for extensions

Author: Sowmya Ramesh <sramesh@hortonworks.com>

Reviewers: Balu Vellanki <balu@apache.org>, Ying Zheng <yzheng@hortonworks.com>

Closes #120 from sowmyaramesh/FALCON-1106


Project: http://git-wip-us.apache.org/repos/asf/falcon/repo
Commit: http://git-wip-us.apache.org/repos/asf/falcon/commit/85345ad7
Tree: http://git-wip-us.apache.org/repos/asf/falcon/tree/85345ad7
Diff: http://git-wip-us.apache.org/repos/asf/falcon/diff/85345ad7

Branch: refs/heads/master
Commit: 85345ad7e7421fbd25829381f27eb5b165d2f8d0
Parents: fc34d42
Author: Sowmya Ramesh <sowmya_kr@apache.org>
Authored: Tue May 3 14:45:40 2016 -0700
Committer: Sowmya Ramesh <sramesh@hortonworks.com>
Committed: Tue May 3 14:45:40 2016 -0700

----------------------------------------------------------------------
 addons/extensions/hdfs-mirroring/README         |  11 +-
 addons/extensions/hive-mirroring/README         |  43 +----
 addons/hivedr/README                            |  16 +-
 docs/src/site/twiki/EntitySpecification.twiki   |   8 +-
 docs/src/site/twiki/Extensions.twiki            |  55 +++++++
 docs/src/site/twiki/FalconDocumentation.twiki   |   6 +-
 docs/src/site/twiki/HDFSDR.twiki                |  34 ----
 docs/src/site/twiki/HDFSMirroring.twiki         |  27 ++++
 docs/src/site/twiki/HiveDR.twiki                |  80 ----------
 docs/src/site/twiki/HiveMirroring.twiki         |  63 ++++++++
 docs/src/site/twiki/Recipes.twiki               |  85 ----------
 .../site/twiki/falconcli/DefineExtension.twiki  |   8 +
 .../twiki/falconcli/DescribeExtension.twiki     |   8 +
 .../twiki/falconcli/EnumerateExtension.twiki    |   8 +
 docs/src/site/twiki/falconcli/FalconCLI.twiki   |  16 +-
 .../twiki/restapi/ExtensionDefinition.twiki     | 160 +++++++++++++++++++
 .../twiki/restapi/ExtensionDescription.twiki    |  24 +++
 .../twiki/restapi/ExtensionEnumeration.twiki    |  38 +++++
 docs/src/site/twiki/restapi/ResourceList.twiki  |  14 +-
 19 files changed, 442 insertions(+), 262 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/addons/extensions/hdfs-mirroring/README
----------------------------------------------------------------------
diff --git a/addons/extensions/hdfs-mirroring/README b/addons/extensions/hdfs-mirroring/README
index 78f1726..24c2bd4 100644
--- a/addons/extensions/hdfs-mirroring/README
+++ b/addons/extensions/hdfs-mirroring/README
@@ -14,16 +14,17 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-HDFS Directory Replication Extension
+HDFS Mirroring Extension
 
 Overview
-This extension implements replicating arbitrary directories on HDFS from one
-Hadoop cluster to another Hadoop cluster.
-This piggy backs on replication solution in Falcon which uses the DistCp tool.
+Falcon supports the HDFS mirroring extension to replicate data from a source cluster to a destination cluster.
+This extension replicates arbitrary directories on HDFS and piggybacks on the replication solution in Falcon, which uses the DistCp tool.
+It also allows users to replicate data from on-premise to cloud, either Azure WASB or S3.
+
 
 Use Case
 * Copy directories between HDFS clusters without dated partitions
 * Archive directories from HDFS to Cloud. Ex: S3, Azure WASB
 
 Limitations
-As the data volume and number of files grow, this can get inefficient.
+* As the data volume and number of files grow, this can get inefficient.

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/addons/extensions/hive-mirroring/README
----------------------------------------------------------------------
diff --git a/addons/extensions/hive-mirroring/README b/addons/extensions/hive-mirroring/README
index 827f7e5..04637c0 100644
--- a/addons/extensions/hive-mirroring/README
+++ b/addons/extensions/hive-mirroring/README
@@ -14,45 +14,18 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-Hive Metastore Disaster Recovery Recipe
+Hive Mirroring Extension
 
 Overview
-This extension implements replicating hive metadata and data from one
-Hadoop cluster to another Hadoop cluster.
-This piggy backs on replication solution in Falcon which uses the DistCp tool.
+Falcon provides a feature to replicate Hive metadata and data events from a source cluster to a destination cluster.
+This is supported for both secure and unsecure clusters through Falcon extensions. Falcon uses the event-based replication capability provided by Hive to implement the Hive mirroring feature.
+Falcon acts as an admin/user-facing tool with fine control over what and how to replicate, as defined by its users, while leaving the delta, data and metadata management to Hive itself.
+The Hive mirroring extension piggybacks on the DistCp tool for replication.
 
 Use Case
-*
-*
+* Replicate data/metadata of Hive DB & table from source to target cluster
 
 Limitations
-*
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-Hive Metastore Disaster Recovery Extension
+* Currently Hive doesn't support replication events for create database, roles, views, offline tables, direct HDFS writes without registering with the metastore, and database/table name mapping.
+Hence the Hive mirroring extension cannot be used to replicate the above mentioned events between warehouses.
 
-Overview
-This extension implements replicating hive metadata and data from one
-Hadoop cluster to another Hadoop cluster.
-This piggy backs on replication solution in Falcon which uses the DistCp tool.
-
-Use Case
-*
-*
-
-Limitations
-*

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/addons/hivedr/README
----------------------------------------------------------------------
diff --git a/addons/hivedr/README b/addons/hivedr/README
index 0b448d3..161ed1b 100644
--- a/addons/hivedr/README
+++ b/addons/hivedr/README
@@ -14,20 +14,19 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-Hive Disaster Recovery
+Hive Mirroring
 =======================
 
 Overview
 ---------
 
-Falcon provides feature to replicate Hive metadata and data events from one hadoop cluster
-to another cluster. This is supported for secure and unsecure cluster through Falcon Recipes.
+Falcon provides a feature to replicate Hive metadata and data events from a source cluster to a destination cluster. This is supported for both secure and unsecure clusters through Falcon extensions.
 
 
 Prerequisites
 -------------
 
-Following is the prerequisites to use Hive DR
+Following are the prerequisites to use Hive mirroring
 
 * Hive 1.2.0+
 * Oozie 4.2.0+
@@ -69,12 +68,9 @@ a. Perform initial bootstrap of Table and Database from one Hadoop cluster
to an
 b. Setup cluster definition
    $FALCON_HOME/bin/falcon entity -submit -type cluster -file /cluster/definition.xml
 
-c. Submit Hive DR recipe
-   $FALCON_HOME/bin/falcon recipe -name hive-disaster-recovery -operation HIVE_DISASTER_RECOVERY
+c. Submit Hive mirroring extension
+   $FALCON_HOME/bin/falcon extension -submitAndSchedule -extensionName hive-mirroring -file /process/definition.xml
 
+   Please refer to the Falcon CLI and REST API twiki in the Falcon documentation for more details on usage of the CLI and REST APIs for extension job and instance management.
 
-Recipe templates for Hive DR is available in addons/recipe/hive-disaster-recovery and copy
it to
-recipe path specified in client.properties.
 
-*Note:* If kerberos security is enabled on cluster, use the secure templates for Hive DR
from
-        addons/recipe/hive-disaster-recovery
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/EntitySpecification.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/EntitySpecification.twiki b/docs/src/site/twiki/EntitySpecification.twiki
index 7eedf87..b27e341 100644
--- a/docs/src/site/twiki/EntitySpecification.twiki
+++ b/docs/src/site/twiki/EntitySpecification.twiki
@@ -922,10 +922,10 @@ The workflow is re-tried after 10 mins, 20 mins and 30 mins. With exponential
ba
 
 To enable retries for instances for feeds, user will have to set the following properties
in runtime.properties
 <verbatim>
-falcon.recipe.retry.policy=periodic
-falcon.recipe.retry.delay=minutes(30)
-falcon.recipe.retry.attempts=3
-falcon.recipe.retry.onTimeout=false
+falcon.retry.policy=periodic
+falcon.retry.delay=minutes(30)
+falcon.retry.attempts=3
+falcon.retry.onTimeout=false
 </verbatim>
 ---+++ Late data
 Late data handling defines how the late data should be handled. Each feed is defined with
a late cut-off value which specifies the time till which late data is valid. For example,
late cut-off of hours(6) means that data for nth hour can get delayed by upto 6 hours. Late
data specification in process defines how this late data is handled.

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/Extensions.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/Extensions.twiki b/docs/src/site/twiki/Extensions.twiki
new file mode 100644
index 0000000..6b4bf11
--- /dev/null
+++ b/docs/src/site/twiki/Extensions.twiki
@@ -0,0 +1,55 @@
+---+ Falcon Extensions
+
+---++ Overview
+
+A Falcon extension is a static process template with a parameterized workflow to realize a specific use case and enable non-programmers to capture and re-use very complex business logic. Extensions are defined in server space. The objective of an extension is to solve a standard data management function that can be invoked as a tool using the standard Falcon features (REST API, CLI and UI access).
+
+For example:
+
+   * Replicating directories from one HDFS cluster to another (not timed partitions)
+   * Replicating hive metadata (database, table, views, etc.)
+   * Replicating between HDFS and Hive - either way
+   * Data masking etc.
+
+---++ Proposal
+
+Falcon provides a Process abstraction that encapsulates the configuration for a user workflow with scheduling controls. All extensions can be modeled as a Process and its dependent feeds within Falcon, which executes the user workflow periodically. The process and its associated workflow are parameterized. The user provides properties, which are <name, value> pairs that are substituted by Falcon before scheduling. Falcon translates these extensions into a process entity by replacing the parameters in the workflow definition.
+
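The `<name, value>` substitution described above can be sketched in Python. This is an illustrative sketch only, not Falcon's actual implementation; it uses the `##param##` token pattern documented for the recipe/extension templates elsewhere in this change:

```python
import re

# Illustrative sketch (not Falcon's actual code) of how a templated
# extension definition is materialized into a process entity: each
# ##param## token is replaced with the matching user-supplied property.
def substitute(template, properties):
    def replace(match):
        key = match.group(1)
        if key not in properties:
            raise KeyError("missing required property: " + key)
        return properties[key]
    return re.sub(r"##([A-Za-z0-9_.]*)##", replace, template)

print(substitute('<workflow name="##workflow.name##">',
                 {"workflow.name": "hdfs-dr-workflow"}))
# <workflow name="hdfs-dr-workflow">
```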
+---++ Falcon extension artifacts to manage extensions
+
+Extension artifacts are published in addons/extensions. Artifacts are expected to be installed on HDFS at the "extension.store.uri" path defined in startup properties. Each extension is expected to have the below artifacts:
+   * a json file under the META directory that lists all the required and optional parameters/arguments for scheduling an extension job
+   * a process entity template to be scheduled, under the resources directory
+   * a parameterized workflow under the resources directory
+   * required libs under the libs directory
+   * a README describing the functionality achieved by the extension
+
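Putting the artifact list above together with the META path documented later in this change, the expected on-HDFS layout for, e.g., the hdfs-mirroring extension looks roughly like this (a sketch; directory names inferred from this document):

```
<extension.store.uri>/hdfs-mirroring/
    META/hdfs-mirroring-properties.json    (required and optional job parameters)
    resources/                             (process entity template and parameterized workflow)
    libs/                                  (required libraries)
    README                                 (description of the extension's functionality)
```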
+REST API and CLI support has been added for extension artifact management on HDFS. Please refer to [[falconcli/FalconCLI][Falcon CLI]] and [[restapi/ResourceList][REST API]] for more details.
+
+---++ CLI and REST API support
+REST API and CLI support has been added to manage extension jobs and instances.
+
+Please refer to [[falconcli/FalconCLI][Falcon CLI]] and [[restapi/ResourceList][REST API]] for more details on usage of the CLI and REST APIs for extension job and instance management.
+
+---++ Metrics
+The HDFS mirroring and Hive mirroring extensions capture replication metrics like TIMETAKEN, BYTESCOPIED and COPY (number of files copied) for an instance and populate them to the GraphDB.
+
+---++ Sample extensions
+
+Sample extensions are published in addons/extensions
+
+---++ Types of extensions
+   * [[HDFSMirroring][HDFS mirroring extension]]
+   * [[HiveMirroring][Hive mirroring extension]]
+
+---++ Packaging and installation
+
+Extension artifacts in addons/extensions are packaged in the Falcon war under the extensions directory. For manual installation, the user is expected to install the extension artifacts under extensions in the Falcon war to HDFS at the "extension.store.uri" path defined in startup properties and then restart Falcon.
+
+---++ Migration
+The recipes framework and HDFS mirroring capability were added in the Apache Falcon 0.6.0 release as client side logic. With the 0.10 release they have moved to the server side and been renamed server side extensions. Client side recipes only had CLI support and expected certain pre-steps to get them working. This is no longer required in the 0.10 release, as new CLI and REST API support has been provided.
+
+If a user is migrating to the 0.10 release or above, the old recipe setup and CLIs won't work. For manual installation the user is expected to copy the extension artifacts to HDFS. Please refer to the "Packaging and installation" section above for more details.
+Please refer to [[falconcli/FalconCLI][Falcon CLI]] and [[restapi/ResourceList][REST API]] for more details on usage of the CLI and REST APIs for extension job and instance management.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/FalconDocumentation.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/FalconDocumentation.twiki b/docs/src/site/twiki/FalconDocumentation.twiki
index 2d67070..89370ec 100644
--- a/docs/src/site/twiki/FalconDocumentation.twiki
+++ b/docs/src/site/twiki/FalconDocumentation.twiki
@@ -13,7 +13,7 @@
    * <a href="#Falcon_EL_Expressions">Falcon EL Expressions</a>
    * <a href="#Lineage">Lineage</a>
    * <a href="#Security">Security</a>
-   * <a href="#Recipes">Recipes</a>
+   * <a href="#Extensions">Extensions</a>
    * <a href="#Monitoring">Monitoring</a>
    * <a href="#Email_Notification">Email Notification</a>
    * <a href="#Backwards_Compatibility">Backwards Compatibility Instructions</a>
@@ -738,9 +738,9 @@ lifecycle policies such as replication and retention.
 
 Security is detailed in [[Security][Security]].
 
----++ Recipes
+---++ Extensions
 
-Recipes is detailed in [[Recipes][Recipes]].
+Extensions are detailed in [[Extensions][Extensions]].
 
 ---++ Monitoring
 

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/HDFSDR.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/HDFSDR.twiki b/docs/src/site/twiki/HDFSDR.twiki
deleted file mode 100644
index 1c1e3f5..0000000
--- a/docs/src/site/twiki/HDFSDR.twiki
+++ /dev/null
@@ -1,34 +0,0 @@
----+ HDFS DR Recipe
----++ Overview
-Falcon supports HDFS DR recipe to replicate data from source cluster to destination cluster.
-
----++ Usage
----+++ Setup cluster definition.
-   <verbatim>
-    $FALCON_HOME/bin/falcon entity -submit -type cluster -file /cluster/definition.xml
-   </verbatim>
-
----+++ Update recipes properties
-   Copy HDFS replication recipe properties, workflow and template file from $FALCON_HOME/data-mirroring/hdfs-replication
to the accessible
-   directory path or to the recipe directory path (*falcon.recipe.path=<recipe directory
path>*). *"falcon.recipe.path"* must be specified
-   in Falcon conf client.properties. Now update the copied recipe properties file with required
attributes to replicate data from source cluster to
-   destination cluster for HDFS DR.
-
----+++ Submit HDFS DR recipe
-
-   After updating the recipe properties file with required attributes in directory path or
in falcon.recipe.path,
-   there are two ways of submitting the HDFS DR recipe:
-
-   * 1. Specify Falcon recipe properties file through recipe command line.
-   <verbatim>
-    $FALCON_HOME/bin/falcon recipe -name hdfs-replication -operation HDFS_REPLICATION
-    -properties /cluster/hdfs-replication.properties
-   </verbatim>
-
-   * 2. Use Falcon recipe path specified in Falcon conf client.properties .
-   <verbatim>
-    $FALCON_HOME/bin/falcon recipe -name hdfs-replication -operation HDFS_REPLICATION
-   </verbatim>
-
-
-*Note:* Recipe properties file, workflow file and template file name must match to the recipe
name, it must be unique and in the same directory.

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/HDFSMirroring.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/HDFSMirroring.twiki b/docs/src/site/twiki/HDFSMirroring.twiki
new file mode 100644
index 0000000..a810947
--- /dev/null
+++ b/docs/src/site/twiki/HDFSMirroring.twiki
@@ -0,0 +1,27 @@
+---+ HDFS Mirroring Extension
+---++ Overview
+Falcon supports the HDFS mirroring extension to replicate data from a source cluster to a destination cluster. This extension replicates arbitrary directories on HDFS and piggybacks on the replication solution in Falcon, which uses the DistCp tool. It also allows users to replicate data from on-premise to cloud, either Azure WASB or S3.
+
+---++ Use Case
+* Copy directories between HDFS clusters without dated partitions
+* Archive directories from HDFS to Cloud. Ex: S3, Azure WASB
+
+---++ Limitations
+As the data volume and number of files grow, this can get inefficient.
+
+---++ Usage
+---+++ Setup source and destination clusters
+   <verbatim>
+    $FALCON_HOME/bin/falcon entity -submit -type cluster -file /cluster/definition.xml
+   </verbatim>
+
+---+++ HDFS mirroring extension properties
+   Extension artifacts are expected to be installed on HDFS at the path specified by "extension.store.uri" in startup properties. The hdfs-mirroring-properties.json file located at "<extension.store.uri>/hdfs-mirroring/META/hdfs-mirroring-properties.json" lists all the required and optional parameters/arguments for scheduling an HDFS mirroring job.
+
+---+++ Submit and schedule HDFS mirroring extension
+
+   <verbatim>
+    $FALCON_HOME/bin/falcon extension -submitAndSchedule -extensionName hdfs-mirroring -file /process/definition.xml
+   </verbatim>
+
+   Please refer to [[falconcli/FalconCLI][Falcon CLI]] and [[restapi/ResourceList][REST API]] for more details on usage of the CLI and REST APIs.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/HiveDR.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/HiveDR.twiki b/docs/src/site/twiki/HiveDR.twiki
deleted file mode 100644
index cf35694..0000000
--- a/docs/src/site/twiki/HiveDR.twiki
+++ /dev/null
@@ -1,80 +0,0 @@
----+Hive Disaster Recovery
-
-
----++Overview
-Falcon provides feature to replicate Hive metadata and data events from source cluster
-to destination cluster. This is supported for secure and unsecure cluster through Falcon
Recipes.
-
-
----++Prerequisites
-Following is the prerequisites to use Hive DR
-
-   * *Hive 1.2.0+*
-   * *Oozie 4.2.0+*
-
-*Note:* Set following properties in hive-site.xml for replicating the Hive events on source
and destination Hive cluster:
-<verbatim>
-    <property>
-        <name>hive.metastore.event.listeners</name>
-        <value>org.apache.hive.hcatalog.listener.DbNotificationListener</value>
-        <description>event listeners that are notified of any metastore changes</description>
-    </property>
-
-    <property>
-        <name>hive.metastore.dml.events</name>
-        <value>true</value>
-    </property>
-</verbatim>
-
----++ Usage
----+++ Bootstrap
-   Perform initial bootstrap of Table and Database from source cluster to destination cluster
-   * *Database Bootstrap*
-     For bootstrapping DB replication, first destination DB should be created. This step
is expected,
-     since DB replication definitions can be set up by users only on pre-existing DB’s.
Second, Export all tables in
-     the source db and Import it in the destination db, as described in Table bootstrap.
-
-   * *Table Bootstrap*
-     For bootstrapping table replication, essentially after having turned on the !DbNotificationListener
-     on the source db, perform an Export of the table, distcp the Export over to the destination
-     warehouse and do an Import over there. Check the following [[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ImportExport][Hive
Export-Import]] for syntax details
-     and examples.
-     This will set up the destination table so that the events on the source cluster that
modify the table
-     will then be replicated.
-
----+++ Setup cluster definition
-   <verbatim>
-    $FALCON_HOME/bin/falcon entity -submit -type cluster -file /cluster/definition.xml
-   </verbatim>
-
----+++ Update recipes properties
-   Copy Hive DR recipe properties, workflow and template file from $FALCON_HOME/data-mirroring/hive-disaster-recovery
to the accessible
-   directory path or to the recipe directory path (*falcon.recipe.path=<recipe directory
path>*). *"falcon.recipe.path"* must be specified
-   in Falcon conf client.properties. Now update the copied recipe properties file with required
attributes to replicate metadata and data from source cluster to
-   destination cluster for Hive DR.
-
-   * *Note : HiveDR on TDE encrypted clusters*
-   When submitting HiveDR recipe in a kerberos secured setup, it is possible that the source
and target staging directories
-   are encrypted using Transparent Data Encryption (TDE). If your cluster dirs are TDE encrypted,
please set
-   "tdeEncryptionEnabled=true" in the recipe properties file. Default value for this property
is "false".
-
----+++ Submit Hive DR recipe
-   After updating the recipe properties file with required attributes in directory path or
in falcon.recipe.path,
-   there are two ways of submitting the Hive DR recipe:
-
-   * 1. Specify Falcon recipe properties file through recipe command line.
-   <verbatim>
-       $FALCON_HOME/bin/falcon recipe -name hive-disaster-recovery -operation HIVE_DISASTER_RECOVERY
-       -properties /cluster/hive-disaster-recovery.properties
-   </verbatim>
-
-   * 2. Use Falcon recipe path specified in Falcon conf client.properties .
-   <verbatim>
-       $FALCON_HOME/bin/falcon recipe -name hive-disaster-recovery -operation HIVE_DISASTER_RECOVERY
-   </verbatim>
-
-
-*Note:*
-   * Recipe properties file, workflow file and template file name must match to the recipe
name, it must be unique and in the same directory.
-   * If kerberos security is enabled on cluster, use the secure templates for Hive DR from
$FALCON_HOME/data-mirroring/hive-disaster-recovery .
-

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/HiveMirroring.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/HiveMirroring.twiki b/docs/src/site/twiki/HiveMirroring.twiki
new file mode 100644
index 0000000..e28502a
--- /dev/null
+++ b/docs/src/site/twiki/HiveMirroring.twiki
@@ -0,0 +1,63 @@
+---+Hive Mirroring
+
+---++Overview
+Falcon provides a feature to replicate Hive metadata and data events from a source cluster to a destination cluster. This is supported for both secure and unsecure clusters through Falcon extensions.
+
+---++Prerequisites
+Following are the prerequisites to use Hive mirroring
+
+   * *Hive 1.2.0+*
+   * *Oozie 4.2.0+*
+
+*Note:* Set the following properties in hive-site.xml for replicating Hive events on the source and destination Hive clusters:
+<verbatim>
+    <property>
+        <name>hive.metastore.event.listeners</name>
+        <value>org.apache.hive.hcatalog.listener.DbNotificationListener</value>
+        <description>event listeners that are notified of any metastore changes</description>
+    </property>
+
+    <property>
+        <name>hive.metastore.dml.events</name>
+        <value>true</value>
+    </property>
+</verbatim>
+
+---++ Use Case
+* Replicate data/metadata of Hive DB & table from source to target cluster
+
+---++ Limitations
+* Currently Hive doesn't support replication events for create database, roles, views, offline tables, direct HDFS writes without registering with the metastore, and database/table name mapping.
+Hence the Hive mirroring extension cannot be used to replicate the above mentioned events between warehouses.
+
+---++ Usage
+---+++ Bootstrap
+   Perform initial bootstrap of Table and Database from source cluster to destination cluster
+   * *Database Bootstrap*
+     For bootstrapping DB replication, the destination DB should first be created. This step is expected, since DB replication definitions can be set up by users only on pre-existing DBs. Second, export all tables in the source db and import them in the destination db, as described in Table bootstrap.
+
+   * *Table Bootstrap*
+     For bootstrapping table replication, essentially after having turned on the !DbNotificationListener on the source db, perform an Export of the table, distcp the Export over to the destination warehouse and do an Import over there. Check the following [[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ImportExport][Hive Export-Import]] for syntax details and examples.
+     This will set up the destination table so that the events on the source cluster that modify the table will then be replicated.
+
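The bootstrap steps above amount to a Hive export on the source, a DistCp copy, and an import on the destination. A sketch with placeholder cluster addresses, database and table names (see the linked Hive Export-Import manual for authoritative syntax):

```
# Placeholder names throughout; run the export against the source
# metastore and the import against the destination metastore.
hive -e "EXPORT TABLE sales.orders TO '/staging/orders_export';"

# Copy the exported data and metadata to the destination warehouse.
hadoop distcp hdfs://source-nn:8020/staging/orders_export \
              hdfs://dest-nn:8020/staging/orders_export

# Import on the destination cluster; subsequent events on the source
# table will then be replicated.
hive -e "IMPORT TABLE sales.orders FROM '/staging/orders_export';"
```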
+---+++  Setup source and destination clusters
+   <verbatim>
+    $FALCON_HOME/bin/falcon entity -submit -type cluster -file /cluster/definition.xml
+   </verbatim>
+
+---+++ Hive mirroring extension properties
+   Extension artifacts are expected to be installed on HDFS at the path specified by "extension.store.uri" in startup properties. The hive-mirroring-properties.json file located at "<extension.store.uri>/hive-mirroring/META/hive-mirroring-properties.json" lists all the required and optional parameters/arguments for scheduling a Hive mirroring job.
+
+---+++ Submit and schedule Hive mirroring extension
+
+   <verbatim>
+    $FALCON_HOME/bin/falcon extension -submitAndSchedule -extensionName hive-mirroring -file /process/definition.xml
+   </verbatim>
+
+   Please refer to [[falconcli/FalconCLI][Falcon CLI]] and [[restapi/ResourceList][REST API]] for more details on usage of the CLI and REST APIs.
+

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/Recipes.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/Recipes.twiki b/docs/src/site/twiki/Recipes.twiki
deleted file mode 100644
index b5faa1e..0000000
--- a/docs/src/site/twiki/Recipes.twiki
+++ /dev/null
@@ -1,85 +0,0 @@
----+ Falcon Recipes
-
----++ Overview
-
-A Falcon recipe is a static process template with parameterized workflow to realize a specific
use case. Recipes are
-defined in user space. Recipes will not have support for update or lifecycle management.
-
-For example:
-
-   * Replicating directories from one HDFS cluster to another (not timed partitions)
-   * Replicating hive metadata (database, table, views, etc.)
-   * Replicating between HDFS and Hive - either way
-   * Data masking etc.
-
----++ Proposal
-
-Falcon provides a Process abstraction that encapsulates the configuration for a user workflow
with scheduling
-controls. All recipes can be modeled as a Process with in Falcon which executes the user
workflow periodically. The
-process and its associated workflow are parameterized. The user will provide a properties
file with name value pairs
-that are substituted by falcon before scheduling it. Falcon translates these recipes as a
process entity by
-replacing the parameters in the workflow definition.
-
----++ Falcon CLI recipe support
-
-Falcon CLI functionality to support recipes has been added.
-[[falconcli/FalconCLI][Falcon CLI]] Recipe command usage is defined here.
-
-CLI accepts recipe option with a recipe name and optional tool and does the following:
-   * Validates the options; name option is mandatory and tool is optional and should be provided
if user wants to override the base recipe tool
-   * Looks for <name>-workflow.xml, <name>-template.xml and <name>.properties
file in the path specified by falcon.recipe.path in client.properties. If files cannot be
found then Falcon CLI will fail
-   * Invokes a Tool to substitute the properties in the templated process for the recipe.
By default invokes base tool if tool option is not passed. Tool is responsible for generating
process entity at the path specified by FalconCLI
-   * Validates the generated entity
-   * Submit and schedule this entity
-   * Generated process entity files are stored in tmp directory
-
----++ Base Recipe tool
-
-Falcon provides a base tool that recipes can override. Base Recipe tool does the following:
-   * Expects recipe template file path, recipe properties file path and path where process
entity to be submitted should be generated. Validates these arguments
-   * Validates the artifacts i.e. workflow and/or lib files specified in the recipe template
exists on local filesystem or HDFS at the specified path else returns error
-   * Copies if the artifacts exists on local filesystem
-      * If workflow is on local FS then falcon.recipe.workflow.path in recipe property file
is mandatory for it to be copied to HDFS. If templated process requires custom libs falcon.recipe.workflow.lib.path
property is mandatory for them to be copied from Local FS to HDFS. Recipe tool will copy the
local artifacts only if these properties are set in properties file
-   * Looks for the patten ##[A-Za-z0-9_.]*## in the templated process and substitutes it
with the properties. Process entity generated after the substitution is written to the empty
file passed by FalconCLI
-
----++ Recipe template file format
-
-   * Any templatized string should be in the format ##[A-Za-z0-9_.]*##.
-   * There should be a corresponding entry in the recipe properties file "falcon.recipe.<templatized-string>
= <value to be substituted>"
-
-<verbatim>
-Example: If the entry in recipe template is <workflow name="##workflow.name##"> there
should be a corresponding entry in the recipe properties file falcon.recipe.workflow.name=hdfs-dr-workflow
-</verbatim>
-
----++ Recipe properties file format
-
-   * Regular key value pair properties file
-   * Property key should be prefixed by "falcon.recipe."
-
-<verbatim>
-Example: falcon.recipe.workflow.name=hdfs-dr-workflow
-Recipe template will have <workflow name="##workflow.name##">. Recipe tool will look
for the patten ##workflow.name##
-and replace it with the property value "hdfs-dr-workflow". Substituted template will have
<workflow name="hdfs-dr-workflow">
-</verbatim>
-
----++ Metrics
-HDFS DR and Hive DR recipes will capture replication metrics like TIMETAKEN, BYTESCOPIED, COPY (number of files copied) for an
-instance and populate them to the GraphDB.
-
----++ Managing the scheduled recipe process
-   * Scheduled recipe process is similar to regular process
-      * List : falcon entity -type process -name <recipe-process-name> -list
-      * Status : falcon entity -type process -name <recipe-process-name> -status
-      * Delete : falcon entity -type process -name <recipe-process-name> -delete
-
----++ Sample recipes
-
-   * Sample recipes are published in addons/recipes
-
----++ Types of recipes
-   * [[HDFSDR][HDFS Recipe]]
-   * [[HiveDR][HiveDR Recipe]]
-
----++ Packaging
-
-   * There is no packaging for recipes at this time, but it will be added soon.

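As an illustrative aside (not part of this commit or the actual Recipe tool code), the ##...## substitution the recipe tool performs can be sketched in a few lines; the template string and property name below are taken from the example above:

```python
import re

# Pattern the recipe tool looks for, per the docs: ##[A-Za-z0-9_.]*##
PATTERN = re.compile(r"##([A-Za-z0-9_.]*)##")

def substitute(template, properties):
    """Replace each ##key## with the value of falcon.recipe.<key>."""
    def repl(match):
        key = "falcon.recipe." + match.group(1)
        if key not in properties:
            raise KeyError("Missing recipe property: " + key)
        return properties[key]
    return PATTERN.sub(repl, template)

template = '<workflow name="##workflow.name##">'
props = {"falcon.recipe.workflow.name": "hdfs-dr-workflow"}
print(substitute(template, props))  # <workflow name="hdfs-dr-workflow">
```

A missing property raises an error here, mirroring the validation step the docs describe before the substituted process entity is written out.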
http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/falconcli/DefineExtension.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/falconcli/DefineExtension.twiki b/docs/src/site/twiki/falconcli/DefineExtension.twiki
new file mode 100644
index 0000000..c260911
--- /dev/null
+++ b/docs/src/site/twiki/falconcli/DefineExtension.twiki
@@ -0,0 +1,8 @@
+---+++Definition
+
+[[CommonCLI][Common CLI Options]]
+
+Definition of an extension. Outputs a JSON document describing the extension invocation parameters.
+
+Usage:
+$FALCON_HOME/bin/falcon extension -definition -name <extension-name>

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/falconcli/DescribeExtension.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/falconcli/DescribeExtension.twiki b/docs/src/site/twiki/falconcli/DescribeExtension.twiki
new file mode 100644
index 0000000..9f9895e
--- /dev/null
+++ b/docs/src/site/twiki/falconcli/DescribeExtension.twiki
@@ -0,0 +1,8 @@
+---+++Describe
+
+[[CommonCLI][Common CLI Options]]
+
+Description of an extension. Outputs the README of the specified extension.
+
+Usage:
+$FALCON_HOME/bin/falcon extension -describe -name <extension-name>

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/falconcli/EnumerateExtension.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/falconcli/EnumerateExtension.twiki b/docs/src/site/twiki/falconcli/EnumerateExtension.twiki
new file mode 100644
index 0000000..0b28630
--- /dev/null
+++ b/docs/src/site/twiki/falconcli/EnumerateExtension.twiki
@@ -0,0 +1,8 @@
+---+++Enumerate
+
+[[CommonCLI][Common CLI Options]]
+
+List all the extensions supported. Returns the total number of results and a list of the server-side extensions supported.
+
+Usage:
+$FALCON_HOME/bin/falcon extension -enumerate
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/falconcli/FalconCLI.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/falconcli/FalconCLI.twiki b/docs/src/site/twiki/falconcli/FalconCLI.twiki
index 2290569..dedd40c 100644
--- a/docs/src/site/twiki/falconcli/FalconCLI.twiki
+++ b/docs/src/site/twiki/falconcli/FalconCLI.twiki
@@ -11,7 +11,8 @@ CLI options are classified into :
    * <a href="#Instance_Management_Commands">Instance Management Commands</a>
    * <a href="#Metadata_Commands">Metadata Commands</a>
    * <a href="#Admin_Commands">Admin commands</a>
-   * <a href="#Recipe_Commands">Recipe commands</a>
+   * <a href="#Extension_Artifacts_Commands">Extension artifacts commands</a>
+   * <a href="#Extension_Commands">Extension commands</a>
 
 
 
@@ -104,10 +105,19 @@ $FALCON_HOME/bin/falcon entity -submit -type cluster -file /cluster/definition.x
 
 -----------
 
----++Recipe Commands
+---++Extension artifacts management Commands
 
| *Command*                                      | *Description*                                   |
-|[[SubmitRecipe][Submit]]                        | Submit the specified Recipe                    |
+|[[EnumerateExtension][Enumerate]]               | Return all the extensions supported            |
+|[[DescribeExtension][Describe]]                 | Return description of an extension             |
+|[[DefineExtension][Definition]]                 | Return the definition of an extension          |
+
+-----------
+
+---++Extension Commands
+
+| *Command*                                      | *Description*                                   |
+|[[SubmitExtension][Submit]]                     | Submit the specified extension                  |
 
 
 

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/restapi/ExtensionDefinition.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/restapi/ExtensionDefinition.twiki b/docs/src/site/twiki/restapi/ExtensionDefinition.twiki
new file mode 100644
index 0000000..66f6674
--- /dev/null
+++ b/docs/src/site/twiki/restapi/ExtensionDefinition.twiki
@@ -0,0 +1,160 @@
+---++  GET api/extension/definition/:extension-name
+   * <a href="#Description">Description</a>
+   * <a href="#Parameters">Parameters</a>
+   * <a href="#Results">Results</a>
+   * <a href="#Examples">Examples</a>
+
+---++ Description
+Get definition of the extension.
+
+---++ Parameters
+   * :extension-name Name of the extension.
+
+---++ Results
+Outputs a JSON document describing the extension invocation parameters.
+
+---++ Examples
+---+++ Rest Call
+<verbatim>
+GET http://localhost:15000/api/extension/definition/hdfs-mirroring
+</verbatim>
+---+++ Result
+<verbatim>
+{
+    "shortDescription": "This extension implements replicating arbitrary directories on HDFS from one Hadoop cluster to another Hadoop cluster. This piggybacks on the replication solution in Falcon which uses the DistCp tool.",
+    "properties":[
+        {
+            "propertyName":"jobName",
+            "required":true,
+            "description":"Unique job name",
+            "example":"hdfs-monthly-sales-dr"
+        },
+        {
+            "propertyName":"jobClusterName",
+            "required":true,
+            "description":"Cluster where job should run",
+            "example":"backupCluster"
+        },
+        {
+            "propertyName":"jobValidityStart",
+            "required":true,
+            "description":"Job validity start time",
+            "example":"2016-03-03T00:00Z"
+        },
+        {
+            "propertyName":"jobValidityEnd",
+            "required":true,
+            "description":"Job validity end time",
+            "example":"2018-03-13T00:00Z"
+        },
+        {
+            "propertyName":"jobFrequency",
+            "required":true,
+            "description":"Job frequency. Valid frequency types are minutes, hours, days, months",
+            "example":"months(1)"
+        },
+        {
+            "propertyName":"jobTimezone",
+            "required":false,
+            "description":"Time zone for the job",
+            "example":"GMT"
+        },
+        {
+            "propertyName":"jobTags",
+            "required":false,
+            "description":"List of tags as comma-separated key=value pairs",
+            "example":"consumer=consumer@xyz.com, owner=producer@xyz.com, _department_type=forecasting"
+        },
+        {
+            "propertyName":"jobRetryPolicy",
+            "required":false,
+            "description":"Job retry policy",
+            "example":"periodic"
+        },
+        {
+            "propertyName":"jobRetryDelay",
+            "required":false,
+            "description":"Job retry delay",
+            "example":"minutes(30)"
+        },
+        {
+            "propertyName":"jobRetryAttempts",
+            "required":false,
+            "description":"Job retry attempts",
+            "example":"3"
+        },
+        {
+            "propertyName":"jobRetryOnTimeout",
+            "required":false,
+            "description":"Job retry on timeout",
+            "example":"true"
+        },
+        {
+            "propertyName":"jobAclOwner",
+            "required":false,
+            "description":"ACL owner",
+            "example":"ambari-qa"
+        },
+        {
+            "propertyName":"jobAclGroup",
+            "required":false,
+            "description":"ACL group",
+            "example":"users"
+        },
+        {
+            "propertyName":"jobAclPermission",
+            "required":false,
+            "description":"ACL permission",
+            "example":"0x755"
+        },
+        {
+            "propertyName":"sourceDir",
+            "required":true,
+            "description":"Multiple comma-separated HDFS source directories",
+            "example":"/user/ambari-qa/primaryCluster/dr/input1, /user/ambari-qa/primaryCluster/dr/input2"
+        },
+        {
+            "propertyName":"sourceCluster",
+            "required":true,
+            "description":"Source cluster for HDFS mirroring",
+            "example":"primaryCluster"
+        },
+        {
+            "propertyName":"targetDir",
+            "required":true,
+            "description":"Target HDFS directory",
+            "example":"/user/ambari-qa/backupCluster/dr"
+        },
+        {
+            "propertyName":"targetCluster",
+            "required":true,
+            "description":"Target cluster for HDFS mirroring",
+            "example":"backupCluster"
+        },
+        {
+            "propertyName":"distcpMaxMaps",
+            "required":false,
+            "description":"Maximum number of mappers for DistCp",
+            "example":"1"
+        },
+        {
+            "propertyName":"distcpMapBandwidth",
+            "required":false,
+            "description":"Bandwidth in MB for each mapper in DistCp",
+            "example":"100"
+        },
+        {
+            "propertyName":"jobNotificationType",
+            "required":false,
+            "description":"Email Notification for Falcon instance completion",
+            "example":"email"
+        },
+        {
+            "propertyName":"jobNotificationReceivers",
+            "required":false,
+            "description":"Comma-separated email IDs",
+            "example":"user1@gmail.com, user2@gmail.com"
+        }
+    ]
+}
+</verbatim>

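As a client-side illustration (not part of this commit), the definition JSON returned above can be parsed to discover which invocation parameters are mandatory before submitting an extension job; the abbreviated sample below reuses property names from the response shown:

```python
import json

# Abbreviated sample of the definition JSON documented above
definition = json.loads("""
{
    "shortDescription": "HDFS mirroring extension",
    "properties": [
        {"propertyName": "jobName", "required": true,
         "description": "Unique job name", "example": "hdfs-monthly-sales-dr"},
        {"propertyName": "jobTimezone", "required": false,
         "description": "Time zone for the job", "example": "GMT"},
        {"propertyName": "sourceCluster", "required": true,
         "description": "Source cluster for HDFS mirroring", "example": "primaryCluster"}
    ]
}
""")

# Collect the properties a caller must supply
required = [p["propertyName"] for p in definition["properties"] if p["required"]]
print(required)  # ['jobName', 'sourceCluster']
```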
http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/restapi/ExtensionDescription.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/restapi/ExtensionDescription.twiki b/docs/src/site/twiki/restapi/ExtensionDescription.twiki
new file mode 100644
index 0000000..5900fbb
--- /dev/null
+++ b/docs/src/site/twiki/restapi/ExtensionDescription.twiki
@@ -0,0 +1,24 @@
+---++  GET api/extension/describe/:extension-name
+   * <a href="#Description">Description</a>
+   * <a href="#Parameters">Parameters</a>
+   * <a href="#Results">Results</a>
+   * <a href="#Examples">Examples</a>
+
+---++ Description
+Description of an extension.
+
+---++ Parameters
+   * :extension-name Name of the extension.
+
+---++ Results
+Outputs the README of the specified extension.
+
+---++ Examples
+---+++ Rest Call
+<verbatim>
+GET http://localhost:15000/api/extension/describe/hdfs-mirroring
+</verbatim>
+---+++ Result
+<verbatim>
+<README file of the specified extension>
+</verbatim>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/restapi/ExtensionEnumeration.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/restapi/ExtensionEnumeration.twiki b/docs/src/site/twiki/restapi/ExtensionEnumeration.twiki
new file mode 100644
index 0000000..abd94c8
--- /dev/null
+++ b/docs/src/site/twiki/restapi/ExtensionEnumeration.twiki
@@ -0,0 +1,38 @@
+---++  GET api/extension/enumerate
+   * <a href="#Description">Description</a>
+   * <a href="#Parameters">Parameters</a>
+   * <a href="#Results">Results</a>
+   * <a href="#Examples">Examples</a>
+
+---++ Description
+Get the list of supported extensions.
+
+---++ Parameters
+None
+
+---++ Results
+Total number of results and a list of the server-side extensions supported.
+
+---++ Examples
+---+++ Rest Call
+<verbatim>
+GET http://localhost:15000/api/extension/enumerate
+</verbatim>
+---+++ Result
+<verbatim>
+{
+    "totalResults":"2",
+    "extensions": [
+        {
+            "name": "Hdfs-mirroring",
+            "type": "Trusted/Provided extension",
+            "description": "This extension implements replicating arbitrary directories on HDFS from one Hadoop cluster to another Hadoop cluster."
+        },
+        {
+            "name": "Hive-mirroring",
+            "type": "Trusted/Provided extension",
+            "description": "This extension implements replicating hive metadata and data from one Hadoop cluster to another Hadoop cluster."
+        }
+    ]
+}
+</verbatim>

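Again purely as an illustration (not part of this commit), a client could parse the enumerate response above and sanity-check it against totalResults; the sample payload mirrors the documented result shape:

```python
import json

# Sample enumerate response in the shape documented above
response = json.loads("""
{
    "totalResults": "2",
    "extensions": [
        {"name": "Hdfs-mirroring", "type": "Trusted/Provided extension",
         "description": "Replicates arbitrary HDFS directories between clusters."},
        {"name": "Hive-mirroring", "type": "Trusted/Provided extension",
         "description": "Replicates Hive metadata and data between clusters."}
    ]
}
""")

names = [ext["name"] for ext in response["extensions"]]
# totalResults is a string in the documented payload, so convert before comparing
assert int(response["totalResults"]) == len(names)
print(names)  # ['Hdfs-mirroring', 'Hive-mirroring']
```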
http://git-wip-us.apache.org/repos/asf/falcon/blob/85345ad7/docs/src/site/twiki/restapi/ResourceList.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/restapi/ResourceList.twiki b/docs/src/site/twiki/restapi/ResourceList.twiki
index 34c2c6f..f703843 100644
--- a/docs/src/site/twiki/restapi/ResourceList.twiki
+++ b/docs/src/site/twiki/restapi/ResourceList.twiki
@@ -6,6 +6,7 @@
    * <a href="#REST_Call_on_Admin_Resource">REST Call on Admin Resource</a>
    * <a href="#REST_Call_on_Lineage_Graph">REST Call on Lineage Graph Resource</a>
    * <a href="#REST_Call_on_Metadata_Resource">REST Call on Metadata Resource</a>
+   * <a href="#REST_Call_on_Extension_Artifact">REST Call on Extension Artifact</a>
 
 ---++ Authentication
 
@@ -88,6 +89,13 @@ The current version of the rest api's documentation is also hosted on the Falcon
 
 ---++ REST Call on Metadata Discovery Resource
 
-| *Call Type* | *Resource*                                                                                     | *Description*                                                  |
-| GET         | [[MetadataList][api/metadata/discovery/:dimension-type/list]]                                  | list of dimensions  |
-| GET         | [MetadataRelations][api/metadata/discovery/:dimension-type/:dimension-name/relations]]        | Return all relations of a dimension |
+| *Call Type* | *Resource*                                                                                     | *Description*                       |
+| GET         | [[MetadataList][api/metadata/discovery/:dimension-type/list]]                                  | list of dimensions                  |
+| GET         | [[MetadataRelations][api/metadata/discovery/:dimension-type/:dimension-name/relations]]       | Return all relations of a dimension |
+
+---++ REST Call on Extension Artifact
+
+| *Call Type* | *Resource*                                                        | *Description*                                                          |
+| GET         | [[ExtensionEnumeration][api/extension/enumerate]]                 | List all the extensions supported                                      |
+| GET         | [[ExtensionDescription][api/extension/describe/:extension-name]]  | Return the README of the specified extension                           |
+| GET         | [[ExtensionDefinition][api/extension/definition/:extension-name]] | Return a JSON document describing the extension invocation parameters  |


Mime
View raw message