falcon-commits mailing list archives

From: venkat...@apache.org
Subject: git commit: FALCON-311 Several dead links in Falcon documentation. Contributed by Suresh Srinivas
Date: Thu, 20 Feb 2014 22:37:30 GMT
Repository: incubator-falcon
Updated Branches:
  refs/heads/master 027efd592 -> 1af2e92fd


FALCON-311 Several dead links in Falcon documentation. Contributed by Suresh Srinivas


Project: http://git-wip-us.apache.org/repos/asf/incubator-falcon/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-falcon/commit/1af2e92f
Tree: http://git-wip-us.apache.org/repos/asf/incubator-falcon/tree/1af2e92f
Diff: http://git-wip-us.apache.org/repos/asf/incubator-falcon/diff/1af2e92f

Branch: refs/heads/master
Commit: 1af2e92fda1859d0360c65db7e14e4d8247d826f
Parents: 027efd5
Author: Venkatesh Seetharam <venkatesh@hortonworks.com>
Authored: Thu Feb 20 14:37:01 2014 -0800
Committer: Venkatesh Seetharam <venkatesh@hortonworks.com>
Committed: Thu Feb 20 14:37:27 2014 -0800

----------------------------------------------------------------------
 CHANGES.txt                                   |  3 +++
 docs/src/site/twiki/EntitySpecification.twiki | 14 +++++++-------
 docs/src/site/twiki/FalconDocumentation.twiki |  8 ++++----
 docs/src/site/twiki/HiveIntegration.twiki     |  2 +-
 docs/src/site/twiki/OnBoarding.twiki          |  4 ++--
 5 files changed, 17 insertions(+), 14 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-falcon/blob/1af2e92f/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 67cfede..41882b1 100755
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -17,6 +17,9 @@ Trunk (Unreleased)
     FALCON-238 Support updates at specific time. (Shwetha GS)
    
   IMPROVEMENTS
+    FALCON-311 Several dead links in Falcon documentation.
+    (Suresh Srinivas via Venkatesh Seetharam)
+
     FALCON-304 Simplify assembly for script in standalone and distributed
     mode. (Suresh Srinivas via Venkatesh Seetharam)
 

http://git-wip-us.apache.org/repos/asf/incubator-falcon/blob/1af2e92f/docs/src/site/twiki/EntitySpecification.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/EntitySpecification.twiki b/docs/src/site/twiki/EntitySpecification.twiki
index f2cac4a..3a9820e 100644
--- a/docs/src/site/twiki/EntitySpecification.twiki
+++ b/docs/src/site/twiki/EntitySpecification.twiki
@@ -35,7 +35,7 @@ using the same write interface.
 <interface type="execute" endpoint="localhost:8021" version="0.20.2" />
 </verbatim>
 An execute interface specifies the interface for job tracker, it's endpoint is the value of mapred.job.tracker.
-Falcon uses this interface to submit the processes as jobs on JobTracker defined here.
+Falcon uses this interface to submit the processes as jobs on !JobTracker defined here.
 
 <verbatim>
 <interface type="workflow" endpoint="http://localhost:11000/oozie/" version="3.1" />
@@ -176,7 +176,7 @@ Examples:
 </verbatim>
 A feed can define multiple partitions, if a referenced cluster defines partitions then the number of partitions in feed has to be equal to or more than the cluster partitions.
 
-*Note:* This will only apply for FileSystem storage but not Table storage as partitions are defined and maintained in
+*Note:* This will only apply for !FileSystem storage but not Table storage as partitions are defined and maintained in
 Hive (Hcatalog) registry.
 
 ---+++ Groups
@@ -215,7 +215,7 @@ A late-arrival specifies the cut-off period till which the feed is expected to a
 The cut-off period is specified by expression frequency(times), ex: if the feed can arrive late
 upto 8 hours then late-arrival's cut-off="hours(8)"
 
-*Note:* This will only apply for FileSystem storage but not Table storage until a future time.
+*Note:* This will only apply for !FileSystem storage but not Table storage until a future time.
 
 ---++++ Custom Properties
 
@@ -471,7 +471,7 @@ Example:
 </process>
 </verbatim>
 
-*Note:* This is only supported for FileSystem storage but not Table storage at this point.
+*Note:* This is only supported for !FileSystem storage but not Table storage at this point.
 
 
 ---++++ Outputs
@@ -594,8 +594,8 @@ There are 2 engines supported today.
 ---+++++ Oozie
 
 As part of oozie workflow engine support, users can embed a oozie workflow.
-Refer to oozie [[http://incubator.apache.org/oozie/overview.html][workflow overview]] and
-[[http://incubator.apache.org/oozie/docs/3.1.3/docs/WorkflowFunctionalSpec.html][workflow specification]] for details.
+Refer to oozie [[http://oozie.apache.org/docs/3.1.3-incubating/DG_Overview.html][workflow overview]] and
+[[http://oozie.apache.org/docs/3.1.3-incubating/WorkflowFunctionalSpec.html][workflow specification]] for details.
 
 Syntax:
 <verbatim>
@@ -720,4 +720,4 @@ Example:
 </verbatim>
 This late handling specifies that late data detection should run at feed's late cut-off which is 6 hours in this case. If there is late data, Falcon should run the workflow specified at /projects/bootcamp/workflow/lateinput1/workflow.xml
 
-*Note:* This is only supported for FileSystem storage but not Table storage at this point.
+*Note:* This is only supported for !FileSystem storage but not Table storage at this point.
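The notes above rely on cut-off expressions of the form frequency(times), for example cut-off="hours(8)" and the 6-hour cut-off in the late-handling example. A minimal, illustrative sketch of mapping such an expression to a concrete duration; the class and method names are hypothetical and this is not Falcon's actual parser:

<verbatim>
import java.time.Duration;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: maps a cut-off expression such as "hours(8)" to a Duration.
public final class CutOffExpression {
    private static final Pattern EXPR = Pattern.compile("(minutes|hours|days|months)\\((\\d+)\\)");

    static Duration toDuration(String expression) {
        Matcher m = EXPR.matcher(expression.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported cut-off expression: " + expression);
        }
        long n = Long.parseLong(m.group(2));
        switch (m.group(1)) {
            case "minutes": return Duration.ofMinutes(n);
            case "hours":   return Duration.ofHours(n);
            case "days":    return Duration.ofDays(n);
            default:        return Duration.ofDays(30 * n); // months approximated for illustration only
        }
    }

    public static void main(String[] args) {
        System.out.println(toDuration("hours(8)")); // prints PT8H, the window in which late data is accepted
    }
}
</verbatim>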

http://git-wip-us.apache.org/repos/asf/incubator-falcon/blob/1af2e92f/docs/src/site/twiki/FalconDocumentation.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/FalconDocumentation.twiki b/docs/src/site/twiki/FalconDocumentation.twiki
index 2b4b3fd..4603709 100644
--- a/docs/src/site/twiki/FalconDocumentation.twiki
+++ b/docs/src/site/twiki/FalconDocumentation.twiki
@@ -351,7 +351,7 @@ for a partition when the partition is complete at the source.
    * Falcon will use HCatalog (Hive) API to export the data for a given table and the partition,
 which will result in a data collection that includes metadata on the data's storage format, the schema,
 how the data is sorted, what table the data came from, and values of any partition keys from that table.
-   * Falcon will use DistCp tool to copy the exported data collection into the secondary cluster into a staging
+   * Falcon will use discp tool to copy the exported data collection into the secondary cluster into a staging
 directory used by Falcon.
    * Falcon will then import the data into HCatalog (Hive) using the HCatalog (Hive) API. If the specified table does
 not yet exist, Falcon will create it, using the information in the imported metadata to set defaults for the table
@@ -526,7 +526,7 @@ instance. From the perspective of late handling, there are two main configuratio
 and late-inputs section in feed and process entity definition that are central. These configurations govern
 how and when the late processing happens. In the current implementation (oozie based) the late handling is very
 simple and basic. The falcon system looks at all dependent input feeds for a process and computes the max late
-cut-off period. Then it uses a scheduled messaging framework, like the one available in Apache ActiveMQ or Java's DelayQueue to schedule a message with a cut-off period, then after a cut-off period the message is dequeued and Falcon checks for changes in the feed data which is recorded in HDFS in latedata file by falcons "record-size" action, if it detects any changes then the workflow will be rerun with the new set of feed data.
+cut-off period. Then it uses a scheduled messaging framework, like the one available in Apache ActiveMQ or Java's !DelayQueue to schedule a message with a cut-off period, then after a cut-off period the message is dequeued and Falcon checks for changes in the feed data which is recorded in HDFS in latedata file by falcons "record-size" action, if it detects any changes then the workflow will be rerun with the new set of feed data.
 
 *Example:*
 The late rerun policy can be configured in the process definition.
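The late-handling description above mentions scheduling a message with a cut-off period through Apache ActiveMQ or Java's DelayQueue. A minimal DelayQueue sketch of that idea, assuming a 6-hour cut-off and a hypothetical late-data path; it is illustrative rather than Falcon's implementation:

<verbatim>
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Illustrative only: an element that becomes available once the late cut-off elapses.
class LateCheck implements Delayed {
    final String lateDataPath;   // hypothetical path recorded by the "record-size" action
    final long dueAtMillis;

    LateCheck(String lateDataPath, long cutOffMillis) {
        this.lateDataPath = lateDataPath;
        this.dueAtMillis = System.currentTimeMillis() + cutOffMillis;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(dueAtMillis - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.MILLISECONDS), other.getDelay(TimeUnit.MILLISECONDS));
    }
}

public class LateCheckScheduler {
    public static void main(String[] args) throws InterruptedException {
        DelayQueue<LateCheck> queue = new DelayQueue<>();
        queue.put(new LateCheck("/falcon/latedata/sample-instance", TimeUnit.HOURS.toMillis(6)));
        LateCheck check = queue.take(); // blocks until the cut-off period has passed
        // At this point one would re-check the recorded data size and rerun the workflow if it changed.
        System.out.println("Late-data check due for " + check.lateDataPath);
    }
}
</verbatim>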
@@ -607,12 +607,12 @@ Users may register consumers on the required topic to check the availability or
  
 For a given process that is scheduled, the name of the topic is same as the process name.
 Falcon sends a Map message for every feed produced by the instance of a process to the JMS topic.
-The JMS MapMessage sent to a topic has the following properties:
+The JMS !MapMessage sent to a topic has the following properties:
 entityName, feedNames, feedInstancePath, workflowId, runId, nominalTime, timeStamp, brokerUrl, brokerImplClass, entityType, operation, logFile, topicName, status, brokerTTL;
 
 For a given feed that is scheduled, the name of the topic is same as the feed name.
 Falcon sends a map message for every feed instance that is deleted/archived/replicated depending upon the retention policy set in the feed definition.
-The JMS MapMessage sent to a topic has the following properties:
+The JMS !MapMessage sent to a topic has the following properties:
 entityName, feedNames, feedInstancePath, workflowId, runId, nominalTime, timeStamp, brokerUrl, brokerImplClass, entityType, operation, logFile, topicName, status, brokerTTL;
 
 The JMS messages are automatically purged after a certain period (default 3 days) by the Falcon JMS house-keeping service.TTL (Time-to-live) for JMS message
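The two hunks above list the fields carried by the JMS MapMessage on each entity topic. A sketch of a consumer for a process topic, assuming an ActiveMQ broker at tcp://localhost:61616 and a process named SampleProcess (both assumptions), and reading the documented fields as map entries:

<verbatim>
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.MapMessage;
import javax.jms.MessageConsumer;
import javax.jms.Session;
import javax.jms.Topic;
import org.apache.activemq.ActiveMQConnectionFactory;

// Illustrative consumer: broker URL and process name are assumptions, not part of this patch.
public class FalconTopicConsumer {
    public static void main(String[] args) throws JMSException {
        ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        // For a scheduled process, the topic name is the same as the process name.
        Topic topic = session.createTopic("SampleProcess");
        MessageConsumer consumer = session.createConsumer(topic);
        consumer.setMessageListener(message -> {
            try {
                MapMessage map = (MapMessage) message;
                System.out.println(map.getString("entityName") + " -> "
                        + map.getString("feedInstancePath") + " [" + map.getString("status") + "]");
            } catch (JMSException e) {
                e.printStackTrace();
            }
        });
    }
}
</verbatim>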

http://git-wip-us.apache.org/repos/asf/incubator-falcon/blob/1af2e92f/docs/src/site/twiki/HiveIntegration.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/HiveIntegration.twiki b/docs/src/site/twiki/HiveIntegration.twiki
index 0ee571e..774a831 100644
--- a/docs/src/site/twiki/HiveIntegration.twiki
+++ b/docs/src/site/twiki/HiveIntegration.twiki
@@ -72,7 +72,7 @@ HCatalog server. If this is absent, no HCatalog publication will be done from Fa
    * Falcon will use HCatalog (Hive) API to export the data for a given table and the partition,
 which will result in a data collection that includes metadata on the data's storage format, the schema,
 how the data is sorted, what table the data came from, and values of any partition keys from that table.
-   * Falcon will use DistCp tool to copy the exported data collection into the secondary cluster into a staging
+   * Falcon will use discp tool to copy the exported data collection into the secondary cluster into a staging
 directory used by Falcon.
    * Falcon will then import the data into HCatalog (Hive) using the HCatalog (Hive) API. If the specified table does
 not yet exist, Falcon will create it, using the information in the imported metadata to set defaults for the

http://git-wip-us.apache.org/repos/asf/incubator-falcon/blob/1af2e92f/docs/src/site/twiki/OnBoarding.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/OnBoarding.twiki b/docs/src/site/twiki/OnBoarding.twiki
index 4fa5893..fd5bec7 100644
--- a/docs/src/site/twiki/OnBoarding.twiki
+++ b/docs/src/site/twiki/OnBoarding.twiki
@@ -7,7 +7,7 @@
    * Create cluster definition for the cluster, specifying name node, job tracker, workflow engine endpoint, messaging endpoint. Refer to [[EntitySpecification][cluster definition]] for details.
    * Create Feed definitions for each of the input and output specifying frequency, data path, ownership. Refer to [[EntitySpecification][feed definition]] for details.
    * Create Process definition for your job. Process defines configuration for the workflow job. Important attributes are frequency, inputs/outputs and workflow path. Refer to [[EntitySpecification][process definition]] for process details.
-   * Define workflow for your job using the workflow engine(only oozie is supported as of now). Refer [[http://incubator.apache.org/oozie/docs/3.1.3/docs/WorkflowFunctionalSpec.html][Oozie Workflow Specification]]. The libraries required for the workflow should be available in lib folder in workflow path.
+   * Define workflow for your job using the workflow engine(only oozie is supported as of now). Refer [[http://oozie.apache.org/docs/3.1.3-incubating/WorkflowFunctionalSpec.html][Oozie Workflow Specification]]. The libraries required for the workflow should be available in lib folder in workflow path.
    * Set-up workflow definition, libraries and referenced scripts on hadoop. 
    * Submit cluster definition
    * Submit and schedule feed and process definitions
@@ -114,7 +114,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 </verbatim>
 
 ---++++ Process
-Sample process which runs daily at 6th hour on corp cluster. It takes one input - SampleInput for the previous day(24 instances). It generates one output - SampleOutput for previous day. The workflow is defined at /projects/bootcamp/workflow/workflow.xml. Any libraries available for the workflow should be at /projects/bootcamp/workflow/lib. The process also defines properties queueName, ssh.host, and fileTimestamp which are passed to the workflow. In addition, Falcon exposes the following properties to the workflow: nameNode, jobTracker(hadoop properties), input and output(Input/Output properties).
+Sample process which runs daily at 6th hour on corp cluster. It takes one input - !SampleInput for the previous day(24 instances). It generates one output - !SampleOutput for previous day. The workflow is defined at /projects/bootcamp/workflow/workflow.xml. Any libraries available for the workflow should be at /projects/bootcamp/workflow/lib. The process also defines properties queueName, ssh.host, and fileTimestamp which are passed to the workflow. In addition, Falcon exposes the following properties to the workflow: nameNode, jobTracker(hadoop properties), input and output(Input/Output properties).
 
 <verbatim>
 <?xml version="1.0" encoding="UTF-8"?>

