tez-commits mailing list archives

From hit...@apache.org
Subject [2/2] tez git commit: TEZ-1822. Docs for Timeline/ACLs/HistoryText. (hitesh)
Date Mon, 08 Dec 2014 19:17:12 GMT
TEZ-1822. Docs for Timeline/ACLs/HistoryText. (hitesh)


Project: http://git-wip-us.apache.org/repos/asf/tez/repo
Commit: http://git-wip-us.apache.org/repos/asf/tez/commit/b8ddbefc
Tree: http://git-wip-us.apache.org/repos/asf/tez/tree/b8ddbefc
Diff: http://git-wip-us.apache.org/repos/asf/tez/diff/b8ddbefc

Branch: refs/heads/master
Commit: b8ddbefcfc051eb0ef6c2f27a06bdfbb87c220a1
Parents: a5cdae4
Author: Hitesh Shah <hitesh@apache.org>
Authored: Mon Dec 8 11:15:21 2014 -0800
Committer: Hitesh Shah <hitesh@apache.org>
Committed: Mon Dec 8 11:15:21 2014 -0800

----------------------------------------------------------------------
 CHANGES.txt                                 |  1 +
 docs/src/site/markdown/tez_acls.md          | 65 ++++++++++++++++++++++
 docs/src/site/markdown/tez_ui_user_data.md  | 52 ++++++++++++++++++
 docs/src/site/markdown/tez_yarn_timeline.md | 69 ++++++++++++++++++++++++
 docs/src/site/markdown/user_guides.md       | 25 +++++++++
 docs/src/site/site.xml                      |  1 +
 6 files changed, 213 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/tez/blob/b8ddbefc/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 82be4bb..64c4ccd 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -6,6 +6,7 @@ Release 0.6.0: Unreleased
 INCOMPATIBLE CHANGES
 
 ALL CHANGES:
+  TEZ-1822. Docs for Timeline/ACLs/HistoryText.
   TEZ-1252. Change wording on http://tez.apache.org/team-list.html related to member confusion.
   TEZ-1805. Tez client DAG cycle detection should detect self loops
   TEZ-1816. It is possible to receive START event when DAG is failed

http://git-wip-us.apache.org/repos/asf/tez/blob/b8ddbefc/docs/src/site/markdown/tez_acls.md
----------------------------------------------------------------------
diff --git a/docs/src/site/markdown/tez_acls.md b/docs/src/site/markdown/tez_acls.md
new file mode 100644
index 0000000..447d8a8
--- /dev/null
+++ b/docs/src/site/markdown/tez_acls.md
@@ -0,0 +1,65 @@
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<head><title>Access Control in Tez</title></head>
+
+## Background
+
+Access control in Tez can be categorized as follows:
+
+  - Modify permissions on the Tez AM (or Session). Users with this permission can:
+    - Submit a DAG to a Tez Session
+    - Kill any DAG within the given AM/Session
+    - Kill the Session
+  - View permissions on the Tez AM (or Session). Users with this permission can:
+    - Monitor/View the status of the Session
+    - Monitor/View the progress/status of any DAG within the given AM/Session
+  - Modify permissions on a particular Tez DAG. Users with this permission can:
+    - Kill the DAG
+  - View permissions on a particular Tez DAG. Users with this permission can:
+    - Monitor/View the progress/status of the DAG
+
+From the above, it follows that all users/groups with access to operations on the AM
also have access to the corresponding operations on all DAGs within that AM/Session. Also, by default,
the owner of the Tez AM, i.e. the user who started the Tez AM, is considered a super-user
and has access to all operations on the AM as well as all DAGs within the AM/Session.
+
+## How to set up the ACLs
+
+ACLs are enabled by default in Tez. To disable ACLs, set the following configuration
property:
+
+> &lt;property&gt;<br/>
+> &nbsp;&nbsp;&nbsp;&lt;name&gt;tez.am.acls.enabled&lt;/name&gt;<br/>
+> &nbsp;&nbsp;&nbsp;&lt;value&gt;false&lt;/value&gt;<br/>
+> &lt;/property&gt;<br/>
+
+### AM/Session Level ACLs
+
+AM/Session level ACLs are driven by configuration. To set up the ACLs, the following properties
need to be defined:
+
+> &lt;property&gt;<br/>
+> &nbsp;&nbsp;&nbsp;&lt;name&gt;tez.am.view-acls&lt;/name&gt;<br/>
+> &nbsp;&nbsp;&nbsp;&lt;value&gt;&lt;/value&gt;<br/>
+> &lt;/property&gt;<br/>
+> &lt;property&gt;<br/>
+> &nbsp;&nbsp;&nbsp;&lt;name&gt;tez.am.modify-acls&lt;/name&gt;<br/>
+> &nbsp;&nbsp;&nbsp;&lt;value&gt;&lt;/value&gt;<br/>
+> &lt;/property&gt;<br/>
+
+The format of the value is a comma-separated list of users followed by a comma-separated list of
groups, with the two lists separated by a single space, e.g. "user1,user2 group1,group2". To allow all users
to do a given operation, the value "*" can be specified.
+
+### DAG ACLs
+
+In certain scenarios, applications may need DAGs running within a given Session to have different
access permissions. In such cases, the ACLs for each DAG can be specified programmatically
via the DAG API. Look for DAG::setAccessControls in the API docs for the Tez release that
you are using.
+In this scenario, it is important to note that the Session ACLs should be defined with only
super-users specified to ensure that other users do not inadvertently gain access to information
for all DAGs within the given Session.
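
A reader's sketch of the ACL value format described in tez_acls.md above ("user1,user2 group1,group2", or "*" for everyone). This is not Tez code: the class `AclSpec` and its methods are hypothetical names made up for illustration, and treating the first space as the user/group separator is an assumption drawn from the wording of the doc. The AM owner's super-user access is not modelled here.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only; AclSpec is not a Tez class.
final class AclSpec {
    final boolean allowAll;
    final Set<String> users;
    final Set<String> groups;

    private AclSpec(boolean allowAll, Set<String> users, Set<String> groups) {
        this.allowAll = allowAll;
        this.users = users;
        this.groups = groups;
    }

    // Parses "user1,user2 group1,group2"; "*" means everyone is allowed.
    static AclSpec parse(String value) {
        String v = (value == null) ? "" : value;
        if (v.trim().equals("*")) {
            return new AclSpec(true, Set.of(), Set.of());
        }
        // Assumption: the first space separates the user list from the group list.
        int sp = v.indexOf(' ');
        String userPart = (sp < 0) ? v : v.substring(0, sp);
        String groupPart = (sp < 0) ? "" : v.substring(sp + 1);
        return new AclSpec(false, splitList(userPart), splitList(groupPart));
    }

    // Splits a comma-separated list, dropping empty entries.
    private static Set<String> splitList(String csv) {
        Set<String> out = new LinkedHashSet<>();
        for (String item : csv.split(",")) {
            String t = item.trim();
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // True if the user, or any of the user's groups, appears in the ACL.
    boolean permits(String user, Collection<String> userGroups) {
        if (allowAll || users.contains(user)) {
            return true;
        }
        for (String g : userGroups) {
            if (groups.contains(g)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        AclSpec acl = AclSpec.parse("user1,user2 group1,group2");
        System.out.println(acl.permits("user2", Set.of()));
        System.out.println(acl.permits("user9", Set.of("group2")));
        System.out.println(acl.permits("user9", Set.of("other")));
    }
}
```

The same parser would apply to both tez.am.view-acls and tez.am.modify-acls, since the doc gives a single format for both values.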

http://git-wip-us.apache.org/repos/asf/tez/blob/b8ddbefc/docs/src/site/markdown/tez_ui_user_data.md
----------------------------------------------------------------------
diff --git a/docs/src/site/markdown/tez_ui_user_data.md b/docs/src/site/markdown/tez_ui_user_data.md
new file mode 100644
index 0000000..71ffde3
--- /dev/null
+++ b/docs/src/site/markdown/tez_ui_user_data.md
@@ -0,0 +1,52 @@
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<head><title>Embedding Application Specific Data into Tez UI</title></head>
+
+# Embedding Application Specific Data into Tez UI
+
+The Tez UI is built mainly off data stored in [YARN Timeline]. The Tez API currently provides
some minimal support for an application to inject application-specific data into the same
storage layer. If this data follows a well-defined format, it can also be displayed
in the Tez UI.
+
+## Setting DAG-level Information
+
+To set DAG-level information, the API to use is DAG::setDAGInfo. (Please refer to the Javadocs
for more detailed and up-to-date information.)
+
+The DAG::setDAGInfo() API expects a String to be passed to it. This string is recommended
to be a json-encoded value with the following keys recognized by the UI:
+  - "context": The application context in which this DAG is being used. For example, this
could be set to "Hive" or "Pig" if this is being run as part of a Hive or Pig script.
+  - "description": General description of what this DAG is going to do. In the case of Hive,
this could be the SQL query text.
+
+## Setting Information for each Input/Output/Processor
+
+Each Input/Output/Processor specified in the DAG Plan is specified via a TezEntityDescriptor.
Applications specify a user payload that is used to initialize/configure the instance as needed.
From a Tez UI point of view, users are usually keen to understand what "work" the particular
Input/Output/Processor is doing, along with any configuration information on
how the object was initialized/configured. Keeping that in mind, each TezEntityDescriptor
supports an API for application developers to specify this information when creating the DAG
plan. The API to use for this is setHistoryText().
+
+The setHistoryText() API expects a String to be passed to it. This string is recommended
to be a json-encoded value with the following keys recognized by the UI:
+  - "desc" : A simple string describing the object in question. For example, for a particular
Hive Processor, this could be a description of what that particular processor is doing.
+  - "config" : A map of key-value pairs representing the configuration/payload used to initialize
the object in question.
+
+By default, the Inputs/Outputs/Processors that are part of the tez-runtime-library do not
publish their configuration information via the setHistoryText() API. To enable this, the
following property needs to be set to true:
+
+> &lt;property&gt;<br/>
+> &nbsp;&nbsp;&nbsp;&lt;name&gt;tez.runtime.convert.user-payload.to.history-text&lt;/name&gt;<br/>
+> &nbsp;&nbsp;&nbsp;&lt;value&gt;true&lt;/value&gt;<br/>
+> &lt;/property&gt;<br/>
+
+## Use of DAG Info and History Text in the Tez UI
+
+If the data set up in the DAG Info and History Text conforms to the format expected by the
UI, it will be displayed in the Tez UI in an easy-to-consume manner. In cases where this is
not possible, the UI may fall back to either not displaying the data at all or displaying
the string as is in a safe manner.
+
+[YARN Timeline]:./tez_yarn_timeline.html
+
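
A reader's sketch of the two JSON shapes described in tez_ui_user_data.md above: "context" and "description" for DAG::setDAGInfo, and "desc" and "config" for setHistoryText(). The class `TezUiJson` and its methods are hypothetical names, not part of Tez, and the hand-rolled escaping only stands in for whatever JSON library an application already uses. The Tez calls appear only in comments because their exact signatures should be checked against the Javadocs for the release in use.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration only; TezUiJson is not a Tez class.
final class TezUiJson {

    // Quotes and escapes a string as a JSON string literal.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    // JSON with the "context" and "description" keys named in the doc.
    static String dagInfo(String context, String description) {
        return "{\"context\":" + quote(context)
            + ",\"description\":" + quote(description) + "}";
    }

    // JSON with the "desc" and "config" keys named in the doc.
    static String historyText(String desc, Map<String, String> config) {
        StringJoiner entries = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : config.entrySet()) {
            entries.add(quote(e.getKey()) + ":" + quote(e.getValue()));
        }
        return "{\"desc\":" + quote(desc) + ",\"config\":" + entries + "}";
    }

    public static void main(String[] args) {
        // The result would be passed to DAG::setDAGInfo, per the doc.
        System.out.println(dagInfo("Hive", "SELECT count(*) FROM t"));

        // The key and value below are made-up placeholders.
        Map<String, String> config = new LinkedHashMap<>();
        config.put("example.key", "example-value");
        // The result would be passed to setHistoryText() on the descriptor.
        System.out.println(historyText("Scans table t", config));
    }
}
```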

http://git-wip-us.apache.org/repos/asf/tez/blob/b8ddbefc/docs/src/site/markdown/tez_yarn_timeline.md
----------------------------------------------------------------------
diff --git a/docs/src/site/markdown/tez_yarn_timeline.md b/docs/src/site/markdown/tez_yarn_timeline.md
new file mode 100644
index 0000000..0ab4fae
--- /dev/null
+++ b/docs/src/site/markdown/tez_yarn_timeline.md
@@ -0,0 +1,69 @@
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+<head><title>Using YARN Timeline with Tez for History</title></head>
+
+## YARN Timeline Background
+
+Initial support for [YARN Timeline](http://hadoop.apache.org/docs/r2.4.0/hadoop-yarn/hadoop-yarn-site/TimelineServer.html)
was introduced in Apache Hadoop 2.4.0. Support for ACLs in Timeline was introduced in Apache
Hadoop 2.6.0. Support for Timeline was introduced in Tez in 0.5.x (with some experimental
support in 0.4.x).
+
+## How Tez Uses YARN Timeline
+
+Tez uses YARN Timeline as its application history store. Tez stores most of its lifecycle
information in this history store, such as:
+  - DAG information such as:
+    - DAG Plan
+    - DAG Submission, Start and End times
+    - DAG Counters
+    - Final status of the DAG and additional diagnostics
+  - Vertex, Task and Task Attempt Information
+    - Start and End times
+    - Counters
+    - Diagnostics
+
+Using the above information, a user can analyze a Tez DAG while it is running and after it
has completed.
+
+## YARN Timeline and Hadoop Versions
+
+Given that the support for YARN Timeline with full security was only realized in Apache Hadoop
2.6.0, some features may or may not be supported depending on which version of Apache Hadoop
is used.
+
+
+|  | Hadoop 2.2.x, 2.3.x | Hadoop 2.4.x, 2.5.x | Hadoop 2.6.x and higher |
+| ------- | ----- | ----- | ----- |
+| Timeline Support | No | Yes | Yes |
+| Timeline with ACLs Support | No | No | Yes |
+
+## Configuring Tez to use YARN Timeline
+
+By default, Tez writes its history data into a file on HDFS. To use Timeline, add the following
property to your tez-site.xml:
+
+> &lt;property&gt;<br/>
+> &nbsp;&nbsp;&nbsp;&lt;name&gt;tez.history.logging.service.class&lt;/name&gt;<br/>
+> &nbsp;&nbsp;&nbsp;&lt;value&gt;org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService&lt;/value&gt;<br/>
+> &lt;/property&gt;<br/>
+
+For Tez 0.4.x, the above property is not respected. For 0.4.x, please set the following property:
+
+> &lt;property&gt;<br/>
+> &nbsp;&nbsp;&nbsp;&lt;name&gt;tez.yarn.ats.enabled&lt;/name&gt;<br/>
+> &nbsp;&nbsp;&nbsp;&lt;value&gt;true&lt;/value&gt;<br/>
+> &lt;/property&gt;<br/>
+
+When using Tez with Apache Hadoop 2.4.x or 2.5.x, given that these versions are not fully
secure, the following property also needs to be enabled:
+
+> &lt;property&gt;<br/>
+> &nbsp;&nbsp;&nbsp;&lt;name&gt;tez.allow.disabled.timeline-domains&lt;/name&gt;<br/>
+> &nbsp;&nbsp;&nbsp;&lt;value&gt;true&lt;/value&gt;<br/>
+> &lt;/property&gt;<br/>
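
A reader's sketch of how the version rules in tez_yarn_timeline.md above combine into a set of tez-site.xml properties. The property names and values are copied from the doc; the class `TimelineProps`, its method, and the choice to take Hadoop 2.x and Tez 0.x minor versions as plain integers are assumptions made for illustration. Returning nothing for Tez older than 0.4 is also an assumption, since the doc does not cover those releases.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; TimelineProps is not a Tez class.
final class TimelineProps {

    // hadoopMinor: the y in Hadoop 2.y.x; tezMinor: the y in Tez 0.y.x.
    // Returns the properties the doc says to set, or an empty map when the
    // doc's table lists Timeline as unsupported.
    static Map<String, String> forVersions(int hadoopMinor, int tezMinor) {
        Map<String, String> props = new LinkedHashMap<>();
        if (hadoopMinor < 4 || tezMinor < 4) {
            return props; // Hadoop 2.2.x/2.3.x: no Timeline support
        }
        if (tezMinor == 4) {
            // Tez 0.4.x does not respect the logging service class property.
            props.put("tez.yarn.ats.enabled", "true");
        } else {
            props.put("tez.history.logging.service.class",
                "org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService");
        }
        if (hadoopMinor == 4 || hadoopMinor == 5) {
            // Hadoop 2.4.x/2.5.x lack Timeline ACLs, so domains must be optional.
            props.put("tez.allow.disabled.timeline-domains", "true");
        }
        return props;
    }

    public static void main(String[] args) {
        // Example: Tez 0.5.x on Hadoop 2.5.x.
        forVersions(5, 5).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```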

http://git-wip-us.apache.org/repos/asf/tez/blob/b8ddbefc/docs/src/site/markdown/user_guides.md
----------------------------------------------------------------------
diff --git a/docs/src/site/markdown/user_guides.md b/docs/src/site/markdown/user_guides.md
new file mode 100644
index 0000000..959f0e5
--- /dev/null
+++ b/docs/src/site/markdown/user_guides.md
@@ -0,0 +1,25 @@
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<head><title>User Guides for various Tez features</title></head>
+
+# User Guides and Documentation for various Tez features
+
+   - [Using YARN Timeline with Tez for History](./tez_yarn_timeline.html)
+   - [Access Control in Tez](./tez_acls.html)
+   - [Embedding Application Specific Data into Tez UI](./tez_ui_user_data.html)
+

http://git-wip-us.apache.org/repos/asf/tez/blob/b8ddbefc/docs/src/site/site.xml
----------------------------------------------------------------------
diff --git a/docs/src/site/site.xml b/docs/src/site/site.xml
index eba2e05..d1addc7 100644
--- a/docs/src/site/site.xml
+++ b/docs/src/site/site.xml
@@ -100,6 +100,7 @@
     <menu name="Documentation">
       <item name="Install Guide" href="install.html"/>
       <item name="Local Mode" href="localmode.html"/>
+      <item name="User Guides" href="user_guides.html"/>
     </menu>
 
     <menu name="Community">

