airflow-commits mailing list archives

From davy...@apache.org
Subject incubator-airflow git commit: [AIRFLOW-1443] Update Airflow configuration documentation
Date Wed, 09 Aug 2017 21:50:21 GMT
Repository: incubator-airflow
Updated Branches:
  refs/heads/master d9109d645 -> 6825d97b8


[AIRFLOW-1443] Update Airflow configuration documentation

This PR updates the Airflow configuration
documentation to include a recent change that
splits task logs by try number (#2383).

Closes #2467 from AllisonWang/allison--update-doc
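
As a quick illustration (not part of this commit), the new per-try layout means each attempt of a task writes to its own file. Below is a minimal Python sketch of how a path of that shape could be composed; `task_log_path` is a hypothetical helper for illustration, not Airflow's actual implementation:

```
from datetime import datetime


def task_log_path(dag_id, task_id, execution_date, try_number):
    # Compose a path of the form {dag_id}/{task_id}/{execution_date}/{try_number}.log
    return "{}/{}/{}/{}.log".format(dag_id, task_id, execution_date.isoformat(), try_number)


# e.g. example_dag/example_task/2017-08-09T00:00:00/1.log
print(task_log_path("example_dag", "example_task", datetime(2017, 8, 9), 1))
```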


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/6825d97b
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/6825d97b
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/6825d97b

Branch: refs/heads/master
Commit: 6825d97b82a3b235685ea8265380a20eea90c990
Parents: d9109d6
Author: AllisonWang <allisonwang520@gmail.com>
Authored: Wed Aug 9 14:49:54 2017 -0700
Committer: Dan Davydov <dan.davydov@airbnb.com>
Committed: Wed Aug 9 14:49:56 2017 -0700

----------------------------------------------------------------------
 UPDATING.md            | 29 ++++++++++++++++-------------
 docs/configuration.rst | 15 ++++++++-------
 2 files changed, 24 insertions(+), 20 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/6825d97b/UPDATING.md
----------------------------------------------------------------------
diff --git a/UPDATING.md b/UPDATING.md
index a02ff04..3a880ab 100644
--- a/UPDATING.md
+++ b/UPDATING.md
@@ -9,8 +9,11 @@ assists people when migrating to a new version.
   SSH Hook now uses the Paramiko library to create the ssh client connection, instead of the sub-process based ssh command execution used previously (<1.9.0), so this is backward incompatible.
   - update SSHHook constructor
   - use the SSHOperator class in place of SSHExecuteOperator, which is now removed. Refer to test_ssh_operator.py for usage info.
-  - SFTPOperator is added to perform secure file transfer from serverA to serverB. Refer to test_sftp_operator.py for usage info. 
-  - No updates are required if you are using FTPHook; it will continue to work as is. 
+  - SFTPOperator is added to perform secure file transfer from serverA to serverB. Refer to test_sftp_operator.py for usage info.
+  - No updates are required if you are using FTPHook; it will continue to work as is.
+
+### Logging update
+  Logs are now stored in the log folder as ``{dag_id}/{task_id}/{execution_date}/{try_number}.log``.
 
 ### New Features
 
@@ -61,8 +64,8 @@ interfere.
 Please read through these options; defaults have changed since 1.7.1.
 
 #### child_process_log_directory
-In order to increase the robustness of the scheduler, DAGs are now processed in their own process. Therefore each 
-DAG has its own log file for the scheduler. These are placed in `child_process_log_directory` which defaults to 
+In order to increase the robustness of the scheduler, DAGs are now processed in their own process. Therefore each
+DAG has its own log file for the scheduler. These are placed in `child_process_log_directory` which defaults to
 `<AIRFLOW_HOME>/scheduler/latest`. You will need to make sure these log files are removed.
 
 > DAG logs or processor logs ignore any command line settings for log file locations.
@@ -72,7 +75,7 @@ Previously the command line option `num_runs` was used to let the scheduler term
 loops. This is now time bound and defaults to `-1`, which means run continuously. See also num_runs.
 
 #### num_runs
-Previously `num_runs` was used to let the scheduler terminate after a certain number of loops. Now num_runs specifies 
+Previously `num_runs` was used to let the scheduler terminate after a certain number of loops. Now num_runs specifies
 the number of times to try to schedule each DAG file within `run_duration` time. Defaults to `-1`, which means try
 indefinitely. This is only available on the command line.
 
@@ -85,7 +88,7 @@ dags are not being picked up, have a look at this number and decrease it when ne
 
 #### catchup_by_default
 By default the scheduler will fill any missing interval DAG Runs between the last execution date and the current date.
-This setting changes that behavior to only execute the latest interval. This can also be specified per DAG as 
+This setting changes that behavior to only execute the latest interval. This can also be specified per DAG as
 `catchup = False / True`. Command line backfills will still work.
 
 ### Faulty Dags do not show an error in the Web UI
@@ -109,33 +112,33 @@ convenience variables to the config. In case you run a secure Hadoop setup it m
 required to whitelist these variables by adding the following to your configuration:
 
 ```
-<property> 
+<property>
      <name>hive.security.authorization.sqlstd.confwhitelist.append</name>
      <value>airflow\.ctx\..*</value>
 </property>
 ```
 ### Google Cloud Operator and Hook alignment
 
-All Google Cloud Operators and Hooks are aligned and use the same client library. Now you have a single connection 
+All Google Cloud Operators and Hooks are aligned and use the same client library. Now you have a single connection
 type for all kinds of Google Cloud Operators.

 If you experience problems connecting with your operator, make sure you set the connection type "Google Cloud Platform".

-Also the old P12 key file type is not supported anymore and only the new JSON key files are supported as a service 
+Also the old P12 key file type is not supported anymore and only the new JSON key files are supported as a service
 account.
-  
+
 ### Deprecated Features
-These features are marked for deprecation. They may still work (and raise a `DeprecationWarning`), but are no longer 
+These features are marked for deprecation. They may still work (and raise a `DeprecationWarning`), but are no longer
 supported and will be removed entirely in Airflow 2.0
 
 - Hooks and operators must be imported from their respective submodules
 
-  `airflow.operators.PigOperator` is no longer supported; `from airflow.operators.pig_operator import PigOperator` is. 
+  `airflow.operators.PigOperator` is no longer supported; `from airflow.operators.pig_operator import PigOperator` is.
   (AIRFLOW-31, AIRFLOW-200)
 
 - Operators no longer accept arbitrary arguments
 
-  Previously, `Operator.__init__()` accepted any arguments (either positional `*args` or keyword `**kwargs`) without 
+  Previously, `Operator.__init__()` accepted any arguments (either positional `*args` or keyword `**kwargs`) without
   complaint. Now, invalid arguments will be rejected. (https://github.com/apache/incubator-airflow/pull/1285)
 
 ### Known Issues
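
Aside: the catchup_by_default section in the UPDATING.md hunk above notes that catchup can also be set per DAG. Below is a minimal sketch of what that looks like in a DAG file, assuming Airflow's standard DAG and DummyOperator APIs; the dag_id and task_id are made up for illustration:

```
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

dag = DAG(
    dag_id="no_catchup_example",      # hypothetical DAG id
    start_date=datetime(2017, 1, 1),
    schedule_interval="@daily",
    catchup=False,                    # schedule only the latest interval; skip missed intervals
)

noop = DummyOperator(task_id="noop", dag=dag)
```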

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/6825d97b/docs/configuration.rst
----------------------------------------------------------------------
diff --git a/docs/configuration.rst b/docs/configuration.rst
index 838bc09..e68a341 100644
--- a/docs/configuration.rst
+++ b/docs/configuration.rst
@@ -83,7 +83,7 @@ within the metadata database. The ``crypto`` package is highly recommended
 during installation. The ``crypto`` package does require that your operating
 system have libffi-dev installed.
 
-If the ``crypto`` package was not installed initially, you can still enable encryption for 
+If the ``crypto`` package was not installed initially, you can still enable encryption for
 connections by following the steps below:
 
 1. Install crypto package ``pip install apache-airflow[crypto]``
@@ -94,17 +94,17 @@ connections by following steps below:
     from cryptography.fernet import Fernet
     fernet_key = Fernet.generate_key()
     print(fernet_key)  # your fernet_key, keep it in a secure place!
-    
-3. Replace ``airflow.cfg`` fernet_key value with the one from step 2. 
+
+3. Replace ``airflow.cfg`` fernet_key value with the one from step 2.
 Alternatively, you can store your fernet_key in an OS environment variable. You
-do not need to change ``airflow.cfg`` in this case as Airflow will use the environment 
+do not need to change ``airflow.cfg`` in this case as Airflow will use the environment
 variable over the value in ``airflow.cfg``:
 
 .. code-block:: bash
-  
+
   # Note the double underscores
   export AIRFLOW__CORE__FERNET_KEY=your_fernet_key
- 
+
 4. Restart the Airflow webserver.
 5. For existing connections (the ones that you had defined before installing ``airflow[crypto]`` and creating a Fernet key), you need to open each connection in the connection admin UI, re-type the password, and save it.
 
@@ -219,7 +219,8 @@ try to use ``S3Hook('MyS3Conn')``.
 In the Airflow Web UI, local logs take precedence over remote logs. If local logs
 cannot be found or accessed, the remote logs will be displayed. Note that logs
 are only sent to remote storage once a task completes (including failure). In other
-words, remote logs for running tasks are unavailable.
+words, remote logs for running tasks are unavailable. Logs are stored in the log
+folder as ``{dag_id}/{task_id}/{execution_date}/{try_number}.log``.
 
 Scaling Out on Mesos (community contributed)
 ''''''''''''''''''''''''''''''''''''''''''''
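
For reference, a minimal sketch of the Fernet mechanism that the configuration.rst steps above rely on, using the `cryptography` package; the password literal is made up for illustration:

```
from cryptography.fernet import Fernet

fernet_key = Fernet.generate_key()  # the value to put in airflow.cfg or AIRFLOW__CORE__FERNET_KEY
fernet = Fernet(fernet_key)

token = fernet.encrypt(b"my-connection-password")  # illustrative secret only
assert fernet.decrypt(token) == b"my-connection-password"
```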

