airflow-commits mailing list archives

From "ASF subversion and git services (JIRA)" <>
Subject [jira] [Commented] (AIRFLOW-2420) Add functionality for Azure Data Lake
Date Tue, 15 May 2018 17:32:00 GMT


ASF subversion and git services commented on AIRFLOW-2420:

Commit 7c233179e91818bd641b283934a73cc84a51ca03 in incubator-airflow's branch refs/heads/master

[AIRFLOW-2420] Azure Data Lake Hook

Add AzureDataLakeHook as a first step to enable
Airflow to connect to Azure Data Lake.

The hook has a simple interface to upload and
download files with all parameters available in
the Azure Data Lake SDK, and also a check_for_file
method to query whether a file exists in the data lake.

[AIRFLOW-2420] Add functionality for Azure Data

Make sure you have checked _all_ steps below.

### JIRA
- [x] My PR addresses the following [Airflow JIRA]
issues and references them in the PR title.

### Description
- [x] Here are some details about my PR, including
screenshots of any UI changes:
    This PR creates an Azure Data Lake hook
(adl_hook.AdlHook) and all the setup required to
create a new Azure Data Lake connection.

### Tests
- [x] My PR adds the following unit tests __OR__
does not need testing for this extremely good
       Adds tests to in

### Commits
- [x] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

### Documentation
- [x] In case of new functionality, my PR adds
documentation that describes how to use it.
    - When adding new operators/hooks/sensors, the
autoclass documentation generation needs to be

### Code Quality
- [x] Passes `git diff upstream/master -u --
"*.py" | flake8 --diff`

Closes #3333 from marcusrehm/master

> Add functionality for Azure Data Lake
> -------------------------------------
>                 Key: AIRFLOW-2420
>                 URL:
>             Project: Apache Airflow
>          Issue Type: New Feature
>          Components: hooks
>            Reporter: Marcus Rehm
>            Assignee: Marcus Rehm
>            Priority: Major
>             Fix For: 2.0.0
> Currently Airflow has a hook for Azure Blob Storage but it does not support Azure Data
> As a first step a hook would interface with Azure Data Lake via the Python SDK over the adl protocol.
> The hook would have a simple interface to upload and download files with all parameters available in the ADL SDK, and also a check-for-file function to query whether a file exists in the data lake. This last function will enable sensor development in the future.
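
The issue notes that a file-existence check is what would make sensors possible later. As a rough illustration only, the sketch below polls a hook-like object until a file appears. The method name `check_for_file` comes from the commit message; `FakeDataLakeHook`, `wait_for_file`, the example paths, and all parameters are hypothetical stand-ins, not the code merged in this commit, and the stub never talks to Azure.

```python
import time


class FakeDataLakeHook:
    """Hypothetical stand-in for the hook; only check_for_file is modelled."""

    def __init__(self, existing_paths):
        self._paths = set(existing_paths)

    def check_for_file(self, file_path):
        # A real hook would query Azure Data Lake; this stub checks a local set.
        return file_path in self._paths


def wait_for_file(hook, file_path, poke_interval=0.0, max_pokes=3, sleep=time.sleep):
    """Poll hook.check_for_file until the file exists or the pokes run out."""
    for attempt in range(max_pokes):
        if hook.check_for_file(file_path):
            return True
        # Sleep only between attempts, not after the final one.
        if attempt < max_pokes - 1:
            sleep(poke_interval)
    return False


# Example paths are made up for illustration.
hook = FakeDataLakeHook(["landing/2018-05-15/data.csv"])
found = wait_for_file(hook, "landing/2018-05-15/data.csv")
missing = wait_for_file(hook, "landing/missing.csv")
```

A sensor built along these lines would only need the existence check from the hook, which is presumably why the issue singles that function out.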

This message was sent by Atlassian JIRA
