From: Ash Berlin-Taylor
To: dev@airflow.apache.org
Subject: Re: API Reference - current confusion and improvement plan
Date: Tue, 5 Feb 2019 19:57:49 +0000

I like the API reference v2 layout a lot!
Much easier to navigate and see what classes are available, for me at least.

Documenting modules will help somewhat with a few things, but let's say the
"AWS" section of the integration doc is spread across the following modules:

airflow.contrib.operators.aws_athena_operator
airflow.contrib.operators.awsbatch_operator
airflow.contrib.operators.ecs_operator
airflow.contrib.operators.emr_add_steps_operator
airflow.contrib.operators.emr_create_job_flow_operator
airflow.contrib.operators.emr_terminate_job_flow_operator
airflow.contrib.operators.s3_copy_object_operator
airflow.contrib.operators.s3_delete_objects_operator
airflow.contrib.operators.s3_list_operator
airflow.contrib.operators.s3_to_gcs_operator
airflow.contrib.operators.s3_to_gcs_transfer_operator
airflow.contrib.operators.s3_to_sftp_operator
airflow.contrib.operators.sagemaker_base_operator
airflow.contrib.operators.sagemaker_endpoint_config_operator
airflow.contrib.operators.sagemaker_endpoint_operator
airflow.contrib.operators.sagemaker_model_operator
airflow.contrib.operators.sagemaker_training_operator
airflow.contrib.operators.sagemaker_transform_operator
airflow.contrib.operators.sagemaker_tuning_operator
airflow.contrib.operators.segment_track_event_operator
airflow.operators.redshift_to_s3_operator
airflow.operators.s3_file_transform_operator
airflow.operators.s3_to_hive_operator
airflow.operators.s3_to_redshift_operator
airflow.sensors.s3_key_sensor
airflow.sensors.s3_prefix_sensor
airflow.contrib.sensors.emr_base_sensor
airflow.contrib.sensors.emr_job_flow_sensor
airflow.contrib.sensors.emr_step_sensor

And that was just before I got bored of looking for them :)

> On 5 Feb 2019, at 16:25, Kamil Breguła wrote:
>
> I already have a POC: :-)
>
> Available at: http://level-can.surge.sh/html/autoapi/index.html
>
> I would like to point out that in addition to class documentation, you can
> also document modules.
> http://level-can.surge.sh/html/autoapi/airflow/executors/local_executor/index.html
> Currently, the `howto/operators.rst` file is used for this (Related link:
> https://airflow.readthedocs.io/en/latest/howto/operator.html#cloudsqlqueryoperator
> )
>
>
> On Tue, Feb 5, 2019 at 5:18 PM Ash Berlin-Taylor wrote:
>
>>> We want to rewrite the `integration.rst` file so that it does not contain
>>> duplicates from `code.rst` (API Reference). In the next step, introduce
>>> the reference API generation based on the source code that will replace
>>> the `code.rst` file.
>>
>> :100: Yes please!
>>
>>
>> Given a number of integrations are across multiple files (n operators, and
>> m hooks) my first thought is that something in integration.rst, or a file
>> elsewhere in the docs/ tree is the place to put this.
>>
>> On epydoc vs a sphinx extension I lean very heavily towards the sphinx
>> extension, as we are already using much of sphinx.
>>
>> Can you create a _small_ example of what you'd propose for no.4 (I don't
>> want you to do a lot of work that might be wasted)
>>
>> -ash
>>
>>
>>> On 5 Feb 2019, at 15:55, Kamil Breguła wrote:
>>>
>>> Hello community,
>>>
>>> While working on the documentation for the GCP operators, my team at
>>> Polidea encountered some confusion related to the structure of the
>>> documentation.
>>>
>>> Short story:
>>>
>>> We want to rewrite the `integration.rst` file so that it does not contain
>>> duplicates from `code.rst` (API Reference). In the next step, introduce
>>> the reference API generation based on the source code that will replace
>>> the `code.rst` file.
>>>
>>> Long story:
>>>
>>> Currently, the documentation contains two places where the description of
>>> classes related to operators is included. They are `code.rst` and
>>> `integration.rst` files.
>>>
>>> The `integration.rst` file contains information about integration, in
>>> particular for Azure: Microsoft Azure, AWS: Amazon Web Services,
>>> Databricks, GCP: Google Cloud Platform, Qubole. Other integrations,
>>> however, do not have descriptions.
>>>
>>> The `code.rst` file contains the “API Reference”, which contains
>>> information about *all* classes, including those included in the file
>>> `integration.rst`.
>>>
>>> Such duplication, however, is problematic for several reasons:
>>>
>>> 1. Users may feel lost and may not know which section they should look
>>>    into.
>>> 2. Changes must be made in many places, which leads to desynchronization.
>>>    Most often, changes are made only in the source code, so they do not
>>>    appear in the generated documentation.
>>> 3. Linking to classes using the `class` directive for Sphinx is
>>>    ambiguous - if the code is embedded both in `integration.rst` and
>>>    `code.rst` using the `autoclass` directive, we’re not sure where the
>>>    user will be navigated.
>>>
>>>
>>> There are several solutions:
>>>
>>> 1. Leave it as is. Then we need to agree on which `autoclass` directive
>>>    should have the `no-index` flags.
>>> 2. Delete duplicates from the `code.rst` file and add a note about the
>>>    `integration.rst` file in the `code.rst` file.
>>> 3. Delete duplicates from the `integration.rst` file and add a note about
>>>    the `code.rst` file in the `integration.rst` file.
>>> 4. Delete information from both files and generate the API documentation
>>>    always based only on the source code. This solution means that we would
>>>    have to write less documentation.
>>>    There are ready tools that we can use:
>>>    1. epydoc - http://epydoc.sourceforge.net/ ;
>>>    2. autoapi extension for Sphinx -
>>>       https://github.com/rtfd/sphinx-autoapi ;
>>>    3. other - https://wiki.python.org/moin/DocumentationTools
>>>
>>>
>>> The first, second, and third solutions do not solve all problems. In
>>> particular, we still need to complete the `code.rst` and `integration.rst`
>>> files. The fourth solution solves all problems, but is the most complex.
>>> It is worth noting that mixing solutions is possible. For example, we can
>>> delete information from the file `integration.rst` as a short term
>>> solution and start working on creating another form of documentation as a
>>> long term solution. This is the best option in our opinion.
>>>
>>> I’ve recently done a few activities related to this topic.
>>>
>>> First, I added the noindex flag to autoclass directives for all operators
>>> in `integration.rst` file. In rare cases (if any), this caused links that
>>> were previously directed to the file `integration.rst` to be redirected
>>> to the `code.rst` file. Elements had to be linked using `:class:` instead
>>> of a section link. Each operator is included in the new section in this
>>> file.
>>>
>>> PR: https://github.com/apache/airflow/pull/4585
>>>
>>> Second, I completed the `code.rst` file with the missing classes.
>>>
>>> PR: https://github.com/apache/airflow/pull/4644
>>>
>>> I would like to ask which solution is the best in your opinion? What
>>> steps should we take to make the documentation more enjoyable?
>>>
>>> Greetings
>>>
>>> Kamil Breguła
>>
>>
>
> --
>
> Kamil Breguła
> Polidea | Software Engineer
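
The point made at the top of this reply, that one integration is spread over many modules, suggests an integration page that only links to generated module pages instead of repeating `autoclass` entries. The following is a minimal sketch of that idea, not anything proposed in the thread: the keyword map and the helper names `group_modules` and `render_rst` are hypothetical, and the only Sphinx feature assumed is the standard `:mod:` cross-reference role.

```python
from collections import defaultdict

# Hypothetical keyword map (an assumption for illustration): substrings of a
# module's last path component that mark it as part of an integration.
INTEGRATION_KEYWORDS = {
    "AWS": ("aws", "s3", "emr", "ecs", "sagemaker", "redshift"),
}


def group_modules(modules, keywords=INTEGRATION_KEYWORDS):
    """Map each integration name to the sorted list of matching modules."""
    groups = defaultdict(list)
    for module in modules:
        # Only look at the final component, e.g. "s3_key_sensor".
        leaf = module.rsplit(".", 1)[-1]
        for name, needles in keywords.items():
            if any(needle in leaf for needle in needles):
                groups[name].append(module)
    return {name: sorted(mods) for name, mods in groups.items()}


def render_rst(groups):
    """Render one rst section per integration, linking modules via :mod:."""
    lines = []
    for name in sorted(groups):
        # Section title followed by an underline of the same length.
        lines += [name, "-" * len(name), ""]
        lines += ["* :mod:`{}`".format(module) for module in groups[name]]
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [
        "airflow.contrib.operators.ecs_operator",
        "airflow.sensors.s3_key_sensor",
        "airflow.executors.local_executor",
    ]
    print(render_rst(group_modules(sample)))
```

Substring matching is deliberately crude and would need a curated mapping in practice; the sketch only shows that the grouping can live in one place while the class documentation itself is generated from the source code.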