airflow-commits mailing list archives

From "xifeng (Jira)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-6248) Should host variable in Connection class contain scheme?
Date Sat, 14 Dec 2019 03:07:00 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16996104#comment-16996104 ]

xifeng commented on AIRFLOW-6248:
---------------------------------

Hi Taylor, thanks for your reply. I'm still confused.

In the test case:
{code:python}
db.merge_conn(
    Connection(
        conn_id='spark-default', conn_type='spark',
        host='yarn://yarn-master',
        extra='{"queue": "root.etl", "deploy-mode": "cluster"}')
)
{code}

Why not write it this way instead, which would avoid the problem:

{code:python}
db.merge_conn(
    Connection(
        conn_id='spark-default', conn_type='spark',
        host='yarn-master',
        extra='{"queue": "root.etl", "deploy-mode": "cluster"}')
)
{code}

On the other hand, if we set AIRFLOW_CONN_SPARK_DEFAULT=yarn://yarn-master, the parsed host is 'yarn-master', without the scheme.
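
The difference between the two ways of setting the host can be reproduced with the standard library alone. This is only a minimal sketch: `host_from_uri` is a hypothetical helper, and it uses `urlparse(...).hostname` as a stand-in for Airflow's `parse_netloc_to_hostname`, which may behave differently in edge cases.

```python
from urllib.parse import urlparse


def host_from_uri(uri):
    """Hypothetical helper: return the host part of a URI, without the scheme."""
    return urlparse(uri).hostname


# Set through a URI (as with AIRFLOW_CONN_SPARK_DEFAULT): the scheme is split off.
uri_host = host_from_uri('yarn://yarn-master')

# Set through the host argument directly: the string is kept exactly as given.
direct_host = 'yarn://yarn-master'

print(uri_host)
print(direct_host)
```

The same logical connection therefore ends up with two different host values depending on how it was defined.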

My point is that the host field should follow one consistent standard.



> Should host variable in Connection class contain scheme?
> --------------------------------------------------------
>
>                 Key: AIRFLOW-6248
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6248
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: models
>    Affects Versions: 1.10.6
>            Reporter: xifeng
>            Priority: Trivial
>             Fix For: 2.0.0
>
>
> In the unit tests, there are many snippets like:
> {code:python}
> db.merge_conn(
>     Connection(
>         conn_id='spark-default', conn_type='spark',
>         host='yarn://yarn-master',
>         extra='{"queue": "root.etl", "deploy-mode": "cluster"}')
> )
> {code}
> Here the host variable contains the scheme ("yarn://").
> However, if a Connection is created with the *uri* argument, its host does not contain the scheme. This is because of the *parse_from_uri* function.
> {code:python}
>   def parse_from_uri(self, uri):
>         uri_parts = urlparse(uri)
>         conn_type = uri_parts.scheme
>         if conn_type == 'postgresql':
>             conn_type = 'postgres'
>         elif '-' in conn_type:
>             conn_type = conn_type.replace('-', '_')
>         self.conn_type = conn_type
>         self.host = parse_netloc_to_hostname(uri_parts)
>         quoted_schema = uri_parts.path[1:]
>         self.schema = unquote(quoted_schema) if quoted_schema else quoted_schema
>         self.login = unquote(uri_parts.username) \
>             if uri_parts.username else uri_parts.username
>         self.password = unquote(uri_parts.password) \
>             if uri_parts.password else uri_parts.password
>         self.port = uri_parts.port
>         if uri_parts.query:
>             self.extra = json.dumps(dict(parse_qsl(uri_parts.query, keep_blank_values=True)))
> {code}
> So, should the host contain the scheme?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
