airflow-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF subversion and git services (Jira)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-5730) Enable get_pandas_df on PinotDbApiHook
Date Tue, 17 Dec 2019 10:32:00 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-5730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16998061#comment-16998061
] 

ASF subversion and git services commented on AIRFLOW-5730:
----------------------------------------------------------

Commit a3253b1fa99411612ed008b81a7d01f3c9486c42 in airflow's branch refs/heads/v1-10-test
from Kengo Seki
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=a3253b1 ]

[AIRFLOW-5730] Enable get_pandas_df on PinotDbApiHook (#6399)


(cherry picked from commit 8f1a585b58e6d8091f4524e6cfb09c606e828825)


> Enable get_pandas_df on PinotDbApiHook
> --------------------------------------
>
>                 Key: AIRFLOW-5730
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5730
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: hooks
>    Affects Versions: 1.10.5
>            Reporter: Kengo Seki
>            Assignee: Kengo Seki
>            Priority: Major
>             Fix For: 2.0.0, 1.10.7
>
>
> Currently, DruidDbApiHook and PinotDbApiHook disable their {{get_pandas_df}} methods
by raising {{NotImplementedError}}.
> But they actually work as inherited from DbApiHook, as follows:
> {code}
> $ git diff
> diff --git a/airflow/contrib/hooks/pinot_hook.py b/airflow/contrib/hooks/pinot_hook.py
> index e617f8e9b..0864b3584 100644
> --- a/airflow/contrib/hooks/pinot_hook.py
> +++ b/airflow/contrib/hooks/pinot_hook.py
> @@ -90,8 +90,5 @@ class PinotDbApiHook(DbApiHook):
>      def set_autocommit(self, conn, autocommit):
>          raise NotImplementedError()
>  
> -    def get_pandas_df(self, sql, parameters=None):
> -        raise NotImplementedError()
> -
>      def insert_rows(self, table, rows, target_fields=None, commit_every=1000):
>          raise NotImplementedError()
> diff --git a/airflow/hooks/druid_hook.py b/airflow/hooks/druid_hook.py
> index c3cd3cd71..e2e20f1ec 100644
> --- a/airflow/hooks/druid_hook.py
> +++ b/airflow/hooks/druid_hook.py
> @@ -158,8 +158,5 @@ class DruidDbApiHook(DbApiHook):
>      def set_autocommit(self, conn, autocommit):
>          raise NotImplementedError()
>  
> -    def get_pandas_df(self, sql, parameters=None):
> -        raise NotImplementedError()
> -
>      def insert_rows(self, table, rows, target_fields=None, commit_every=1000):
>          raise NotImplementedError()
> {code}
> {code:title=Druid example}
> $ airflow connections list
> (snip)
> ├────────────────────────────────┼─────────────────────────────┼───────────────────────────┼────────┼────────────────┼──────────────────────┼────────────────────────────────┤
> │ 'druid_broker_default'         │ 'druid-broker'              │ 'localhost'  
            │ 8082   │ False          │ True                 │ 'gAAAAABdrxvt...M1ideRO8233QG'
│
> ╘════════════════════════════════╧═════════════════════════════╧═══════════════════════════╧════════╧════════════════╧══════════════════════╧════════════════════════════════╛
> $ ipython
> (snip)
> In [2]: from airflow.hooks.druid_hook import DruidDbApiHook                         
                                                                                        
> In [3]: DruidDbApiHook().get_pandas_df("SELECT * FROM wikipedia WHERE sum_delta >
%(num)d", {"num": 2000})                                                                 
 
> [2019-10-23 23:28:18,606] {base_hook.py:89} INFO - Using connection to: id: druid_broker_default.
Host: localhost, Port: 8082, Schema: None, Login: None, Password: None, extra: {'schema':
'http', 'endpoint': '/druid/v2/sql'}
> [2019-10-23 23:28:18,607] {druid_hook.py:140} INFO - Get the connection to druid broker
on localhost using user None
> Out[3]: 
>                        __time        channel cityName                               
            comment  ...  sum_deleted sum_delta sum_metroCode                    user
> 0    2015-09-12T00:00:00.000Z  #en.wikipedia           Archiving case from [[Wikipedia:Sockpuppet
inv...  ...            0      3360             0                   Bbb23
> 1    2015-09-12T00:00:00.000Z  #ja.wikipedia           [[Special:Contributions/119.224.209.170|119.22...
 ...            0      6853             0                 Kkairri
> 2    2015-09-12T01:00:00.000Z  #en.wikipedia                                        
    /* Hong Kong */  ...            0      4500             0                 Bertaut
> 3    2015-09-12T01:00:00.000Z  #en.wikipedia           Archiving 1 discussion(s) from
[[User talk:New...  ...            0      3599             0  Lowercase sigmabot III
> 4    2015-09-12T01:00:00.000Z  #en.wikipedia           [[WP:AES|←]]Created page with
'{{Infobox wildf...  ...            0     13335             0                  Orygun
> ..                        ...            ...      ...                               
                ...  ...          ...       ...           ...                     ...
> 851  2015-09-12T23:00:00.000Z  #pt.wikipedia                Bem-vindo (usando [[WP:H|Huggle]])
 (3.1.16)  ...            0      2588             0                Mobyduck
> 852  2015-09-12T23:00:00.000Z  #pt.wikipedia           adição de informação, renovação
de conteúdos e...  ...            0      3666             0           Templarius 01
> 853  2015-09-12T23:00:00.000Z  #ru.wikipedia           [[ВП:←|←]] Новая страница:
«{{редактирую|~~~~|...  ...            0      6766             0              
  Dulamas
> 854  2015-09-12T23:00:00.000Z  #ru.wikipedia     Tver  [[ВП:×|отмена]] правки
73302711 участника [[Sp...  ...            0      9302             0            94.241.56.71
> 855  2015-09-12T23:00:00.000Z  #sr.wikipedia           Нова страница: [[Датотека:US
Open.svg|десно|20...  ...            0     38443             0               Самарџија
> [856 rows x 21 columns]
> {code}
> {code:title=Pinot example}
> $ airflow connections list
> (snip)
> ├────────────────────────────────┼─────────────────────────────┼───────────────────────────┼────────┼────────────────┼──────────────────────┼────────────────────────────────┤
> │ 'pinot_broker_default'         │ 'pinot_broker_conn_id'      │ 'localhost'  
            │ 8000   │ False          │ True                 │ 'gAAAAABdrxRj...Afd51PZY94nfa'
│
> ├────────────────────────────────┼─────────────────────────────┼───────────────────────────┼────────┼────────────────┼──────────────────────┼────────────────────────────────┤
> $ ipython
> (snip)
> In [2]: from airflow.contrib.hooks.pinot_hook import PinotDbApiHook
> In [3]: PinotDbApiHook().get_pandas_df("select sum('runs') from baseballStats where yearID>=%(num)d
group by playerName", {"num": 2000})
> [2019-10-23 23:31:06,058] {base_hook.py:89} INFO - Using connection to: id: pinot_broker_default.
Host: localhost, Port: 8000, Schema: None, Login: None, Password: None, extra: {'endpoint':
'/query', 'schema': 'http'}
> [2019-10-23 23:31:06,059] {pinot_hook.py:48} INFO - Get the connection to pinot broker
on localhost
> select sum('runs') from baseballStats where yearID>=2000 group by playerName
> Out[3]: 
>            playerName    sum_runs
> 0              Adrian  1820.00000
> 1        Jose Antonio  1692.00000
> 2              Rafael  1565.00000
> 3       Brian Michael  1500.00000
> 4        Jose Alberto  1426.00000
> 5  Alexander Emmanuel  1426.00000
> 6     Derek Sanderson  1390.00000
> 7              Carlos  1314.00000
> 8        Johnny David  1300.00000
> 9              Ichiro  1261.00000
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Mime
View raw message