airflow-commits mailing list archives

From "Chris Riccomini (JIRA)" <j...@apache.org>
Subject [jira] [Closed] (AIRFLOW-514) HiveCli hook should be able to load a pandas DataFrame
Date Thu, 17 Nov 2016 19:02:58 GMT

     [ https://issues.apache.org/jira/browse/AIRFLOW-514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Riccomini closed AIRFLOW-514.
-----------------------------------
       Resolution: Done
    Fix Version/s: Airflow 1.8

> HiveCli hook should be able to load a pandas DataFrame 
> -------------------------------------------------------
>
>                 Key: AIRFLOW-514
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-514
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: Daniel Frank
>            Assignee: Daniel Frank
>            Priority: Minor
>             Fix For: Airflow 1.8
>
>
> Currently the Hive hooks can return query results as a pandas.DataFrame. Many of our
> workflows involve retrieving a Hive table in pandas.DataFrame form, modifying it, and
> saving it (perhaps elsewhere). To save the DataFrame we have to manually translate the
> types, save it to disk, and run load_file(), which is repetitive and tedious. This
> workflow could easily be automated with a load_df method for HiveCliHook.
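
The manual steps described above (translate the types, save to disk, run load_file()) could be sketched roughly as follows. This is an illustration under stated assumptions, not the change that was merged for this issue: the dtype-to-Hive mapping is a guess at sensible defaults, the helper infer_field_dict is hypothetical, and hook is assumed to be any object exposing load_file(filepath, table, delimiter, field_dict), as HiveCliHook does.

```python
import os
import tempfile
from collections import OrderedDict

# Assumed mapping from NumPy dtype "kind" codes to Hive column types.
# Illustrative defaults only; adjust for your own tables.
KIND_TO_HIVE_TYPE = {
    "b": "BOOLEAN",    # boolean
    "i": "BIGINT",     # signed integer
    "u": "BIGINT",     # unsigned integer
    "f": "DOUBLE",     # floating point
    "M": "TIMESTAMP",  # datetime64
    "O": "STRING",     # Python objects, typically str
    "S": "STRING",     # byte strings
    "U": "STRING",     # unicode strings
}


def infer_field_dict(df):
    """Hypothetical helper: map each column to a Hive type, defaulting to STRING."""
    return OrderedDict(
        (str(name), KIND_TO_HIVE_TYPE.get(dtype.kind, "STRING"))
        for name, dtype in df.dtypes.items()
    )


def load_df(hook, df, table, delimiter=",", **load_file_kwargs):
    """Write df to a temporary delimited file and pass it to hook.load_file()."""
    field_dict = infer_field_dict(df)
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)  # pandas reopens the file by path
    try:
        # No header or index: Hive reads every line of the file as a data row.
        df.to_csv(path, sep=delimiter, header=False, index=False)
        hook.load_file(
            filepath=path,
            table=table,
            delimiter=delimiter,
            field_dict=field_dict,
            **load_file_kwargs
        )
    finally:
        os.remove(path)  # clean up even if the load fails
    return field_dict
```

With a real hook this would be called as load_df(HiveCliHook(), df, "my_db.my_table"), where the table name is a placeholder. Passing the hook in as an argument keeps the sketch independent of a particular Airflow version.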



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
