airflow-commits mailing list archives

From "Fokko Driesprong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-1496) Druid hook unable to load data from hdfs
Date Sat, 10 Feb 2018 19:09:05 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16359600#comment-16359600 ]

Fokko Driesprong commented on AIRFLOW-1496:
-------------------------------------------

Please try again with the new Druid hook. The hook has been refactored and now contains
tests :)

> Druid hook unable to load data from hdfs
> ----------------------------------------
>
>                 Key: AIRFLOW-1496
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1496
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: hooks
>    Affects Versions: 1.8.0
>         Environment: RHEL 6.7 , Python 2.7.13
>            Reporter: Rahul Singh
>            Priority: Major
>
> Hi,
> I am trying to use the Druid hook to load data from HDFS into Druid. Below is my DAG script:
> from datetime import datetime, timedelta
> import json
> from airflow.hooks import HttpHook, DruidHook
> from airflow.operators import PythonOperator
> from airflow.models import DAG
> def check_druid_con():
>     dr_hook = DruidHook(druid_ingest_conn_id='DRUID_INDEX', druid_query_conn_id='DRUID_QUERY')
>     dr_hook.load_from_hdfs(
>         "druid_airflow",
>         "hdfs://10.xx.xx.xx/demanddata/demand2.tsv",
>         "stay_date",
>         ["channel", "rate"],
>         "2016-12-11/2017-12-13",
>         1,
>         -1,
>         metric_spec=[{"name": "count", "type": "count"}],
>         hadoop_dependency_coordinates="org.apache.hadoop:hadoop-client:2.7.3")
> default_args = {
>     'owner': 'TC',
>     'start_date': datetime(2017, 8, 7),
>     'retries': 1,
>     'retry_delay': timedelta(minutes=5)
> }
> dag = DAG('druid_data_load', default_args=default_args)
> druid_task1=PythonOperator(task_id='check_druid',
>                    python_callable=check_druid_con,
>                    dag=dag)
> I keep getting the error: TypeError: load_from_hdfs() takes at least 10 arguments (10 given).
> However, I have given 10 arguments to load_from_hdfs, and it still errors out. Please help.
> Regards
> Rahul
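
The "takes at least 10 arguments (10 given)" wording is easier to read with a small stand-in. The sketch below is not the real DruidHook: FakeDruidHook is a hypothetical class, and the parameter names query_granularity and segment_granularity are assumptions chosen only so that the method has nine required parameters after self. It makes the same call as the DAG script (seven positional arguments plus two keyword arguments) and captures the resulting TypeError. As far as I know, Python 2.7 counts self and the keyword arguments in its "given" total, which would explain a "10 given" figure even though two required parameters were never supplied. Python 3 names the missing parameters instead.

```python
# Hypothetical stand-in, NOT the real airflow DruidHook.
# Assumption: nine required parameters after self, then two optional ones.
class FakeDruidHook(object):
    def load_from_hdfs(self, datasource, static_path, ts_dim, columns,
                       intervals, num_shards, target_partition_size,
                       query_granularity, segment_granularity,  # assumed names
                       metric_spec=None,
                       hadoop_dependency_coordinates=None):
        return datasource


def try_load():
    """Call the stand-in like the DAG script does; return the error text."""
    hook = FakeDruidHook()
    try:
        hook.load_from_hdfs(
            "druid_airflow",
            "hdfs://10.xx.xx.xx/demanddata/demand2.tsv",
            "stay_date",
            ["channel", "rate"],
            "2016-12-11/2017-12-13",
            1,
            -1,
            metric_spec=[{"name": "count", "type": "count"}],
            hadoop_dependency_coordinates="org.apache.hadoop:hadoop-client:2.7.3")
    except TypeError as exc:
        # Two required positional parameters were never supplied.
        return str(exc)
    return None


message = try_load()
print(message)
```

If the installed hook really does require more positional parameters than the call supplies, passing every required argument (or calling by keyword throughout) should make the mismatch obvious. Checking the signature in the installed Airflow version is the reliable way to confirm.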



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
