airflow-dev mailing list archives

From "Driesprong, Fokko" <fo...@driesprong.frl>
Subject Re: spark sql hook with multiple queries
Date Sat, 14 Oct 2017 09:07:35 GMT
Hi Boris,

Interesting. Multiple queries are supported by the spark-sql operator, so
this should work using Airflow. Executing SQL from a file:

Fokkos-MBP:~ fokkodriesprong$ spark-sql --driver-java-options \
  "-Dlog4j.configuration=file:///tmp/log4j.properties" -f query.sql
1
Time taken: 1.976 seconds, Fetched 1 row(s)
1
Time taken: 0.034 seconds, Fetched 1 row(s)

Executing SQL from the command line:

Fokkos-MBP:~ fokkodriesprong$ spark-sql --driver-java-options \
  "-Dlog4j.configuration=file:///tmp/log4j.properties" \
  -e "SELECT 1; SELECT 1;"
1
Time taken: 1.947 seconds, Fetched 1 row(s)
1
Time taken: 0.032 seconds, Fetched 1 row(s)
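
To check the same multi-statement string outside Airflow, something like
the following Python sketch could help. It only assembles a spark-sql
command line with the -e, --master and --num-executors flags and runs it
with a timeout, so a hang turns into an exception instead of blocking
forever. The helper names build_spark_sql_cmd and run_with_timeout are
hypothetical and this is not the SparkSqlHook implementation; it assumes
spark-sql is on the PATH.

```python
import subprocess


def build_spark_sql_cmd(sql, master=None, num_executors=None):
    """Assemble an argv list for the spark-sql CLI (hypothetical helper)."""
    cmd = ["spark-sql"]
    if master:
        cmd += ["--master", master]
    if num_executors:
        cmd += ["--num-executors", str(num_executors)]
    # -e passes the SQL text inline; statements are separated by semicolons.
    cmd += ["-e", sql]
    return cmd


def run_with_timeout(cmd, timeout_seconds=300):
    """Run cmd and return (returncode, combined stdout/stderr).

    Raises subprocess.TimeoutExpired if the process does not finish
    within timeout_seconds.
    """
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        timeout=timeout_seconds,
    )
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    # Only prints the command; pass it to run_with_timeout() on a machine
    # where spark-sql is installed.
    cmd = build_spark_sql_cmd("SELECT 1; SELECT 1;", master="yarn",
                              num_executors=4)
    print(cmd)
```

If the command finishes here but the hook still hangs, the output captured
by run_with_timeout is a useful thing to compare against the task log.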

Can you share the exception that you are seeing? What version of Spark are
you using?

Cheers, Fokko

2017-10-11 18:01 GMT+02:00 Boris Tyukin <boris@boristyukin.com>:

> hi guys,
>
> I tried spark_sql_hook to run a multi-statement query (two queries
> separated by a semicolon) and it hangs forever. If I comment out the
> second query, it runs fine.
>
> Has anyone had the same issue? I do not see anything in the code
> preventing more than one statement.
>
>     sql = """
> select * from .... ;
> select * from .... ;
> """
>
>     spark = SparkSqlHook(sql, conn_id='spark_default', master='yarn',
>                          num_executors=4)
>     spark.run_query()
>
> Boris
>
