airflow-dev mailing list archives

From Iván Robla Albarrán <ivanro...@gmail.com>
Subject Re: Help SparkJDBCOperator
Date Fri, 08 Feb 2019 06:59:03 GMT
thanks!!!

On Thu, 7 Feb 2019 at 11:39, Driesprong, Fokko (<fokko@driesprong.frl>)
wrote:

> Hi Ivan,
>
> The SparkJDBCOperator is an effort to replace Sqoop. For example, if you
> run Spark on Kubernetes, you can also use Spark to do your Sqoop workloads.
> Please keep in mind that this operator is not as rich in functionality as
> Sqoop. The original PR is given here:
> https://github.com/apache/airflow/pull/3021
>
> The PySpark code is already given here:
>
> https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/spark_jdbc_script.py
> The Operator will pass all the arguments to this script, so you won't have
> to do this yourself.
>
> You need to pass all the arguments to the operator, the PythonDoc is
> self-explanatory:
>
> https://github.com/apache/airflow/blob/master/airflow/contrib/operators/spark_jdbc_operator.py
>
> For further reference, this operator is also called the Sqark (SQL +
> Spark) operator.
>
> Hopefully, you're less lost now. If you have any further questions, let me
> know.
>
> Cheers, Fokko
>
> On Fri, 1 Feb 2019 at 13:54, Iván Robla Albarrán <ivanrobla@gmail.com>
> wrote:
>
> > Hi,
> >
> > I am looking for a way to replace Apache Sqoop.
> >
> > I am analyzing the SparkJDBCOperator, but I don't understand how to
> > use it.
> >
> > Is it a version of the SparkSubmitOperator that includes a JDBC
> > connection?
> >
> > Do I need to include Spark code?
> >
> > Is there any example?
> >
> > Thanks, I am very lost.
> >
> > Regards,
> > Iván Robla
> >
>
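
For reference, the usage described above could be sketched as a small DAG.
This is a minimal, untested sketch: the connection ids, table names, and
driver class are placeholders I have assumed, not values from this thread;
the parameter names follow the contrib operator's docstring.

```python
# Sketch of a DAG that exports a metastore (Hive) table to a JDBC database
# with SparkJDBCOperator, as a Sqoop-style workload. All connection ids,
# table names, and the driver class below are assumed placeholders.
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.spark_jdbc_operator import SparkJDBCOperator

with DAG(
    dag_id="sqoop_replacement_example",   # placeholder name
    start_date=datetime(2019, 2, 1),
    schedule_interval=None,
) as dag:
    export_to_jdbc = SparkJDBCOperator(
        task_id="metastore_to_jdbc",
        cmd_type="spark_to_jdbc",             # or "jdbc_to_spark" to import
        spark_conn_id="spark-default",        # Spark connection (placeholder)
        jdbc_conn_id="jdbc-default",          # JDBC connection (placeholder)
        jdbc_table="target_table",            # table in the JDBC database
        jdbc_driver="org.postgresql.Driver",  # JDBC driver on the classpath
        metastore_table="source_table",       # Hive table to export
        save_mode="append",
    )
```

The operator submits the bundled spark_jdbc_script.py via spark-submit and
forwards these arguments to it, so no extra PySpark code is needed.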
