airflow-dev mailing list archives

From Russell Jurney <russell.jur...@gmail.com>
Subject Re: New book covers Airflow with PySpark: Agile Data Science 2.0 (O'Reilly, 2017) AND Airflow Meetup?
Date Fri, 20 Jan 2017 02:46:30 GMT
Siddharth, nice to hear from you. Great to hear!

I'm just starting a consultancy called Data Syndrome around the book, and I
work from home, which doesn't put me in a great position to personally host
the meetup. If you need someone to organize it and to seek a venue, I can
do that. How does that sound? I'm sure I could find someone to host it.

When would be a good date, do you think? Late February?

On Thu, Jan 19, 2017 at 5:19 PM, siddharth anand <sanand@apache.org> wrote:

> Sounds like a great idea. We are looking for someone to host the next one.
> Once one is announced, you can sign up as a speaker. You are also welcome
> to host a meetup if you like.
> -s
>
> On Thu, Jan 19, 2017 at 4:39 PM, Russell Jurney <russell.jurney@gmail.com>
> wrote:
>
> > Hello! My name is Russell Jurney. I am a relatively new Airflow user and
> > just joined the group. I am an Azkaban refugee, and an enemy of Oozie and
> > the tyranny of XML.
> >
> > I wanted to tell you about my new book, out in pre-release, called Agile
> > Data Science 2.0 <http://bit.ly/agile_data_science> (O'Reilly 2017). In
> > the book, we use Airflow in chapter 2, Setup, in a way similar to the
> > Airflow tutorial. Then, in chapter 8, Deploying Predictive Systems, we use
> > Airflow to deploy a predictive system built with PySpark and Spark MLlib.
> >
> > Some highlights in the code at http://github.com/rjurney/Agile_Data_Code_2:
> >
> >    - ch02/airflow_test.py
> >    <https://github.com/rjurney/Agile_Data_Code_2/blob/master/ch02/airflow_test.py>
> >    is a complete Airflow/PySpark tutorial, along with ch02/pyspark_task_one.py
> >    <https://github.com/rjurney/Agile_Data_Code_2/blob/master/ch02/pyspark_task_one.py>
> >    and ch02/pyspark_task_two.py
> >    <https://github.com/rjurney/Agile_Data_Code_2/blob/master/ch02/pyspark_task_two.py>.
> >    - The Airflow setup for chapter 8 is at ch08/airflow/setup.py
> >    <https://github.com/rjurney/Agile_Data_Code_2/blob/master/ch08/airflow/setup.py>.
> >    - The scripts that it operates on are in ch08/
> >    <https://github.com/rjurney/Agile_Data_Code_2/blob/master/ch08> and show
> >    things like how to use '{{ ds }}' and other parameters to hook your
> >    scripts into 'airflow backfill' and other features.
> >    - ch08/make_predictions.py
> >    <https://github.com/rjurney/Agile_Data_Code_2/blob/master/ch08/make_predictions.py>
> >    shows how to set up a PySpark environment in a script in a way that can
> >    work with Airflow.
> >
> > If there is any interest, I would love to present on something like
> > "Building Predictive Systems with Spark and Airflow" at an upcoming Airflow
> > meetup.
> >
> > Thanks!
> > --
> > Russell Jurney twitter.com/rjurney russell.jurney@gmail.com relato.io
> >
>



-- 
Russell Jurney twitter.com/rjurney russell.jurney@gmail.com relato.io
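
For readers unfamiliar with the '{{ ds }}' hook mentioned in the quoted message: Airflow renders '{{ ds }}' in a templated command to the run's date as YYYY-MM-DD, so a script that takes the date as an argument can be re-run for any past day. The sketch below is not the book's code and does not import Airflow. The command template, the script name my_task.py, the paths, and all three function names are hypothetical, and render_for_date is only a plain string substitution standing in for Airflow's Jinja rendering.

```python
import datetime

# Hypothetical command template in the style of a templated bash command.
# Airflow itself would replace '{{ ds }}' with the run's date (YYYY-MM-DD).
COMMAND_TEMPLATE = "spark-submit my_task.py {{ ds }} /base/path"


def render_for_date(template, run_date):
    """Stand-in for Airflow's Jinja rendering, for illustration only."""
    return template.replace("{{ ds }}", run_date.isoformat())


def dated_input_path(base_path, ds):
    """What the script side might do with the date argument it receives."""
    # Raises ValueError if ds cannot be parsed as a year-month-day date.
    datetime.datetime.strptime(ds, "%Y-%m-%d")
    return "{}/data/{}.json".format(base_path.rstrip("/"), ds)


def backfill_commands(template, start, end):
    """One rendered command per day in [start, end], assuming a daily schedule."""
    days = (end - start).days + 1
    return [
        render_for_date(template, start + datetime.timedelta(days=i))
        for i in range(days)
    ]
```

Because the date arrives as an argument rather than being read from the clock inside the script, the same script serves both the daily run and a backfill over a past date range.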
