spark-user mailing list archives

From Felix Cheung <felixcheun...@hotmail.com>
Subject Re: PySpark: preference for Python 2.7 or Python 3.5?
Date Fri, 02 Sep 2016 07:47:08 GMT
There is an Anaconda parcel one can readily install on CDH:

https://docs.continuum.io/anaconda/cloudera

As Sean says, it is Python 2.7.x.

Spark should work with both 2.7 and 3.5.
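
For anyone who wants to pin the interpreter explicitly, a minimal sketch is to set PYSPARK_PYTHON before creating the SparkContext. The parcel path below is an assumption; point it at whichever interpreter your cluster actually provides:

    import os

    # Assumed path to the Anaconda parcel's interpreter -- substitute the
    # Python 2.7 parcel or a Python 3.5 environment you manage. This must be
    # set before the SparkContext is created; cluster-wide, the same setting
    # is usually placed in spark-env.sh instead.
    os.environ["PYSPARK_PYTHON"] = "/opt/cloudera/parcels/Anaconda/bin/python"

    from pyspark import SparkConf, SparkContext

    sc = SparkContext(conf=SparkConf().setAppName("pin-python"))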

_____________________________
From: Sean Owen <sowen@cloudera.com>
Sent: Friday, September 2, 2016 12:41 AM
Subject: Re: PySpark: preference for Python 2.7 or Python 3.5?
To: Ian Stokes Rees <ijstokes@continuum.io>
Cc: user @spark <user@spark.apache.org>


Spark should work fine with Python 3. I'm not a Python person, but all else equal I'd use
3.5 too. I assume the issue could be libraries you want that don't support Python 3. I don't
think that changes with CDH. It includes a version of Anaconda from Continuum, but that lays
down Python 2.7.11. I don't believe there's any particular position on 2 vs 3.
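
A quick way to confirm which interpreter the driver and the executors actually pick up (a rough sketch; it assumes a running SparkContext named sc, e.g. from the pyspark shell):

    import sys

    def worker_python_version(_):
        # Runs on the executors, so this reports their interpreter,
        # not the driver's.
        import sys
        return sys.version

    print("driver: ", sys.version)
    print("workers:", sc.parallelize(range(4), 2)
                        .map(worker_python_version)
                        .distinct()
                        .collect())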

On Fri, Sep 2, 2016 at 3:56 AM, Ian Stokes Rees <ijstokes@continuum.io> wrote:
I have the option of running PySpark with Python 2.7 or Python 3.5.  I am fairly expert with
Python and know the Python-side history of the differences.  All else being the same, I have
a preference for Python 3.5.  I'm using CDH 5.8 and I'm wondering if that biases whether I
should proceed with PySpark on top of Python 2.7 or 3.5.  Opinions?  Does Cloudera have an
official (or unofficial) position on this?

Thanks,

Ian
_______________________________
Ian Stokes-Rees
Computational Scientist

Continuum Analytics <http://continuum.io>
Twitter: http://twitter.com/ijstokes
LinkedIn: http://linkedin.com/in/ijstokes
GitHub: http://github.com/ijstokes
617.942.0218



