spark-user mailing list archives

From sethah
Subject Re: Does feature parity exist between Spark and PySpark
Date Wed, 07 Oct 2015 16:29:07 GMT
Regarding features, the general workflow for the Spark community when adding
new features is to first add them in Scala (since Spark is written in
Scala). Once this is done, a Jira ticket will be created requesting that the
feature be added to the Python API (for example, SPARK-9773). Some of these Python
API tickets get done very quickly, some don't. As such, the Scala API will
always be more feature-rich from a Spark perspective, while the Python API
can lag behind in some cases. In general, the intent is to make the PySpark
API contain all features of the Scala API, since Python is considered a
first-class citizen in the Spark community; the difference is that if you
need the latest and greatest and need it right away, Scala is the best

Regarding performance, others have said it very eloquently:
