spark-reviews mailing list archives

From staslos <...@git.apache.org>
Subject [GitHub] spark pull request: [SPARK-5657][Examples][PySpark] Add PySpark Av...
Date Fri, 06 Feb 2015 20:37:14 GMT
GitHub user staslos opened a pull request:

    https://github.com/apache/spark/pull/4434

    [SPARK-5657][Examples][PySpark] Add PySpark Avro Output Format example

    There is an Avro Input Format example that shows how to read Avro data in PySpark, but
nothing shows how to write Avro data from PySpark. The main challenge is that a Converter needs an
Avro schema to build a record, but the current Spark API doesn't provide a way to supply extra
parameters to custom converters. A workaround is possible and is provided in this example.
    https://issues.apache.org/jira/browse/SPARK-5657
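
To illustrate the limitation described above: a converter is named by class when calling
RDD.saveAsNewAPIHadoopFile (keyConverter / valueConverter), so it receives no extra arguments.
One conceivable way around that, sketched below, is to ship the schema inside each record.
This is only an assumption for illustration and not necessarily what this pull request does;
USER_SCHEMA, attach_schema and detach_schema are hypothetical names.

```python
import json

# Hypothetical Avro schema, for illustration only (not taken from the pull request).
USER_SCHEMA = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "favorite_number", "type": ["int", "null"]},
    ],
}


def attach_schema(record, schema=USER_SCHEMA):
    """Bundle a record with its schema as a JSON string.

    Assumption: a converter created without arguments could read the schema
    from the data itself, since it cannot receive it as a parameter.
    """
    return {"schema": json.dumps(schema), "record": record}


def detach_schema(wrapped):
    """Inverse of attach_schema: return (schema_dict, record)."""
    return json.loads(wrapped["schema"]), wrapped["record"]


if __name__ == "__main__":
    wrapped = attach_schema({"name": "Alyssa", "favorite_number": 256})
    schema, record = detach_schema(wrapped)
    print(schema["name"], record)
```

In a PySpark job, the wrapping step could be applied with rdd.map(attach_schema) before
saving. The JVM-side converter would then have to unpack the schema itself.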

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/staslos/spark PySpark_Avro_Output_Format_example_Spark_1.3.0

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4434.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4434
    
----
commit ef026be7981c6d892e2d2e35e8b100c9def2dd6a
Author: Stanislav Los <stanislav@magnetic.com>
Date:   2015-02-06T20:33:59Z

    SPARK-5657 Add PySpark Avro Output Format example

----


