beam-commits mailing list archives

From "Pablo Estrada (JIRA)" <>
Subject [jira] [Created] (BEAM-1442) Performance improvement of the Python DirectRunner
Date Thu, 09 Feb 2017 01:04:41 GMT
Pablo Estrada created BEAM-1442:

             Summary: Performance improvement of the Python DirectRunner
                 Key: BEAM-1442
             Project: Beam
          Issue Type: Improvement
          Components: sdk-py
            Reporter: Pablo Estrada
            Assignee: Ahmet Altay

The Python and Java DirectRunners are intended to act as policy enforcers and correctness
checkers for Beam pipelines, but some users run data processing tasks on them.
Currently, the Python DirectRunner has less-than-great performance, although some work has
gone into improving it. There are more opportunities for improvement.

Skills for this project:
* Python
* Cython (nice to have)
* Working through the Beam getting started materials (nice to have)

To start investigating this problem, it is advisable to run a simple pipeline and study the
`` and `` methods. Ask questions directly on JIRA.
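
One possible way to take that first step is to run a tiny pipeline on the DirectRunner under
Python's standard `cProfile` module and look at which calls dominate. This is only a rough
sketch: the helper names `run_simple_pipeline` and `profile_callable` are made up for
illustration and are not part of Beam, and the pipeline shape is an arbitrary choice. It
does not assume which methods will turn out to be hot.

```python
import cProfile
import io
import pstats


def run_simple_pipeline(n=1000):
    """Run a tiny pipeline on the DirectRunner (hypothetical helper; needs apache_beam)."""
    # Imported lazily so the profiling helper below is usable without Beam installed.
    import apache_beam as beam

    with beam.Pipeline(runner="DirectRunner") as p:
        (p
         | "Create" >> beam.Create(range(n))
         | "Double" >> beam.Map(lambda x: x * 2)
         | "Sum" >> beam.CombineGlobally(sum))


def profile_callable(fn, top=15):
    """Profile fn() with cProfile and return a text report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        fn()
    finally:
        profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()


# Usage, assuming apache_beam is installed:
#   print(profile_callable(run_simple_pipeline))
```

Sorting by cumulative time tends to surface the runner's own bookkeeping rather than the
user functions, which is usually the more interesting part for this issue.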

This message was sent by Atlassian JIRA
