beam-commits mailing list archives

From "Ankur Goenka (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (BEAM-3189) Python Fnapi - Worker speedup
Date Tue, 14 Nov 2017 23:05:01 GMT

     [ https://issues.apache.org/jira/browse/BEAM-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ankur Goenka updated BEAM-3189:
-------------------------------
    Description: 
The Beam Python SDK is a couple of orders of magnitude slower than the Java SDK when it comes
to stream processing.
There are two related issues:
# Given a single core, we currently do not fully utilize it because the single thread
spends a lot of time on I/O. This is a limitation of our implementation rather than a
limitation of Python.
# Given a machine with multiple cores, a single Python process can only utilize one core.

In this task we will only address issue 1. Issue 2 is left as a future optimization.
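
To illustrate issue 1, here is a minimal sketch, not taken from the Beam SDK harness. It assumes the per-element I/O can be modelled as a blocking wait (time.sleep stands in for it), and all names in it (IO_DELAY_S, process_element, run_sequential, run_threaded) are hypothetical. It only shows that several threads inside one Python process can overlap I/O waits that a single thread has to sit through one after another.

```python
import time
from concurrent.futures import ThreadPoolExecutor

IO_DELAY_S = 0.05  # hypothetical per-element I/O wait, chosen only for illustration


def process_element(x):
    """Simulate one element: a blocking I/O wait followed by a tiny computation."""
    time.sleep(IO_DELAY_S)  # stands in for a blocking read/write; the GIL is released while waiting
    return x * 2


def run_sequential(elements):
    # One thread: every I/O wait stalls the whole worker.
    return [process_element(x) for x in elements]


def run_threaded(elements, max_workers=8):
    # Several threads in one process: the I/O waits overlap.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_element, elements))  # map preserves input order


def timed(fn, *args):
    # Return the function result together with the elapsed wall-clock time.
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


elements = list(range(8))
seq_result, seq_time = timed(run_sequential, elements)
thr_result, thr_time = timed(run_threaded, elements)
print("sequential: %.3fs, threaded: %.3fs" % (seq_time, thr_time))
```

This sketch says nothing about issue 2: the threads here still share one Python process, so CPU-bound work would not spread across cores.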


  was:
Python post commits are failing because the runner harness is not compatible with the sdk
harness.

We need a new runner harness compatible with: https://github.com/apache/beam/commit/80c6f4ec0c2a3cc3a441289a9cc8ff53cb70f863


> Python Fnapi - Worker speedup
> -----------------------------
>
>                 Key: BEAM-3189
>                 URL: https://issues.apache.org/jira/browse/BEAM-3189
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py-harness
>    Affects Versions: 2.3.0
>            Reporter: Ankur Goenka
>            Assignee: Ankur Goenka
>            Priority: Minor
>              Labels: performance, portability
>
> The Beam Python SDK is a couple of orders of magnitude slower than the Java SDK when it comes
> to stream processing.
> There are two related issues:
> # Given a single core, we currently do not fully utilize it because the single thread
> spends a lot of time on I/O. This is a limitation of our implementation rather than a
> limitation of Python.
> # Given a machine with multiple cores, a single Python process can only utilize one core.
> In this task we will only address issue 1. Issue 2 is left as a future optimization.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
