spark-dev mailing list archives

From Renyi Xiong <renyixio...@gmail.com>
Subject pyspark worker concurrency
Date Sun, 07 Feb 2016 02:27:38 GMT
Hi,

Is it a good idea to have two threads in the pyspark worker: the main thread
responsible for receiving and sending data over the socket, while the other
thread calls user functions to process the data?

Since the CPU is idle (?) during network I/O, this should improve concurrency
quite a bit.
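
To make the idea concrete, here is a minimal sketch of the two-thread layout
described above. It is not the actual pyspark worker code: run_worker,
read_items, user_func and write_item are hypothetical names, and plain
iterables and callables stand in for the socket reads and writes.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the input stream


def run_worker(read_items, user_func, write_item, max_pending=16):
    """Hypothetical sketch: main thread does I/O, a second thread computes.

    read_items stands in for blocking socket reads, write_item for socket
    writes. Results are written in the same order the items were read.
    """
    inbox = queue.Queue(maxsize=max_pending)  # bounded: gives backpressure
    outbox = queue.Queue()                    # unbounded: compute never blocks
    errors = []

    def compute():
        while True:
            item = inbox.get()
            if item is _DONE:
                break
            if errors:
                continue  # after a failure, keep draining so main never blocks
            try:
                outbox.put(user_func(item))
            except Exception as exc:
                errors.append(exc)
        outbox.put(_DONE)

    worker = threading.Thread(target=compute, daemon=True)
    worker.start()

    for item in read_items:
        inbox.put(item)
        # Forward whatever results are already available, without waiting.
        while True:
            try:
                result = outbox.get_nowait()
            except queue.Empty:
                break
            write_item(result)

    inbox.put(_DONE)
    # Input is exhausted: wait for the remaining results.
    while True:
        result = outbox.get()
        if result is _DONE:
            break
        write_item(result)

    worker.join()
    if errors:
        raise errors[0]


if __name__ == "__main__":
    out = []
    run_worker(range(5), lambda x: x * x, out.append)
    print(out)
```

One caveat I am fairly sure of: in CPython, blocking socket calls release the
GIL, so the compute thread can run while the main thread waits on the network,
but pure-Python work in the two threads does not execute in parallel.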

Can an expert answer the question? What are the pros and cons here?

Thanks,
Renyi.
