arrow-user mailing list archives

From Tanveer Ahmad - EWI <T.Ah...@tudelft.nl>
Subject Running plasma_store_server (in background) on each Spark worker node
Date Wed, 10 Jun 2020 21:22:51 GMT
Hi all,

I want to run an external command (plasma_store_server -m 3000000000 -s /tmp/store0 &)
in the background on each worker node of my Spark cluster<https://userinfo.surfsara.nl/systems/cartesius/software/spark>,
so that the process keeps running for the whole duration of the Spark job.
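
For illustration, launching that same command in the background from Python on a single node might look roughly like the sketch below. The helper names plasma_store_command and start_in_background are hypothetical, made up for this example; only the plasma_store_server flags (-m, -s) and their values come from the command above. It assumes a POSIX system with plasma_store_server on the PATH, and it does not by itself solve the question of starting the store on every worker node.

```python
import subprocess


def plasma_store_command(memory_bytes, socket_path):
    """Build the argument list for the command quoted in this message.

    Hypothetical helper: it only assembles the -m and -s flags shown above.
    """
    return ["plasma_store_server", "-m", str(memory_bytes), "-s", socket_path]


def start_in_background(command):
    """Start `command` without waiting for it and return the Popen handle.

    Hypothetical helper: the caller keeps the handle so the process can be
    stopped later with .terminate().
    """
    return subprocess.Popen(
        command,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # own session, so it is not tied to the launching shell (POSIX)
    )


# Example usage (assumes plasma_store_server is installed on this node):
#   store = start_in_background(plasma_store_command(3000000000, "/tmp/store0"))
#   ... run the Spark job ...
#   store.terminate()
```

Keeping the Popen handle, rather than appending & in a shell string, is one way to make it possible to stop the store explicitly once the job has finished.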

The plasma_store_server process is used for storing and retrieving Apache Arrow data in Apache
Spark.

I am using PySpark for the Spark programming and SLURM to create the Spark cluster<https://userinfo.surfsara.nl/systems/cartesius/software/spark>.

Any help will be highly appreciated!

Regards,

Tanveer Ahmad

