arrow-user mailing list archives

From Neal Richardson <neal.p.richard...@gmail.com>
Subject Re: Running plasma_store_server (in background) on each Spark worker node
Date Wed, 10 Jun 2020 22:40:47 GMT
Hi Tanveer,
Do you have any specific questions, or have you encountered trouble with
your setup?

Neal

On Wed, Jun 10, 2020 at 2:23 PM Tanveer Ahmad - EWI <T.Ahmad@tudelft.nl>
wrote:

> Hi all,
>
> I want to run an external command (plasma_store_server -m 3000000000 -s
> /tmp/store0 &) in the background on each worker node of my Spark cluster
> <https://userinfo.surfsara.nl/systems/cartesius/software/spark>, so that
> the external process keeps running for the duration of the whole Spark job.
>
> The plasma_store_server process is used for storing and retrieving Apache
> Arrow data in Apache Spark.
>
> I am using PySpark for Spark programming and SLURM to create the Spark
> cluster <https://userinfo.surfsara.nl/systems/cartesius/software/spark>.
>
> Any help will be highly appreciated!
> Regards,
>
> Tanveer Ahmad
>
>
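
For reference, one way to bootstrap the store from PySpark itself (the setup
described in the quoted question above) is to run a small job with at least
one task per worker node and launch the process with subprocess.Popen, so it
stays up alongside the executors. The sketch below is not from this thread; it
assumes plasma_store_server is on the PATH of every node and that num_workers
matches the size of the cluster.

import os
import subprocess

from pyspark.sql import SparkSession

def start_plasma_store(_):
    """Start one plasma store per node; skip if the socket already exists."""
    socket_path = "/tmp/store0"
    if not os.path.exists(socket_path):
        # Popen returns immediately; the store keeps running in the
        # background on the node while the Spark application is alive.
        subprocess.Popen(
            ["plasma_store_server", "-m", "3000000000", "-s", socket_path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    yield socket_path

spark = SparkSession.builder.appName("plasma-bootstrap").getOrCreate()
sc = spark.sparkContext

# num_workers is an assumption: over-partition so that at least one task
# lands on every worker node; duplicate launches are skipped above.
num_workers = 4
sc.parallelize(range(num_workers * 10), num_workers * 10) \
    .mapPartitions(start_plasma_store) \
    .collect()

Alternatively, since the cluster is created through SLURM, the same command
can be started in the sbatch script on each node before spark-submit runs,
which avoids relying on executor task placement.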
