Subject: Re: Python Plasma Store Best Practices
From: Sam Shleifer <sshleifer@gmail.com>
To: user@arrow.apache.org, Micah Kornfield
Date: Tue, 02 Mar 2021 17:49:41 +0000

Thanks, had no idea!

On Tue, Mar 02, 2021 at 12:00 PM, Micah Kornfield <emkornfield@gmail.com> wrote:

> Hi Sam,
>
> I think the lack of responses might be because Plasma is not being
> actively maintained. The original authors have forked it into the Ray
> project.
>
> I'm sorry I don't have the expertise to answer your questions.
>
> -Micah
>
> On Mon, Mar 1, 2021 at 6:48 PM Sam Shleifer <sshleifer@gmail.com> wrote:
>
>> Partial answers are super helpful!
>>
>> I'm happy to break this up if it's too much for 1 question @moderators
>>
>> Sam
>>
>> On Sat, Feb 27, 2021 at 1:27 PM, Sam Shleifer <sshleifer@gmail.com> wrote:
>>
>>> Hi!
>>>
>>> I am trying to use the plasma store to reduce the memory usage of a
>>> PyTorch dataset/dataloader combination, and I have 4 questions. I don't
>>> think any of them require PyTorch knowledge. If you prefer to comment
>>> inline, there is a quip with identical content and prettier formatting
>>> here: https://quip.com/3mwGAJ9KR2HT
>>>
>>> *1)* My script starts the plasma-store from Python with 200 GB:
>>>
>>>     nbytes = (1024 ** 3) * 200
>>>     _server = subprocess.Popen(["plasma_store", "-m", str(nbytes), "-s", path])
>>>
>>> where nbytes is chosen arbitrarily. From my experiments it seems that one
>>> should start the store as large as possible within the limits of /dev/shm.
>>> I wanted to verify whether this is actually the best practice (it would be
>>> hard for my app to know its storage needs up front), and also whether
>>> there is an automated way to figure out how much storage to allocate.
>>>
>>> *2)* Does the plasma store support simultaneous reads? My code, which has
>>> multiple clients all asking for the 6 arrays from the plasma-store
>>> thousands of times, was segfaulting with different errors, e.g.
>>>
>>>     Check failed: RemoveFromClientObjectIds(object_id, entry, client) == 1
>>>
>>> until I added a lock around my client.get:
>>>
>>>     if self.use_lock:  # Fix segfault
>>>         with FileLock("/tmp/plasma_lock"):
>>>             ret = self.client.get(self.object_id)
>>>     else:
>>>         ret = self.client.get(self.object_id)
>>>
>>> which fixes it.
>>>
>>> Here is a full traceback of the failure without the lock:
>>> https://gist.github.com/sshleifer/75145ba828fcb4e998d5e34c46ce13fc
>>>
>>> Is this expected behavior?
>>>
>>> *3)* Is there a simple way to add many objects to the plasma store at
>>> once? Right now, we are considering changing
>>>
>>>     oid = client.put(array)
>>>
>>> to
>>>
>>>     oids = [client.put(x) for x in array]
>>>
>>> so that we can fetch one entry at a time, but the writes are much slower.
>>>
>>> * 3a) Is there a lower-level interface for bulk writes?
>>>
>>> * 3b) Or is it recommended to chunk the array and have different Python
>>> processes write simultaneously to make this faster?
>>>
>>> *4)* Is there a way to save/load the contents of the plasma-store to disk
>>> without loading everything into memory and then saving it to some other
>>> format?
>>>
>>> Replication
>>>
>>> Setup instructions for fairseq + replicating the segfault:
>>> https://gist.github.com/sshleifer/bd6982b3f632f1d4bcefc9feceb30b1a
>>>
>>> My code is here: https://github.com/pytorch/fairseq/pull/3287
>>>
>>> Thanks!
>>>
>>> Sam
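On question 1, one alternative to hard-coding 200 GB is to size the store from the space actually free under /dev/shm at startup, since the store is shm-backed. A sketch (the helper names and the 0.8 safety fraction are my own assumptions, not guidance from the Arrow docs):

```python
import shutil
import subprocess


def shm_budget_bytes(fraction=0.8, shm_path="/dev/shm"):
    """Return a store size: a fraction of the space currently free
    under shm_path. The 0.8 default is an arbitrary safety margin."""
    free = shutil.disk_usage(shm_path).free
    return int(free * fraction)


def start_plasma_store(socket_path, nbytes):
    """Launch plasma_store as a child process, as in the original script.
    The caller must keep the returned handle alive (and terminate it)."""
    return subprocess.Popen(
        ["plasma_store", "-m", str(nbytes), "-s", socket_path]
    )
```

Usage would mirror the original script: `start_plasma_store(path, shm_budget_bytes())` instead of a fixed `(1024 ** 3) * 200`. This still over-allocates if the dataset is small; it only avoids asking for more than /dev/shm can hold.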
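For question 2, a stdlib-only variant of the FileLock workaround, using fcntl.flock so multiple reader processes serialize their client.get calls (Unix-only; `guarded_get` and the lock path are illustrative names, and whether Plasma really requires this serialization is exactly the open question):

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def file_lock(path):
    """Exclusive cross-process advisory lock via fcntl.flock (Unix only)."""
    with open(path, "w") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


def guarded_get(client, object_id, lock_path="/tmp/plasma_lock"):
    """Serialize client.get across processes, mirroring the FileLock fix."""
    with file_lock(lock_path):
        return client.get(object_id)
```

This trades read concurrency for stability, so if Plasma is supposed to support simultaneous reads, the lock is papering over a bug rather than fixing one.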

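On question 3, one middle ground between the two layouts (one huge put vs. one put per entry) is a single flattened object plus an offsets index: one write, but readers can still slice out one entry at a time. A sketch with plain lists (`pack_rows`/`get_row` are hypothetical helpers, not Plasma API; with real arrays the same idea applies to a concatenated buffer):

```python
from itertools import accumulate


def pack_rows(rows):
    """Flatten variable-length rows into (flat, offsets). Putting this
    pair into the store once replaces len(rows) small client.put calls."""
    offsets = [0] + list(accumulate(len(r) for r in rows))
    flat = [x for row in rows for x in row]
    return flat, offsets


def get_row(flat, offsets, i):
    """Recover row i by slicing, without touching the other rows."""
    return flat[offsets[i]:offsets[i + 1]]
```

Readers fetch the one packed object (or keep it mapped) and call `get_row` per entry, which avoids both the per-entry write overhead and re-deserializing the whole dataset on every access.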
