reef-dev mailing list archives

From Markus Weimer
Subject Re: new runtime: stand-alone distributed runtime?
Date Thu, 31 Dec 2015 17:26:56 GMT
On 2015-12-31 00:14, John Yang wrote:
> So please allow me to ask some more questions.

Go ahead. But understand that I've not given this nearly as much thought
as you :-)

> - In which directory to scp the jars/resources? - /tmp comes to my
> mind

We could make that configurable, and default to something like
`./REEF_SSH/` with the same naming scheme as for the local runtime.
Using a relative path puts it into the user's $HOME directory for easy
access. And using the local runtime's file system layout ensures that in
the (common) case of NFS-mounted $HOME, we don't produce name clashes.

Actually, for NFS-mounted $HOME, we could even skip the `scp` step
entirely. But that is for later stages :)
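
A rough sketch of what that staging step could look like, just to make
the idea concrete. `SshStaging`, its method names and the job folder
name are made up for illustration and are not part of REEF. The sketch
only builds the command lines. It assumes key-based login
(`BatchMode=yes` makes ssh fail instead of prompting for a password) and
creates the remote root first, because `scp` does not create missing
parent directories.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of REEF): builds command lines for staging job files over SSH. */
final class SshStaging {
  /** Default remote root. A relative remote path is resolved against the SSH user's home directory. */
  static final String DEFAULT_REMOTE_ROOT = "./REEF_SSH";

  /** Builds: ssh -o BatchMode=yes user@host mkdir -p remoteRoot */
  static List<String> mkdirCommand(String user, String host, String remoteRoot) {
    return Arrays.asList("ssh", "-o", "BatchMode=yes", user + "@" + host, "mkdir", "-p", remoteRoot);
  }

  /** Builds: scp -r -o BatchMode=yes localJobFolder user@host:remoteRoot/jobFolderName */
  static List<String> uploadCommand(String localJobFolder, String user, String host,
                                    String remoteRoot, String jobFolderName) {
    final String target = user + "@" + host + ":" + remoteRoot + "/" + jobFolderName;
    return Arrays.asList("scp", "-r", "-o", "BatchMode=yes", localJobFolder, target);
  }
}
```

Either list can be handed to `new ProcessBuilder(command)`. How the job
folder name is derived is left open here; it would follow whatever the
local runtime does.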

> - How shall we retrieve the Evaluator logs? - Somehow redirect
> stdout/stderr through the SSH connection to the Driver to be stored
> on its node

That would put a lot of load on the Driver machine's network connection.
On the plus side, it makes the logs instantly available.

Another alternative, with the opposite trade-off, would be to write the
logs locally on the Evaluator side and then scp them to the Driver at
the end. From there, we could scp them to the Client after the Driver
exits. The logs take longer to become available this way, but it is
likely more robust and efficient.
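
To illustrate the second option: a minimal sketch, assuming the
Evaluator is launched through a `ProcessBuilder` and that its log folder
is later fetched with `scp -r`. The class name `EvaluatorLogs` and the
file names `stdout.txt` / `stderr.txt` are assumptions for this example,
not something we have agreed on.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of REEF): local log files first, one bulk copy at the end. */
final class EvaluatorLogs {
  /** Prepares (but does not start) the Evaluator process with stdout/stderr written to local files. */
  static ProcessBuilder withLocalLogs(List<String> evaluatorCommand, File logDir) {
    return new ProcessBuilder(evaluatorCommand)
        .redirectOutput(new File(logDir, "stdout.txt"))
        .redirectError(new File(logDir, "stderr.txt"));
  }

  /** Builds: scp -r -o BatchMode=yes user@host:remoteLogDir localTargetDir */
  static List<String> fetchCommand(String user, String host, String remoteLogDir, String localTargetDir) {
    return Arrays.asList("scp", "-r", "-o", "BatchMode=yes",
        user + "@" + host + ":" + remoteLogDir, localTargetDir);
  }
}
```

The Driver would run `fetchCommand` once per Evaluator after it exits,
so nothing is streamed over the Driver's network link while the job is
running.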

> - How shall we teach the SSH runtime the available hosts? - Maybe we
> allow Tang configuration of List<String> that represent a list of ip
> addresses

I'd use a `Set`, as the order likely doesn't matter. We also need a
username for the ssh connections. For a first cut, I believe it is OK
for us to assume that public keys have been set up for authentication.
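
Something along these lines, as a plain-Java sketch of the information
the runtime needs. This deliberately does not use Tang; in REEF the
values would presumably arrive as named parameters, and
`SshRuntimeConfig` and its methods are hypothetical names. `BatchMode=yes`
encodes the public-key assumption: ssh fails rather than asking for a
password.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical holder (not part of REEF) for the hosts and user name of an SSH runtime. */
final class SshRuntimeConfig {
  private final Set<String> hosts;
  private final String userName;

  SshRuntimeConfig(Collection<String> hosts, String userName) {
    // A Set drops duplicate addresses and makes no promise about order.
    this.hosts = Collections.unmodifiableSet(new HashSet<>(hosts));
    this.userName = userName;
  }

  Set<String> getHosts() {
    return hosts;
  }

  /** Builds: ssh -o BatchMode=yes userName@host remoteCommand */
  List<String> sshCommand(String host, String remoteCommand) {
    if (!hosts.contains(host)) {
      throw new IllegalArgumentException("Unknown host: " + host);
    }
    return Arrays.asList("ssh", "-o", "BatchMode=yes", userName + "@" + host, remoteCommand);
  }
}
```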

