reef-dev mailing list archives

From John Yang <johnya...@gmail.com>
Subject Re: new runtime: stand-alone distributed runtime?
Date Thu, 31 Dec 2015 08:14:52 GMT
I believe that when an SSH connection dies, the foreground program running
through it dies as well? Also, there's no such thing as preemption in
standalone mode. So yeah, I think that makes sense.
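
A minimal sketch of how a Driver-side SSH runtime might map the exit code of
the local ssh process to an Evaluator outcome. It relies on documented ssh
behaviour: ssh exits with the remote command's exit status, or with 255 if an
error occurred (for example, a lost connection). The class name and the
Outcome enum are hypothetical and are not part of REEF.

```java
/** Hypothetical helper, not a REEF API: classifies the exit code of a local `ssh` process. */
public final class SshExitStatus {

    /** Hypothetical outcome labels for an Evaluator launched over SSH. */
    public enum Outcome { COMPLETED, FAILED, CONNECTION_LOST }

    private SshExitStatus() { }

    /**
     * ssh returns the remote command's exit status, or 255 when ssh itself
     * hit an error such as a dropped connection.
     */
    public static Outcome classify(int sshExitCode) {
        if (sshExitCode == 0) {
            return Outcome.COMPLETED;
        }
        if (sshExitCode == 255) {
            // Ambiguous: a remote program that itself exits with 255 looks the same.
            return Outcome.CONNECTION_LOST;
        }
        return Outcome.FAILED;
    }
}
```

Either non-zero case could simply be reported as a failed Evaluator; keeping
them apart only helps with diagnostics.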


I'm actually helping an intern work on this, so please allow me to
ask some more questions.

   - Which directory should we scp the jars/resources to?
      - /tmp comes to my mind
   - How shall we retrieve the Evaluator logs?
      - Somehow redirect stdout/stderr through the SSH connection to the
      Driver, to be stored on its node (in a format similar to the
      REEF_LOCAL_RUNTIME folder)
   - How shall we teach the SSH runtime the available hosts?
      - Maybe we allow a Tang configuration of List<String> that represents a
      list of IP addresses
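
The log and host ideas above could be sketched roughly as follows, assuming
the Driver starts one local ssh process per host and lets the JVM redirect
that process's stdout/stderr into files on the Driver node. The class name,
method names, and log file names are hypothetical; how the host list is
obtained (for example through Tang) is left out. BatchMode=yes is a standard
ssh option that disables interactive password prompts.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not REEF code: launches a command on a host through the local ssh client. */
public final class SshEvaluatorLauncher {

    private SshEvaluatorLauncher() { }

    /** Builds the local ssh command line for one host. */
    public static List<String> buildCommand(String host, String remoteCommand) {
        List<String> cmd = new ArrayList<>();
        cmd.add("ssh");
        cmd.add("-o");
        cmd.add("BatchMode=yes"); // fail instead of prompting for a password
        cmd.add(host);
        cmd.add(remoteCommand);
        return cmd;
    }

    /**
     * Starts the ssh process and stores the remote stdout/stderr in logDir on
     * the Driver node. The file names are assumptions for illustration.
     */
    public static Process launch(String host, String remoteCommand, File logDir)
            throws IOException {
        if (!logDir.isDirectory() && !logDir.mkdirs()) {
            throw new IOException("Cannot create log directory " + logDir);
        }
        ProcessBuilder pb = new ProcessBuilder(buildCommand(host, remoteCommand));
        pb.redirectOutput(new File(logDir, "stdout.txt"));
        pb.redirectError(new File(logDir, "stderr.txt"));
        return pb.start();
    }
}
```

Calling waitFor() on the returned Process yields the ssh exit code once the
remote program ends or the connection drops, and iterating over the configured
List<String> of hosts would give one such process per Evaluator.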


Thanks,
John


On Thu, Dec 31, 2015 at 4:53 PM, Markus Weimer <markus@weimo.de> wrote:

> On 2015-12-30 23:37, John Yang wrote:
>
>> Regarding option 2, how do you plan to retrieve the exit status of an
>> Evaluator? REEF relies on the underlying resource manager layer to
>> report unclean Evaluator exits (e.g., failure, preemption).
>>
>
> Excellent question. I haven't thought about it. While all machines are
> running and have live SSH connections, this should be "easy". It becomes
> tricky if the connections are interrupted. At that time, an SSH runtime
> could just declare the Evaluator `failed`, right?
>
> Markus
>
>
