reef-dev mailing list archives

From "Andrew Chung (JIRA)" <>
Subject [jira] [Commented] (REEF-1250) Memory leak in Evaluators
Date Wed, 04 May 2016 22:24:12 GMT


Andrew Chung commented on REEF-1250:

[~MariiaMykhailova] [~markus.weimer] Note that if we remove an Evaluator from {{Evaluators}}
immediately after it finishes, a buggy Evaluator that sends another heartbeat after its
{{FailedEvaluator}} or {{DoneEvaluator}} message may trigger a {{RuntimeException}}, because it
is no longer in {{Evaluators}}. REEF-1374 is an example where this may happen. As of now, an
Evaluator that is {{DONE}} is still kept in {{Evaluators}}, so the {{EvaluatorManager}} is
fetched and the heartbeat with the {{FailedTask}} is subsequently ignored.
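
To make the hazard concrete, here is a minimal sketch of the eager-removal policy, assuming a simplified registry keyed by Evaluator id. {{EagerRemovalSketch}} and its method names are hypothetical stand-ins for illustration, not REEF's actual {{Evaluators}} or {{EvaluatorManager}} API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Driver-side registry; NOT REEF's real Evaluators class.
final class EagerRemovalSketch {
  private final Map<String, String> states = new HashMap<>();

  void register(final String evaluatorId) {
    states.put(evaluatorId, "RUNNING");
  }

  // Eager policy: forget the Evaluator as soon as it reports that it finished.
  void onFinished(final String evaluatorId) {
    states.remove(evaluatorId);
  }

  // In this sketch, a heartbeat from an id that is not tracked is treated as an error.
  void onHeartbeat(final String evaluatorId) {
    if (!states.containsKey(evaluatorId)) {
      throw new RuntimeException("Heartbeat from unknown Evaluator " + evaluatorId);
    }
  }

  public static void main(final String[] args) {
    final EagerRemovalSketch registry = new EagerRemovalSketch();
    registry.register("evaluator-1");
    registry.onFinished("evaluator-1");
    try {
      registry.onHeartbeat("evaluator-1"); // late heartbeat caused by a bug
    } catch (final RuntimeException e) {
      System.out.println("Late heartbeat rejected: " + e.getMessage());
    }
  }
}
```

With this policy the registry no longer grows, but any late heartbeat surfaces as an exception in the Driver instead of being ignored.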

Personally, I think the best fix is to only remove the Evaluator from {{Evaluators}} after
the Resource Manager tells us that the Evaluator is done, rather than after our Evaluator
sends a {{DONE}} heartbeat.
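
A rough sketch of that idea follows, assuming a simplified registry keyed by Evaluator id. The class {{DeferredRemovalSketch}}, its {{State}} enum, and the callback names are hypothetical and chosen for illustration; they are not REEF's actual API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the Driver-side registry; NOT REEF's real Evaluators class.
final class DeferredRemovalSketch {
  enum State { RUNNING, DONE }

  private final Map<String, State> evaluators = new ConcurrentHashMap<>();

  void register(final String evaluatorId) {
    evaluators.put(evaluatorId, State.RUNNING);
  }

  // The Evaluator reported DONE: mark it as finished, but keep the entry.
  void onDoneHeartbeat(final String evaluatorId) {
    evaluators.replace(evaluatorId, State.DONE);
  }

  // Returns true if the heartbeat is processed, false if it is ignored
  // because the Evaluator has already finished.
  boolean onHeartbeat(final String evaluatorId) {
    final State state = evaluators.get(evaluatorId);
    if (state == null) {
      throw new RuntimeException("Heartbeat from unknown Evaluator " + evaluatorId);
    }
    return state == State.RUNNING;
  }

  // The Resource Manager reported that the Evaluator is done: now forget it.
  void onResourceManagerCompleted(final String evaluatorId) {
    evaluators.remove(evaluatorId);
  }

  int size() {
    return evaluators.size();
  }

  public static void main(final String[] args) {
    final DeferredRemovalSketch registry = new DeferredRemovalSketch();
    registry.register("evaluator-1");
    registry.onDoneHeartbeat("evaluator-1");
    System.out.println("Late heartbeat processed: " + registry.onHeartbeat("evaluator-1"));
    registry.onResourceManagerCompleted("evaluator-1");
    System.out.println("Tracked Evaluators: " + registry.size());
  }
}
```

In this sketch, a late heartbeat that arrives between the {{DONE}} heartbeat and the Resource Manager notification is ignored rather than rejected, and the entry is still released eventually, so the registry does not grow without bound.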

> Memory leak in Evaluators
> -------------------------
>                 Key: REEF-1250
>                 URL:
>             Project: REEF
>          Issue Type: Bug
>          Components: REEF Driver
>            Reporter: Markus Weimer
>            Assignee: Mariia Mykhailova
>            Priority: Minor
> In {{Evaluators}}, we keep track of all the Evaluators that ever existed, including the
> ones that have failed or been returned. For very long running Drivers, this is a memory leak.

