hadoop-mapreduce-issues mailing list archives

From "Jonathan Hung (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-6885) JobHistory event handling does not complete if handling event throws exception on shutdown
Date Sat, 06 May 2017 01:53:04 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-6885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hung updated MAPREDUCE-6885:
-------------------------------------
    Description: 
If eventHandlingThread throws an exception while handling an event (e.g. if it is unable to
flush an event to HDFS), the thread dies. The events remain enqueued and are eventually
handled when JobHistoryEventHandler stops. If handling these events also throws an exception,
the remaining events are lost. For example, the job history files may then never be moved to
mapreduce.jobhistory.done-dir.

There should be some fail-proof logic here to prevent these events from being lost. We should
also take care that the same exception is not thrown for each event, so that the logs are not
cluttered with the same stack trace. Perhaps we can allow a configurable number of failed
handleEvent calls before finally giving up on a clean shutdown.
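
One way to picture the proposal above is a drain loop that keeps handling queued events after
a failure, prints a full stack trace only when the failure differs from the previous one, and
gives up once a failure limit is reached. This is only a sketch under assumptions: BoundedDrain,
Result, drain and maxFailures are hypothetical names chosen for illustration, and none of this
is the actual JobHistoryEventHandler code or an existing Hadoop configuration key.

```java
import java.util.Queue;
import java.util.function.Consumer;

/** Hypothetical helper for illustration; not part of Hadoop's JobHistoryEventHandler. */
public class BoundedDrain {

    /** Outcome of one drain pass. */
    public static final class Result {
        public final int handled;   // events handled without an exception
        public final int failed;    // events whose handler threw
        public final int abandoned; // events still queued when we gave up

        Result(int handled, int failed, int abandoned) {
            this.handled = handled;
            this.failed = failed;
            this.abandoned = abandoned;
        }
    }

    /**
     * Handles every queued event, tolerating up to maxFailures failed calls.
     * maxFailures stands in for the "configurable number" suggested above.
     */
    public static <E> Result drain(Queue<E> queue, Consumer<? super E> handler, int maxFailures) {
        int handled = 0;
        int failed = 0;
        String lastSignature = null; // class + message of the last logged failure

        while (!queue.isEmpty()) {
            if (failed >= maxFailures) {
                // Limit reached: stop trying and report what is left behind.
                System.err.println("Giving up after " + failed + " failed events; "
                        + queue.size() + " events abandoned");
                return new Result(handled, failed, queue.size());
            }
            E event = queue.poll();
            try {
                handler.accept(event);
                handled++;
            } catch (RuntimeException e) {
                failed++;
                String signature = e.getClass().getName() + ": " + e.getMessage();
                if (signature.equals(lastSignature)) {
                    // Same failure as before: one line instead of a full stack trace.
                    System.err.println("Same failure again for event " + event + ": " + signature);
                } else {
                    e.printStackTrace();
                    lastSignature = signature;
                }
            }
        }
        return new Result(handled, failed, 0);
    }
}
```

The sketch only bounds the number of failures. It does not retry a failed event or persist the
abandoned ones, and a real fix would have to decide both.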

  was:
If eventHandlingThread throws an exception while handling an event (e.g. if it is unable to
flush an event to HDFS), the thread dies. This thread is responsible for moving job history
files to mapreduce.jobhistory.done-dir, so if an exception is thrown the files will not be
moved there.

We should catch these exceptions so that the thread can still move these files when the job
is complete.


> JobHistory event handling does not complete if handling event throws exception on shutdown
> ------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6885
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6885
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Jonathan Hung



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


