hadoop-mapreduce-issues mailing list archives

From "Robert Joseph Evans (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3944) JobHistory web services are slower then the UI and can easly overload the JH
Date Wed, 29 Feb 2012 22:51:57 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219627#comment-13219627 ]

Robert Joseph Evans commented on MAPREDUCE-3944:

There are two things that I can think of that can help make this faster.  The first is
to move as much of the web service's filtering as possible up to PartialJob.  This will
let us avoid loading as many full jobs.  The second is to fix the locking in
getJob so that we can load more than one job at a time.  We need to somehow lock on
the JobID and not on the HistoryServer.
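
To illustrate the locking change, here is a minimal sketch of locking per job ID instead of on one shared object. It is not the JobHistory code: the class name, the String job IDs, and the loadFullJob helper are all hypothetical stand-ins for the real types and the expensive history-file parse.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: none of these names come from the Hadoop source.
final class PerJobLocking {
    // One lock object per job ID, so loads of different jobs do not block each other.
    private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();
    private final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
    private final AtomicInteger loads = new AtomicInteger();

    String getJob(String jobId) {
        String job = cache.get(jobId);
        if (job != null) {
            return job; // fast path: already loaded, no lock taken
        }
        Object lock = locks.computeIfAbsent(jobId, id -> new Object());
        synchronized (lock) {
            // Re-check under the lock: another thread may have loaded this job meanwhile.
            job = cache.get(jobId);
            if (job == null) {
                job = loadFullJob(jobId);
                cache.put(jobId, job);
            }
            return job;
        }
    }

    // Stand-in for the expensive parse of a full job history file (assumption).
    private String loadFullJob(String jobId) {
        loads.incrementAndGet();
        return "full:" + jobId;
    }

    int loadCount() {
        return loads.get();
    }
}
```

With this shape, two requests for the same job still serialize and load it only once, while requests for different jobs proceed in parallel.  The sketch never removes entries from the lock map, so a real version would have to clean it up as jobs are evicted.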

We probably also want to investigate how we are doing caching.  If we are using LRU to remove
entries, the current code will loop through all entries.  This is the worst possible case for LRU:
if the cache is small, it will force us to load all entries unless we are lucky and the ones
at the start of the list are in the cache.  It would be preferable to have something that
lets us indicate what kind of access we are doing and expunge entries randomly instead of by LRU in this one
case.  We probably also want to warn people about bad configuration settings when using the
web service.
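
The LRU concern can be reproduced with a small simulation.  This is only a sketch under assumed numbers: jobs are plain integers, the cache is a java.util.LinkedHashMap in access order, and the class and method names are hypothetical.  It counts how many full loads a repeated in-order scan causes when the cache is smaller than the job list, under LRU and under random eviction.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

// Hypothetical simulation, not the history server's cache.
final class ScanEvictionDemo {
    // Misses when scanning jobs 0..jobs-1 in order, `passes` times, with an LRU cache.
    static int lruMisses(final int capacity, int jobs, int passes) {
        Map<Integer, Boolean> cache =
            new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) { // true = access order
                @Override
                protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                    return size() > capacity; // evict the least recently used entry
                }
            };
        int misses = 0;
        for (int pass = 0; pass < passes; pass++) {
            for (int job = 0; job < jobs; job++) {
                if (cache.get(job) == null) {
                    misses++; // would require loading the full job
                    cache.put(job, Boolean.TRUE);
                }
            }
        }
        return misses;
    }

    // Same scan, but a full cache evicts a uniformly random entry instead.
    static int randomMisses(int capacity, int jobs, int passes, long seed) {
        Random rng = new Random(seed);
        List<Integer> slots = new ArrayList<>();
        Set<Integer> cached = new HashSet<>();
        int misses = 0;
        for (int pass = 0; pass < passes; pass++) {
            for (int job = 0; job < jobs; job++) {
                if (!cached.contains(job)) {
                    misses++;
                    if (slots.size() == capacity) {
                        int victim = rng.nextInt(slots.size());
                        cached.remove(slots.get(victim));
                        slots.set(victim, job);
                    } else {
                        slots.add(job);
                    }
                    cached.add(job);
                }
            }
        }
        return misses;
    }
}
```

With a capacity of 5 and 10 jobs, ScanEvictionDemo.lruMisses(5, 10, 2) returns 20: each entry is evicted just before the scan comes back to it, so the second pass misses every time.  Random eviction cannot do worse than that on the same scan, and any entry that happens to survive becomes a hit.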
> JobHistory web services are slower then the UI and can easly overload the JH
> ----------------------------------------------------------------------------
>                 Key: MAPREDUCE-3944
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3944
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.1, 0.23.2
>            Reporter: Robert Joseph Evans
>            Assignee: Robert Joseph Evans
>            Priority: Blocker
> When our first customer started using the Job History web services today the History
> Server ground to a halt.  We found 250 Jetty threads stuck on the following stack trace.
> {noformat}
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.hadoop.mapreduce.v2.hs.JobHistory.getJob(JobHistory.java:898)
>         - waiting to lock <0x00002aaab364ba60> (a org.apache.hadoop.mapreduce.v2.hs.JobHistory)
>         at org.apache.hadoop.mapreduce.v2.hs.webapp.HsWebServices.getJobs(HsWebServices.java:188)
> {noformat}
> HsWebServices.java:188 corresponds to the /mapreduce/jobs service.
> Looking at the code there are a number of optimizations that need to be done to improve
> its performance.
