hadoop-mapreduce-issues mailing list archives

From "Siddharth Seth (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3944) JobHistory web services are slower then the UI and can easly overload the JH
Date Thu, 01 Mar 2012 20:43:58 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220342#comment-13220342 ]

Siddharth Seth commented on MAPREDUCE-3944:
-------------------------------------------

bq. There are two things that I can think of that can help make this faster. The first one
is to move as much filtering for the web service up to use PartialJob where possible. This
will allow us to not load as many full jobs. The next one is that we need to fix the locking
on getJob, so that we can be loading more than one job at a time. We need to somehow lock
on the JobID and not on the HistoryServer.
+1 for these two changes. JobHistory used to have a method that filtered on some parameters.
It was removed a while ago because it was never used. You may want to dig it out of the
revision history for JobHistory.java.
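
A rough sketch of the "lock on the JobID" idea, in case it helps the discussion. This is not the actual JobHistory code: the class name PerJobLoader, its loader function, and the unbounded maps are all assumptions made for illustration, and a real version would need to tie into cache eviction.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/**
 * Hypothetical helper: loads each job at most once while holding a lock
 * that is specific to that job ID, so different jobs can load in parallel.
 */
class PerJobLoader<K, V> {
    // One lock object per job ID. Assumption: never evicted in this sketch.
    private final ConcurrentMap<K, Object> locks = new ConcurrentHashMap<>();
    // Loaded jobs. Assumption: unbounded here; a real cache would have a size limit.
    private final ConcurrentMap<K, V> cache = new ConcurrentHashMap<>();
    // Stand-in for the expensive "parse the history file" step; must not return null.
    private final Function<K, V> loader;

    PerJobLoader(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K jobId) {
        // Fast path: already loaded, no locking needed.
        V job = cache.get(jobId);
        if (job != null) {
            return job;
        }
        // Slow path: only threads asking for the same job ID contend here.
        Object lock = locks.computeIfAbsent(jobId, k -> new Object());
        synchronized (lock) {
            job = cache.get(jobId); // re-check after acquiring the lock
            if (job == null) {
                job = loader.apply(jobId);
                cache.put(jobId, job);
            }
            return job;
        }
    }
}
```

With this shape, a slow load of one job blocks only other requests for that same job, instead of every caller of getJob.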

Because the webservice builds a CompletedJob object for every job it has to return, the number
of CompletedJob objects in memory can grow much bigger than the configured CompletedJobCache
size and can cause the JobHistory server to go OOM. MAPREDUCE-3755 will help avoid this and
will also make this webservice call faster.
For now, I'd propose a hard limit on the number of entries this webservice can return.
Also, I'd prefer a webservice that returns jobIds (based on the specified filters) instead
of completed job info. The jobs/{jobId} webservice can then be used to pull job
info for each job.
The order of the returned results can be problematic as well. The webservice doesn't have an
order parameter, so it depends on whatever order the history server returns, which can
change in the future.
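
To make the last three points concrete, here is a minimal sketch that filters lightweight summaries, sorts them in an explicit order, applies a hard limit, and returns only job IDs. Everything in it is hypothetical: JobSummary stands in for whatever partial job information is available without loading a full job, and the sort key (finish time, then job ID) is just one possible choice of order.

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class JobIdQuery {
    /** Hypothetical lightweight summary; not the real PartialJob API. */
    static final class JobSummary {
        final String jobId;
        final String user;
        final long finishTime;

        JobSummary(String jobId, String user, long finishTime) {
            this.jobId = jobId;
            this.user = user;
            this.finishTime = finishTime;
        }
    }

    /**
     * Returns at most maxResults job IDs that match the filter, in a
     * deterministic order (finish time, then job ID as a tie-breaker).
     * No full job is loaded; only the summaries are inspected.
     */
    static List<String> matchingJobIds(List<JobSummary> summaries,
                                       Predicate<JobSummary> filter,
                                       int maxResults) {
        return summaries.stream()
                .filter(filter)
                .sorted(Comparator.comparingLong((JobSummary s) -> s.finishTime)
                        .thenComparing(s -> s.jobId))
                .limit(maxResults) // hard cap on the size of the response
                .map(s -> s.jobId)
                .collect(Collectors.toList());
    }
}
```

A client would then call the jobs/{jobId} webservice for each returned ID, so the server never has to build more than one full job per request.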
                
> JobHistory web services are slower then the UI and can easly overload the JH
> ----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3944
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3944
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.1, 0.23.2
>            Reporter: Robert Joseph Evans
>            Assignee: Robert Joseph Evans
>            Priority: Blocker
>
> When our first customer started using the Job History web services today the History
> Server ground to a halt.  We found 250 Jetty threads stuck on the following stack trace.
> {noformat}
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.hadoop.mapreduce.v2.hs.JobHistory.getJob(JobHistory.java:898)
>         - waiting to lock <0x00002aaab364ba60> (a org.apache.hadoop.mapreduce.v2.hs.JobHistory)
>         at org.apache.hadoop.mapreduce.v2.hs.webapp.HsWebServices.getJobs(HsWebServices.java:188)
> {noformat}
> HsWebServices.java:188 corresponds to the /mapreduce/jobs service.
> Looking at the code there are a number of optimizations that need to be done to improve
> its performance.

