Date: Fri, 9 Mar 2012 21:00:58 +0000 (UTC)
From: "Robert Joseph Evans (Commented) (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Message-ID: <1554577958.45465.1331326858752.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1579832110.24136.1330985637425.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (MAPREDUCE-3973) [Umbrella JIRA] JobHistoryServer performance improvements in YARN+MR

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226450#comment-13226450 ]

Robert Joseph Evans commented on MAPREDUCE-3973:
------------------------------------------------

I have been thinking about how to speed up the job history server, and also about how we are going to cache all of the jobs running on some of our 4000+ node clusters, where we can run in excess of 60000 jobs a day. It seems to me that the job history server is already complex and is going to get even more complicated as we try to add yet another cache (MAPREDUCE-3966) and possibly one more on top of that (MAPREDUCE-3755). What is more, all of these caches are backed by files in HDFS. We need to keep all of these caches consistent with one another and with the operations that happen to the files on HDFS, and do it without a single huge lock (MAPREDUCE-3972). I really think that we could remove the vast majority of this complexity by changing how we cache this data.

I would like to propose that we abstract away how we cache the data behind a pluggable data access layer API. The default implementation of this API would store the data in an embedded version of Derby (http://db.apache.org/derby) or some other embedded SQL database that fits our licensing requirements. Because the layer is pluggable, users could replace Derby with MySQL, Oracle, or even HBase with minimal effort. I think that this would drastically reduce the size and complexity of our code, that it would speed up the code and the web service a lot, and that it would open up the potential for us to move to a truly stateless history server. I would love to get something like this into 0.23.3 instead of having to do much of the other work that has been suggested. I am not tied to this approach nor to this timeframe.
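
To make the proposal a bit more concrete, here is a minimal sketch of what such a data access layer might look like. Every name in it (HistoryStore, JobSummary, InMemoryHistoryStore, HistoryStoreDemo) is hypothetical and is not taken from the Hadoop code base. The in-memory implementation is only a stand-in so the sketch runs without extra dependencies; under this idea, a Derby-, MySQL-, or HBase-backed class would implement the same interface.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

// Small demo entry point; declared first so the file can be run directly.
class HistoryStoreDemo {
    public static void main(String[] args) {
        HistoryStore store = new InMemoryHistoryStore();
        store.put(new JobSummary("job_0001", "alice", "SUCCEEDED"));
        store.put(new JobSummary("job_0002", "bob", "FAILED"));

        System.out.println(store.get("job_0001").map(j -> j.state).orElse("unknown"));
        System.out.println(store.findByUser("bob").size());

        // Treating the store as a pure cache: wipe it, as on a server restart.
        store.clear();
        System.out.println(store.get("job_0001").isPresent());
    }
}

// Hypothetical summary of a finished job; the fields are illustrative only.
final class JobSummary {
    final String jobId;
    final String user;
    final String state;

    JobSummary(String jobId, String user, String state) {
        this.jobId = jobId;
        this.user = user;
        this.state = state;
    }
}

// Hypothetical pluggable data access layer. The history server would talk
// only to this interface, never to a concrete backend.
interface HistoryStore {
    void put(JobSummary job);                   // insert or overwrite by job id
    Optional<JobSummary> get(String jobId);     // look up a single job
    List<JobSummary> findByUser(String user);   // simple query, sorted by job id
    void clear();                               // drop everything (cache reset)
}

// Stand-in backend that keeps everything in memory. An assumption of this
// sketch: a database-backed class would implement the same four methods.
final class InMemoryHistoryStore implements HistoryStore {
    private final Map<String, JobSummary> jobs = new ConcurrentHashMap<>();

    @Override
    public void put(JobSummary job) {
        jobs.put(job.jobId, job);
    }

    @Override
    public Optional<JobSummary> get(String jobId) {
        return Optional.ofNullable(jobs.get(jobId));
    }

    @Override
    public List<JobSummary> findByUser(String user) {
        return jobs.values().stream()
                .filter(j -> j.user.equals(user))
                .sorted(Comparator.comparing((JobSummary j) -> j.jobId))
                .collect(Collectors.toList());
    }

    @Override
    public void clear() {
        jobs.clear();
    }
}
```

The point of the sketch is only the shape: callers depend on the interface, so swapping the backend would not touch the rest of the server, and clear() corresponds to treating the store as a disposable cache.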
I am looking for feedback on the idea before I file a JIRA for it.

I know that there are a lot of potential issues here. When using a database to store the data, we will need to provide a mapping between the fields of our objects and the database tables, or serialize objects into blobs if we do not want to query on them. We will also eventually have to think about how we migrate the database schema from one version to another if we want to keep the data in there long term. In the short term we can treat the DB as a true cache and blow it away each time the history server reboots.

> [Umbrella JIRA] JobHistoryServer performance improvements in YARN+MR
> --------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3973
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3973
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver, mrv2
>    Affects Versions: 0.23.0
>            Reporter: Vinod Kumar Vavilapalli
>
> Few parallel efforts are happening w.r.t improving/fixing issues with JobHistoryServer in MR over YARN. This is the umbrella ticket so we have the complete picture.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira