Date: Fri, 21 Nov 2014 16:33:34 +0000 (UTC)
From: "Ryan Williams (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Created] (SPARK-4539) History Server counts "incomplete" applications against the "retainedApplications" total, fails to show eligible "completed" applications

Ryan Williams created SPARK-4539:
------------------------------------

             Summary: History Server counts "incomplete" applications against the "retainedApplications" total, fails to show eligible "completed" applications
                 Key: SPARK-4539
                 URL: https://issues.apache.org/jira/browse/SPARK-4539
             Project: Spark
          Issue Type: Bug
    Affects Versions: 1.2.0
            Reporter: Ryan Williams

I have observed the history server to return 0 or 1 applications from a directory that contains many complete and incomplete applications (the latter being application directories that are missing the {{APPLICATION_COMPLETE}} file).

Without having dug too much, my theory is that HistoryServer is seeing the "incomplete" directories and counting them against the {{retainedApplications}} maximum but not displaying them.

One supporting anecdote for this is that I loaded HS against a directory that had one complete application and nothing else, and HS worked as expected (I saw the one application in the web UI).

I then copied ~100 other application directories in, the majority of which were "incomplete" (in particular, most of the ones that had the earliest timestamps), and still only saw the one original completed application via the web UI.

Finally, I restarted the same server with {{retainedApplications}} set to 1000 (instead of 50; the directory at this point had ~10 completed applications and ~90 incomplete ones), and saw exactly the completed applications, all of them. This leads me to believe that they were being "boxed out" of the maximum-50-retained-applications iteration of the history server.

Silently failing on "incomplete" directories while still docking the count, if that is indeed what is happening, is a pretty confusing failure mode.
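
To make the theory above concrete, here is a minimal Python sketch of the ordering I suspect. It is not the actual HistoryServer code: {{AppDir}}, {{suspected_listing}}, {{expected_listing}}, and the directory ordering are all hypothetical names and assumptions, chosen only so that the counts resemble what I saw (about 10 completed and 90 incomplete directories).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppDir:
    """Hypothetical stand-in for one application directory."""
    name: str
    complete: bool  # True if the directory contains APPLICATION_COMPLETE


def suspected_listing(app_dirs, retained):
    """Suspected behavior (an assumption): apply the cap first, then hide
    incomplete directories, so incomplete ones still consume slots."""
    candidates = app_dirs[:retained]
    return [a for a in candidates if a.complete]


def expected_listing(app_dirs, retained):
    """Behavior I expected: hide incomplete directories first, then cap."""
    completed = [a for a in app_dirs if a.complete]
    return completed[:retained]


# Hypothetical contents and ordering: 1 completed application,
# then 90 incomplete ones, then 9 more completed ones.
app_dirs = (
    [AppDir("app-000", True)]
    + [AppDir(f"app-{i:03d}", False) for i in range(1, 91)]
    + [AppDir(f"app-{i:03d}", True) for i in range(91, 100)]
)

print(len(suspected_listing(app_dirs, 50)))    # 1
print(len(expected_listing(app_dirs, 50)))     # 10
print(len(suspected_listing(app_dirs, 1000)))  # 10
```

Under these assumptions, a cap of 50 shows only the one original completed application, while a cap of 1000 shows all ten, which is the pattern described above. The real scan order and filtering logic may well differ.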