Return-Path: X-Original-To: apmail-hadoop-yarn-issues-archive@minotaur.apache.org Delivered-To: apmail-hadoop-yarn-issues-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 09E62179A0 for ; Fri, 10 Oct 2014 17:42:35 +0000 (UTC) Received: (qmail 4016 invoked by uid 500); 10 Oct 2014 17:42:34 -0000 Delivered-To: apmail-hadoop-yarn-issues-archive@hadoop.apache.org Received: (qmail 3977 invoked by uid 500); 10 Oct 2014 17:42:34 -0000 Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: yarn-issues@hadoop.apache.org Delivered-To: mailing list yarn-issues@hadoop.apache.org Received: (qmail 3959 invoked by uid 99); 10 Oct 2014 17:42:34 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 10 Oct 2014 17:42:34 +0000 Date: Fri, 10 Oct 2014 17:42:34 +0000 (UTC) From: "Karthik Kambatla (JIRA)" To: yarn-issues@hadoop.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (YARN-2183) Cleaner service for cache manager MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/YARN-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167175#comment-14167175 ] Karthik Kambatla commented on YARN-2183: ---------------------------------------- Thanks for updating the patch, Chris. Review comments on the current patch: # YarnConfiguration - let us include units for time in the config names; period-mins, initial-delay-mins, resource-sleep-ms. # CleanerService ## Don't need a SHUTDOWN_HOOK_PRIORITY specific to CleanerService, it is not being used anywhere either. ## Nit: In the constructor, can we explicitly initialize AtomicBoolean(false). ## Nit: Rename scheduler to executor or scheduledExecutor? ## serviceStart: the exception thrown should have a message more amenable to the user. How about appending "It appears there is another CleanerService running in the cluster"? ## serviceStop: the log message that the background thread stopped should be moved to the else-block in the try, instead of after catch? {code} try { if (!scheduler.awaitTermination(10, TimeUnit.SECONDS)) { LOG.warn("Gave up waiting for the cleaner task to shutdown."); } } catch (InterruptedException e) { LOG.warn("The cleaner service was interrupted while shutting down the task.", e); } LOG.info("The background thread stopped."); {code} ## runCleanerTask: Instead of checking if there is a scheduled-cleaner-task running here, why not just rely on the check in CleanerTask#run(). Agree, we might be doing a little more work here unnecessarily, but not sure the savings are worth an extra check and an extra parameter in the CleanerTask constructor. ## How does a user use runCleanerTask? Instantiate another SCM? The SCM isn't listening to any requests. I can see the SCM being run in the RM, and one could potentially add "yarn rmadmin -clean-shared-cache". In any case, given there is no way to reach a running SCM, I would remove runCleanerTask altogether for now, and add it back later when we need it? Thoughts? ## Should we worry about users starting SCMs with roots at different levels that can lead to multiple cleaners? #CleanerTask: ## Nit: RENAMED_SUFFIX can be private ## Add @Override to run() ## The following code is duplicated in InMemoryStateStore a well. May be, we should just add a static method to SharedCacheUtil? {code} StringBuilder pattern = new StringBuilder(); for (int i = 0; i < nestedLevel; i++) { pattern.append("*/"); } pattern.append("*"); {code} ## process: the sleep between directories is in millis, do we want to really calculate nanos? ## Should cleanResourceReferences be moved to SCMStore? ## For the race condition (YARN-2663), would it help to handle the delete files on HDFS in the store#remove? # CleanerMetrics: ## Make initSingleton private and call it in getInstance if the instance is null? ## How about using MutableRate or MutableStat for the rates? # Do we need CleanerMetricsCollector, wouldn't CleanerMetrics extending MetricsSource suffice? > Cleaner service for cache manager > --------------------------------- > > Key: YARN-2183 > URL: https://issues.apache.org/jira/browse/YARN-2183 > Project: Hadoop YARN > Issue Type: Sub-task > Reporter: Chris Trezzo > Assignee: Chris Trezzo > Attachments: YARN-2183-trunk-v1.patch, YARN-2183-trunk-v2.patch, YARN-2183-trunk-v3.patch, YARN-2183-trunk-v4.patch > > > Implement the cleaner service for the cache manager along with metrics for the service. This service is responsible for cleaning up old resource references in the manager and removing stale entries from the cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)