Subject: Re: Review Request 56575: AURORA-1837 Improve task history pruning
From: Mehrdad Nurolahzade
To: Santhosh Kumar Shanmugham, David McLaughlin, Stephan Erb
Cc: Aurora, Mehrdad Nurolahzade, Zameer Manji, Aurora ReviewBot
Date: Mon, 13 Mar 2017 15:38:03 -0000
X-ReviewRequest-URL: https://reviews.apache.org/r/56575/

> On Feb. 15, 2017, 9:40 a.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/pruning/TaskHistoryPruner.java
> > Lines 134-135 (original), 134-135 (patched)
> >
> >     Can you explain the reasoning behind doing this in a loop per job rather than a single loop that queries all tasks?
>
> Mehrdad Nurolahzade wrote:
>     That's how I originally did it (see revision 1). The concern expressed by reviewers (read above) was around the following areas:
>
>     1. We are trying to load/filter all terminal-state tasks at once, which might cause heap pressure. Looking at one job at a time might be slow and inefficient from a throughput standpoint (a secondary concern), but it can help put less pressure on the heap.
>     2. `MemStore` does not have a scheduling-status index, so it has to sequentially scan all tasks to find matching ones.
>
> David McLaughlin wrote:
>     Were those concerns backed by data? Do we have performance numbers from before and after that change? It does not seem a given that moving from a single query to N+1 queries leads to less work and fewer objects being created, especially when the tasks are already on the heap (`DbTaskStore` is still considered experimental and not production ready; we should not optimize for it).
>
>     For point (2), this does not prevent us from scanning every single task in the store. It just divides the full scan into multiple queries, and each of those queries (and the subsequent filtering) carries the overhead of objects, streams, sorted copies (yikes), and collections used for filtering inside the task-processing loop.
>
> Mehrdad Nurolahzade wrote:
>     No, I did not verify the performance/heap profile of the original version. I'm going to deploy it to a test cluster and see if there is a noticeable difference in run-time behavior.
>
>     I agree on both points regarding `MemTaskStore`. I now see value in separating the prunable-task selection implementations for `MemTaskStore` and `DbTaskStore`, but that can be addressed in a follow-up ticket.
>
> Stephan Erb wrote:
>     Were you able to run a version of this patch on your scale cluster? Any new insights?

I made some naive observations on a test cluster, and both implementations (prune all vs. prune by job) seem to work fine. However, for the final word on this, we are still waiting for our production-like test cluster to be ready (soon) before we can verify which implementation produces the fewest unwanted side effects.

- Mehrdad


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56575/#review165733
-----------------------------------------------------------
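For readers outside the thread, below is a minimal sketch of the two query shapes being debated above. The `Task` and `TaskStore` interfaces are hypothetical stand-ins, not Aurora's actual storage API; the sketch only illustrates "one bulk scan and filter" versus "one query per job", and how the per-job variant trades a single large intermediate collection for N+1 smaller ones while still touching every task in a store that has no status index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical stand-ins for the scheduler's task store; not Aurora's real classes.
interface Task {
  String jobKey();
  boolean isTerminal();
}

interface TaskStore {
  List<Task> fetchAll();                  // full scan of the store
  List<Task> fetchByJob(String jobKey);   // scan scoped to a single job
  Set<String> jobKeys();
}

final class PruningSketch {
  // Variant A (revision 1 of the patch): a single query that loads and filters
  // every terminal task at once -- one pass, but one large intermediate list.
  static List<Task> prunableAllAtOnce(TaskStore store, Predicate<Task> isPrunable) {
    return store.fetchAll().stream()
        .filter(Task::isTerminal)
        .filter(isPrunable)
        .collect(Collectors.toList());
  }

  // Variant B (the per-job loop under review): N+1 smaller queries; each iteration
  // holds only one job's tasks on the heap, but the store is still scanned in full.
  static List<Task> prunablePerJob(TaskStore store, Predicate<Task> isPrunable) {
    List<Task> prunable = new ArrayList<>();
    for (String jobKey : store.jobKeys()) {
      store.fetchByJob(jobKey).stream()
          .filter(Task::isTerminal)
          .filter(isPrunable)
          .forEach(prunable::add);
    }
    return prunable;
  }
}
```

This is the crux of David's comment: in a store without a scheduling-status index, variant B does not reduce the total amount of scanning or filtering; it only changes the shape of the heap pressure, at the cost of extra per-query overhead.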
On Feb. 15, 2017, 9:07 a.m., Mehrdad Nurolahzade wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56575/
> -----------------------------------------------------------
>
> (Updated Feb. 15, 2017, 9:07 a.m.)
>
>
> Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, and Stephan Erb.
>
>
> Bugs: AURORA-1837
>     https://issues.apache.org/jira/browse/AURORA-1837
>
>
> Repository: aurora
>
>
> Description
> -------
>
> This patch addresses efficiency issues in the current implementation of `TaskHistoryPruner`. The new design is similar to that of `JobUpdateHistoryPruner`: (a) instead of registering a `DelayExecutor` run upon terminal task state transitions, it runs at preconfigured intervals, finds all terminal-state tasks that meet the pruning criteria, and deletes them; (b) it makes the initial task history pruning delay configurable so that pruning does not hamper the scheduler on start-up.
>
> The new design addresses the following two efficiency problems:
>
> 1. Upon scheduler restart/failure, the in-memory state of task history pruning scheduled with `DelayExecutor` is lost. `TaskHistoryPruner` learns about these dead tasks upon restart when the log is replayed. These expired tasks are picked up by the second call to `executor.execute()`, which performs job-level pruning immediately (i.e., without delay). Hence, most task history pruning happens after scheduler restarts and can severely hamper scheduler performance (or cause consecutive fail-overs on test clusters when we put load on the scheduler).
>
> 2. Expired tasks can be picked up for pruning multiple times. The asynchronous nature of the `BatchWorker` that processes task deletions introduces some delay between enqueuing a delete and executing it. As a result, tasks already queued for deletion in a previous evaluation round might get picked up, evaluated, and enqueued for deletion again. This is evident in the `tasks_pruned` metric, which reports numbers much higher than the actual number of expired tasks deleted.
>
>
> Diffs
> -----
>
>   src/main/java/org/apache/aurora/scheduler/pruning/PruningModule.java 735199ac1ccccab343c24471890aa330d6635c26
>   src/main/java/org/apache/aurora/scheduler/pruning/TaskHistoryPruner.java f77849498ff23616f1d56d133eb218f837ac3413
>   src/test/java/org/apache/aurora/scheduler/pruning/TaskHistoryPrunerTest.java 14e4040e0b94e96f77068b41454311fa3bf53573
>
>
> Diff: https://reviews.apache.org/r/56575/diff/5/
>
>
> Testing
> -------
>
> Manual testing under Vagrant
>
>
> Thanks,
>
> Mehrdad Nurolahzade
>
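As an aside for readers of the archive, the following is a rough sketch of the interval-driven pruning shape described in the Description above: a single periodic task with a configurable initial delay that selects the currently prunable tasks and deletes them in one batch. The `PrunableTaskFinder` and `TaskDeleter` types are illustrative assumptions, not Aurora's actual `TaskHistoryPruner`, which is wired through the scheduler's storage layer and Guice modules.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative stand-ins only; not the patch's real classes.
final class IntervalPrunerSketch {
  interface PrunableTaskFinder {
    // e.g. terminal tasks older than a retention threshold or over a per-job limit
    List<String> findPrunableTaskIds();
  }

  interface TaskDeleter {
    void deleteTasks(List<String> taskIds);
  }

  private final ScheduledExecutorService executor =
      Executors.newSingleThreadScheduledExecutor();

  void start(PrunableTaskFinder finder,
             TaskDeleter deleter,
             long initialDelayMinutes,   // configurable so pruning does not compete with start-up work
             long intervalMinutes) {
    executor.scheduleAtFixedRate(
        () -> {
          List<String> prunable = finder.findPrunableTaskIds();
          if (!prunable.isEmpty()) {
            // Deleting the whole batch synchronously within the run (rather than
            // re-enqueuing deletions asynchronously) avoids re-selecting tasks that
            // are already queued for deletion in a later evaluation round.
            deleter.deleteTasks(prunable);
          }
        },
        initialDelayMinutes,
        intervalMinutes,
        TimeUnit.MINUTES);
  }
}
```

Under these assumptions, the configurable initial delay addresses problem (1) above (a burst of pruning immediately after restart), and deleting the selected batch within the same run addresses problem (2) (double-counting in `tasks_pruned`).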