From: Dylan Hutchison
Date: Sat, 13 Aug 2016 15:39:44 -0700
Subject: Re: Dealing with FastBulkImportIT
To: Accumulo Dev List
Reply-To: dev@accumulo.apache.org

> I think my only concern there is that, in our past, these tests tend to
> be ignored and
> die.

Good reason to make these tests a requirement for releasing, similar to the
way we use the RandomWalk and ContinuousIngest tests. They serve as a check
against performance decreasing changes. Developers are free to run them more
often.

On Sat, Aug 13, 2016 at 3:33 PM, Dylan Hutchison wrote:

> ACCUMULO-3327 is a perfect example of a performance bug. The tablet
> servers used to reload the bulk imported flags from the metadata table
> with every request. There is nothing wrong with the extra reloads in terms
> of correctness, but it does slow the import process down. This aspect
> makes it hard to test.
>
> The JUnit category is a nice idea. One idea to complement it is the
> following procedure:
>
> 1. Run each performance test on code *from an earlier, reference
>    commit*. If a test fails, then there is a correctness problem and it
>    should be treated as a failed test as usual. If the tests all pass,
>    write out the performance times to a baseline file in a special
>    folder, maybe /bench.
> 2. Run each performance test again, now on the new commit you want to
>    test. Compare runtimes. If a runtime for some test increased
>    "significantly" (say >10%; per-test user-configurable by annotation),
>    then flag that to the user. Maybe treat that as a failure.
> 3. The output timings from these tests should be a human-readable
>    report.
>
> I bet there are frameworks out there for this kind of thing. They might
> have some out-of-the-box functions like warming up code by running it once
> before timing it. But it may be easier to whip up a simple solution using
> JUnit.
>
> Also: we might embrace our friends in Apache HTrace. HTrace makes it
> simple to time and log specific spans of code. We could create a
> SpanReceiver to gather times we are interested in for the report above.
>
> On Sat, Aug 13, 2016 at 1:48 PM, Josh Elser wrote:
>
>> You're completely right. The separation of performance tests and
>> correctness tests is one path forward.
>> I think my only concern there is that, in our past, these tests tend to
>> be ignored and die.
>>
>> I think the reason this is in the normal bucket of ITs is just because we
>> don't have rigor in your 4th point about perf evaluations.
>>
>> Maybe we could make some JUnit category to annotate such tests and make
>> them runnable via Maven, removing them from normal execution. I think
>> that would be an acceptable way forward.
>>
>> However, that would leave us with no end-to-end test for ACCUMULO-3327,
>> which isn't great.
>>
>> Dylan Hutchison wrote:
>>
>>> Hi Josh,
>>>
>>> Forgive me for the design question, but shouldn't we distinguish tests
>>> of correctness from tests of performance? The following is my
>>> understanding of test categories, which does not totally align with
>>> Accumulo's test suite:
>>>
>>> * Unit tests test individual components.
>>> * Integration tests test using components together. They may require
>>>   more resources such as starting an Accumulo (MAC or real).
>>> * Examples are executable code separate from the above, that an outside
>>>   developer or user can read to see how Accumulo is used. Examples have
>>>   their own tests.
>>> * Performance evaluations are executable code separate from the above.
>>>   They range in complexity from simple "test bulk imports" to RandomWalk
>>>   with agitation.
>>>
>>> If performance evaluations run separately, then developers can treat
>>> them like benchmarks, comparing times to those on similar hardware or
>>> across commits.
>>>
>>> Could you remind me of the reasons why we keep performance tests in the
>>> standard set of ITs?
>>>
>>> On Aug 13, 2016 1:03 PM, "Josh Elser" wrote:
>>>
>>>> I had assumed this test would pass locally (early-2013 MBP, 2.7 GHz
>>>> Intel Core i7, 16G ram), but nope! 38s and 45+ seconds on two runs.
>>>>
>>>> Josh Elser wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have some complaints about FastBulkImportIT (a test added with
>>>>> https://issues.apache.org/jira/browse/ACCUMULO-3327) but no good ideas
>>>>> for how to better test it. As it presently stands, it is a very
>>>>> subjective test WRT the kind of hardware used to run it.
>>>>>
>>>>> The test launches a 3-tserver MAC instance, creates about 585 splits on
>>>>> a table, creates 100 files with ~1200 key-value pairs, and then waits
>>>>> for the table to be balanced.
>>>>>
>>>>> At this point, it imports these files into that table and fails if that
>>>>> takes longer than 30s.
>>>>>
>>>>> On my VPS (3core, 6G ram, "SSD"), the bulk import takes ~45 seconds.
>>>>> This test will never pass on this node which bothers me because I am of
>>>>> the opinion that anyone (with reasonable hardware) should be able to run
>>>>> our tests (and to make sure it's clear: I believe this is reasonable
>>>>> hardware).
>>>>>
>>>>> Does anyone have any thoughts on how we could stabilize this test for
>>>>> developers?
>>>>>
>>>>> - Josh
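
For illustration only, here is a minimal sketch of the baseline-comparison
procedure proposed in the thread (record timings on a reference commit, rerun
on the new commit, flag anything that got more than ~10% slower, and print a
human-readable report). Nothing here is existing Accumulo code: the class name
PerfBaselineCheck, its methods, and the test names and timings in main() are
all hypothetical, and reading or writing the baseline file under /bench is
left out. It uses plain Java with no JUnit or HTrace dependency.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: compares per-test runtimes against a recorded baseline. */
class PerfBaselineCheck {

    /** Assumed default: flag a test if it is more than 10% slower than baseline. */
    static final double DEFAULT_THRESHOLD = 0.10;

    /**
     * Returns the names (sorted) of tests whose runtime grew by more than
     * {@code threshold} (a fraction, e.g. 0.10) relative to the baseline.
     * Tests with no usable baseline entry are skipped, not flagged.
     */
    static List<String> findRegressions(Map<String, Long> baselineMillis,
                                        Map<String, Long> currentMillis,
                                        double threshold) {
        List<String> regressions = new ArrayList<>();
        for (Map.Entry<String, Long> e : new TreeMap<>(currentMillis).entrySet()) {
            Long base = baselineMillis.get(e.getKey());
            if (base == null || base <= 0) {
                continue; // nothing to compare against
            }
            double increase = (e.getValue() - base) / (double) base;
            if (increase > threshold) {
                regressions.add(e.getKey());
            }
        }
        return regressions;
    }

    /** Builds a plain-text report, one line per test that has a baseline. */
    static String report(Map<String, Long> baselineMillis,
                         Map<String, Long> currentMillis,
                         double threshold) {
        List<String> slow = findRegressions(baselineMillis, currentMillis, threshold);
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : new TreeMap<>(currentMillis).entrySet()) {
            Long base = baselineMillis.get(e.getKey());
            if (base == null || base <= 0) {
                continue;
            }
            double pct = 100.0 * (e.getValue() - base) / base;
            sb.append(String.format(Locale.ROOT, "%-12s %7d ms -> %7d ms (%+.1f%%)%s%n",
                    e.getKey(), base, e.getValue(), pct,
                    slow.contains(e.getKey()) ? "  REGRESSION" : ""));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up timings, purely to exercise the comparison.
        Map<String, Long> baseline = new TreeMap<>();
        baseline.put("bulkImport", 30000L);
        baseline.put("scan", 1000L);

        Map<String, Long> current = new TreeMap<>();
        current.put("bulkImport", 45000L);
        current.put("scan", 1050L);

        System.out.print(report(baseline, current, DEFAULT_THRESHOLD));
    }
}
```

Because the comparison is relative to a baseline measured on the same
machine, a check like this would not depend on a fixed wall-clock limit such
as the 30s cutoff discussed above. Whether a flagged regression should fail
the build or only warn is a policy choice the thread leaves open.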