accumulo-dev mailing list archives

From Dylan Hutchison <dhutc...@cs.washington.edu>
Subject Re: Dealing with FastBulkImportIT
Date Sat, 13 Aug 2016 22:39:44 GMT
>
>  I think my only concern there is that, in our past, these tests tend to
> be ignored and die.
>

Good reason to make these tests a requirement for releasing, similar to the
way we use the RandomWalk and ContinuousIngest tests.  They serve as a
check against performance-degrading changes.  Developers are free to run
them more often.

On Sat, Aug 13, 2016 at 3:33 PM, Dylan Hutchison <dhutchis@cs.washington.edu
> wrote:

> ACCUMULO-3327 <https://issues.apache.org/jira/browse/ACCUMULO-3327> is a
> perfect example of a performance bug.  The tablet servers used to reload
> the bulk imported flags from the metadata table with every request.  There
> is nothing wrong with the extra reloads in terms of correctness, but it
> does slow the import process down.  This aspect makes it hard to test.
>
> The JUnit category is a nice idea.  One idea to complement it is the
> following procedure:
>
>    1. Run each performance test on code *from an earlier, reference
>    commit*.  If a test fails, then there is a correctness problem and it
>    should be treated as a failed test as usual.  If the tests all pass, write
>    out the performance times to a baseline file in a special folder, maybe
>    <accumulo_src_dir>/bench.
>    2. Run each performance test again, now on the new commit you want to
>    test.  Compare runtimes.  If a runtime for some test increased
>    "significantly" (say >10%; per-test user-configurable by annotation), then
>    flag that to the user.  Maybe treat that as a failure.
>    3. The output timings from these tests should be a human-readable
>    report.
>
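> A rough sketch of steps 1 and 2, purely to make the idea concrete (the
> baseline file name, location, and 10% threshold below are made up for
> illustration):
>
>   // Hypothetical helper: record a timing on the reference commit, then
>   // compare the same test's timing on the new commit against it.
>   import java.io.InputStream;
>   import java.io.OutputStream;
>   import java.nio.file.Files;
>   import java.nio.file.Path;
>   import java.nio.file.Paths;
>   import java.util.Properties;
>
>   public class PerfBaseline {
>     private static final Path BASELINE = Paths.get("bench", "baseline.properties");
>     private static final double THRESHOLD = 0.10; // flag >10% slowdowns
>
>     // Step 1: store the timing observed on the reference commit.
>     public static void record(String testName, long millis) throws Exception {
>       Properties props = load();
>       props.setProperty(testName, Long.toString(millis));
>       Files.createDirectories(BASELINE.getParent());
>       try (OutputStream out = Files.newOutputStream(BASELINE)) {
>         props.store(out, "performance baselines");
>       }
>     }
>
>     // Step 2: compare the new commit's timing against the stored baseline.
>     public static void check(String testName, long millis) throws Exception {
>       String base = load().getProperty(testName);
>       if (base == null) {
>         return; // no baseline recorded yet, nothing to compare
>       }
>       long baseline = Long.parseLong(base);
>       if (millis > baseline * (1 + THRESHOLD)) {
>         throw new AssertionError(testName + " slowed down: " + baseline + "ms -> " + millis + "ms");
>       }
>     }
>
>     private static Properties load() throws Exception {
>       Properties props = new Properties();
>       if (Files.exists(BASELINE)) {
>         try (InputStream in = Files.newInputStream(BASELINE)) {
>           props.load(in);
>         }
>       }
>       return props;
>     }
>   }
>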
> I bet there are frameworks out there for this kind of thing.  They might
> have some out-of-the-box functions like warming up code by running it once
> before timing it.  But it may be easier to whip up a simple solution using
> JUnit.
>
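> For example, a bare-bones warm-up-then-time helper (illustrative only)
> could be as small as:
>
>   // Run the workload once untimed to warm up JIT/caches, then time the
>   // second run and return elapsed milliseconds.
>   public static long warmUpAndTime(Runnable workload) {
>     workload.run();                    // warm-up pass, not timed
>     long start = System.nanoTime();
>     workload.run();                    // measured pass
>     return (System.nanoTime() - start) / 1_000_000;
>   }
>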
> Also: we might embrace our friends in Apache HTrace
> <https://htrace.incubator.apache.org/>.  HTrace makes it simple to time
> and log specific spans of code.  We could create a SpanReceiver to gather
> times we are interested in for the report above.
>
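> For instance (a hand-wavy sketch; the exact HTrace package and method
> names below are from memory and should be checked against the HTrace
> version we actually depend on):
>
>   // Sketch of a SpanReceiver that sums elapsed time per span description
>   // and prints a report when tracing shuts down.
>   import java.io.IOException;
>   import java.util.concurrent.ConcurrentHashMap;
>   import java.util.concurrent.atomic.AtomicLong;
>   import org.apache.htrace.Span;
>   import org.apache.htrace.SpanReceiver;
>
>   public class TimingReportReceiver implements SpanReceiver {
>     private final ConcurrentHashMap<String,AtomicLong> totals = new ConcurrentHashMap<>();
>
>     @Override
>     public void receiveSpan(Span span) {
>       totals.computeIfAbsent(span.getDescription(), k -> new AtomicLong())
>           .addAndGet(span.getAccumulatedMillis());
>     }
>
>     @Override
>     public void close() throws IOException {
>       totals.forEach((desc, ms) -> System.out.printf("%-40s %8d ms%n", desc, ms.get()));
>     }
>   }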
>
> On Sat, Aug 13, 2016 at 1:48 PM, Josh Elser <josh.elser@gmail.com> wrote:
>
>> You're completely right. The separation of performance tests and
>> correctness tests is one path forward. I think my only concern there is
>> that, in our past, these tests tend to be ignored and die.
>>
>> I think the reason this is in the normal bucket of ITs is just because we
>> don't have rigor around your 4th point about perf evaluations.
>>
>> Maybe we could make a JUnit category to annotate such tests and make
>> them runnable via Maven, removing them from normal execution. I think that
>> would be an acceptable way forward.
>>
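>> Roughly (the category interface name is just illustrative):
>>
>>   import org.junit.experimental.categories.Category;
>>
>>   // Hypothetical marker interface for performance-sensitive ITs.
>>   public interface PerformanceTests {}
>>
>>   // On each such test class:
>>   @Category(PerformanceTests.class)
>>   public class FastBulkImportIT { /* existing test body */ }
>>
>> Failsafe's groups/excludedGroups configuration could then keep the
>> category out of the normal run, with a separate profile to opt back in.
>>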
>> However, that would leave us with no end-to-end test for ACCUMULO-3327
>> which isn't great..
>>
>>
>> Dylan Hutchison wrote:
>>
>>> Hi Josh,
>>>
>>> Forgive me for the design question, but shouldn't we distinguish tests of
>>> correctness from tests of performance? The following is my understanding
>>> of
>>> test categories, which does not totally align with Accumulo's test suite:
>>>
>>> * Unit tests test individual components.
>>> * Integration tests exercise components used together. They may require more
>>> resources such as starting an Accumulo (MAC or real).
>>> * Examples are executable code separate from the above, that an outside
>>> developer or user can read to see how Accumulo is used. Examples have
>>> their
>>> own tests.
>>> * Performance evaluations are executable code separate from the above.
>>> They
>>> range in complexity from simple "test bulk imports" to RandomWalk with
>>> agitation.
>>>
>>> If performance evaluations run separately, then developers can treat them
>>> like benchmarks, comparing times to those on similar hardware or across
>>> commits.
>>>
>>> Could you remind me of the reasons why we keep performance tests in the
>>> standard set of ITs?
>>>
>>> On Aug 13, 2016 1:03 PM, "Josh Elser"<josh.elser@gmail.com>  wrote:
>>>
>>> I had assumed this test would pass locally (early-2013 MBP, 2.7 GHz Intel
>>>> Core i7, 16G ram), but nope! 38s and 45+ seconds on two runs.
>>>>
>>>> Josh Elser wrote:
>>>>
>>>> Hi,
>>>>>
>>>>> I have some complaints about FastBulkImportIT (a test added with
>>>>> https://issues.apache.org/jira/browse/ACCUMULO-3327) but no good ideas
>>>>> for how to better test it. As it presently stands, it is a very
>>>>> subjective test WRT the kind of hardware used to run it.
>>>>>
>>>>> The test launches a 3-tserver MAC instance, creates about 585 splits on
>>>>> a table, creates 100 files with ~1200 key-value pairs, and then waits
>>>>> for the table to be balanced.
>>>>>
>>>>> At this point, it imports these files into that table and fails if that
>>>>> takes longer than 30s.
>>>>>
>>>>> On my VPS (3core, 6G ram, "SSD"), the bulk import takes ~45 seconds.
>>>>> This test will never pass on this node, which bothers me because I am of
>>>>> the opinion that anyone (with reasonable hardware) should be able to run
>>>>> our tests (and to make sure it's clear: I believe this is reasonable
>>>>> hardware).
>>>>>
>>>>> Does anyone have any thoughts on how we could stabilize this test for
>>>>> developers?
>>>>>
>>>>> - Josh
>>>>>
>>>>>
>>>
>
