accumulo-dev mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: [DISCUSS] Establishing a contrib repo for upgrade testing
Date Fri, 06 Mar 2015 22:14:56 GMT
Thanks for the responses. Overall, I'm positive towards the inclusion 
given your answers.

Sean Busbey wrote:
> On Fri, Mar 6, 2015 at 12:03 PM, Josh Elser <josh.elser@gmail.com> wrote:
>
>> First off, thanks for the good-will in taking the time to ask.
>>
>> My biggest concern in adopting it as a codebase would be ensuring that it
>> isn't another codebase dropped into contrib/ and subsequently ignored. How
>> do you plan to avoid this? Who do you see maintaining and running these
>> tests?
>>
>>
> Well, I know I use them when we post candidates. I think it'd be nice if we
> all generally got in the habit. Once they've gotten polished up enough to
> cut a release, we could add it to, e.g., the major release procedure. That
> would certainly make sure the community stays on it.
>

You definitely read between the lines. Having a tool for anyone's use is 
a plus (I think Christopher touched on this, too). I wanted to make sure 
that adoption of this as a contrib didn't immediately imply that it is 
required testing. That would be a good goal for the codebase, but I 
didn't want the two to come as a package deal.

>
>> Some more targeted implementation observations/questions -
>>
>> * Do you plan to update the scripts to work with Apache Accumulo instead
>> of CDH specific artifacts? e.g. [1]
>>
>
>
> Yeah, that's part of the vendor-specific-details clean up I mentioned.
> FWIW, I've used this for also testing the ASF artifacts and it's worked
> fine.
>

Cool, thanks.

>
>> * For the MapReduce job specifically, why did you write your own and not
>> use an existing "vetted" job like Continuous Ingest? Is there something
>> that the included M/R job does which is not already contained by our CI
>> ingest and verify jobs?
>>
>>
> I need to be able to check that none of the data has been corrupted or
> lost, and I'd prefer to do it quickly. It's possible for the CI job to have
> data corrupted or dropped in a way we can't detect (namely UNREFERENCED
> cells).

It's possible, but unlikely, IMO. In a test at home when a single 
character was changed (by some still unknown factor), the CI verify 
caught it and failed the verification phase.
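
To make the UNREFERENCED case concrete, here is a minimal sketch of a 
CI-style check. It is a toy model, not the actual Continuous Ingest code: 
the names `build_cells` and `verify` are hypothetical, and the model 
assumes only that each cell may reference one earlier cell and that 
verification flags rows that are referenced but no longer defined.

```python
# Toy model (an assumption, not Accumulo's CI implementation): each cell may
# reference one earlier cell, and verify only checks that every referenced
# row is still defined.

def build_cells(n):
    """Return {row: ref}, where each row references the previous row (or None)."""
    return {i: (i - 1 if i > 0 else None) for i in range(n)}


def verify(cells):
    """Return the sorted list of rows that are referenced but not defined."""
    referenced = {ref for ref in cells.values() if ref is not None}
    return sorted(referenced - set(cells))


cells = build_cells(5)      # rows 0..4; row 4 is never referenced

lost_middle = dict(cells)
del lost_middle[2]          # row 3 still points at row 2

lost_tail = dict(cells)
del lost_tail[4]            # nothing points at row 4

print(verify(cells))        # [] : intact data passes
print(verify(lost_middle))  # [2] : a lost referenced cell is caught
print(verify(lost_tail))    # [] : a lost unreferenced cell goes unnoticed
```

In this model a dropped cell is only noticed if some surviving cell still 
points at it, which is the gap Sean describes.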

> The data load job is considerably easier to run (esp at scale) than the CI
> job. Presuming your cluster is configured correctly, you just use the tool
> script and a couple of command line parameters and YARN/MR take care of the
> rest. It will also do this across several tables configured with our
> different storage options, to make sure we have better coverage.

That is a valid point for the ingest portion. I hadn't thought about that.

> The given data verify job is also more parallelizable than the existing
> jobs, since each executor can handle its share of the cells on the map side
> without regard for the others.
>
> For example, from a newly deployed unoptimized cluster I can
> launch-and-forget data load + verify and it will get through ~78M cells in
> each of 4 tables (for a total of 312M cells) on a low-power 5 node cluster
> in around 7 minute load + 2 minute compaction + 2 minute verify without
> using offline scans. (and ~2 min of the load time is taking the
> two-level-pre-split optimization path when it isn't needed on this small
> cluster). It can do more faster on bigger or better tuned clusters, but the
> important bit is that I can check correctness by just telling it where
> Accumulo + MR is.
>
>
>
>> * It looks like the current script only works for 1.4 to 1.6? Do you plan
>> to support 1.5->1.6, 1.5->1.7, 1.6->1.7? How do you envision this adoption
>> occurring?
>>
>>
> The current script only has comments from a couple of vendor releases. I've
> used the overall tooling for ASF releases 1.4 -> 1.5 -> 1.6, 1.4 -> 1.6,
> 1.5 -> 1.6 and 1.6.0 -> 1.6.1.
>
> For the most part, adding in another target version is just a matter of
> checking if the APIs still work. With the adoption of semver, that should
> be pretty easy. I have toyed before with adding a shim layer for our API
> versions and will probably readdress that once there's a 2.0.
>
> So I think adding those other supported bits will mostly be a matter of
> improving the documentation. I'd like to get some ease of use bits
> included, like downloading the release or rc tarballs after a prompt for
> version numbers. At the very least that documentation part will be a part
> of the post-import cleanup.
>
>
>
>> * As far as exercising internal Accumulo implementation, I think you have
>> the basics covered. What about some more tricky things over the metadata
>> table (clone, import, export, merge, split table)? How might additional
>> functionality be added in a way that can be automatically tested?
>>
>>
> Those would be great additions. The current compatibility test is limited
> to data compatibility. Adding in packages for other api hooks (like that
> import/export works across versions) should be just a matter of writing a
> driver that talks to the Accumulo api and then updating the automated
> script.
>
> At least import/export and clone should be relatively easy, to the extent
> that we can leverage the data compatibility tools to put a table in a known
> state and then check that other tables match.
>
>
>
>> * It seems like you have also targeted a physical set of nodes. Have you
>> considered actually using some virtualization platform (e.g. vagrant) to
>> fully automate upgrade-testing? If there is a way that a user can spin up a
>> few VMs to do the testing, the barrier to entry is much lower (and likely
>> more foolproof) than requiring the user to set up the environment.
>>
>>
>
> To date, our main concern has been testing against live clusters. Mostly
> that's an artifact of internal testing procedures. I'd love it if someone
> who's proficient in vagrant or docker or whatever could help add a lower
> barrier test point.
>

Cool. I've been meaning to look at the stuff Wyatt posted on the user 
list a week or two ago as a starting point for making an easy-to-spin-up 
Accumulo instance off of a commit. I'd be very excited for a day when 
upgrade testing could be nothing more than 
`./upgrade-test.sh 1.6.1 1.6.2-rc0`.
