accumulo-user mailing list archives

From Ariel Valentin <ar...@arielvalentin.com>
Subject Re: Testing Map Reduce Jobs
Date Wed, 29 Oct 2014 22:42:18 GMT
Thanks Mike. I'll give that a try.

Thanks,
Ariel
---
Sent from my mobile device. Please excuse any errors.

> On Oct 29, 2014, at 6:27 PM, Mike Drob <madrob@cloudera.com> wrote:
> 
> If you launch a MapReduce job in your test code without a cluster present, it will
> default to the local runner. This is easy if you invoke your job through something
> like ToolRunner.run(), or you can just build a job and invoke it directly. So you
> don't really need a mini-mr or yarn cluster for this, 90% of the time.
> 
> As far as integrating that with a mini-accumulo... if you start a MiniAccumuloCluster
> manually and keep a reference to it (which you should anyway, because you will need to
> stop it eventually), then you can use that to populate your AccumuloInputFormat
> configuration (assuming you are using it).
> 
> Something like...
> 
>   @Before
>   public void setUp() throws Exception {
>     // Start the Accumulo Cluster
>     mac = new MiniAccumuloCluster(root.newFolder(), ACCUMULO_PASS);
>     mac.start();
> 
>     // Get first connection to create user
>     mac.getConnector(ACCUMULO_USER, ACCUMULO_PASS);
>   }
> 
>   @Test
>   public void testJob() throws Exception {
>     // Point the job's input format at the mini cluster's ZooKeeper and instance
>     AccumuloInputFormat.setZooKeeperInstance(job, ClientConfiguration.loadDefault()
>         .withZkHosts(mac.getZooKeepers()).withInstance(mac.getInstanceName()));
>     // .. and other settings
> 
>     boolean success = job.waitForCompletion(false);
>     assertTrue("Job failed!", success);
>   }
> 
>   @After
>   public void tearDown() throws Exception {
>     mac.stop();
>   }
> 
> Not sure if this is helpful, but hopefully it is enough to point you in the right
> direction. If you have more questions, please ask.
> 
> 
> 
>> On Wed, Oct 29, 2014 at 4:16 PM, Ariel Valentin <ariel@arielvalentin.com> wrote:
>> I am looking for some guidance that will help me write better tests for our map reduce
>> jobs. My current jobs are tested using MRUnit, which covers most of the "logic", but I
>> feel like I am missing good "end-to-end" developer tests.
>> 
>> I took a look at the tests for the mapred classes, but I am not sure that they achieve
>> my goal of an end-to-end test because of the use of MockInstance.
>> https://github.com/apache/accumulo/blob/master/mapreduce/src/test/java/org/apache/accumulo/core/client/mapred{,reduce}/
>> 
>> For me, the characteristic of an end-to-end test that I would find valuable is a suite
>> that one could execute using mini-{accumulo,yarn,et al.}, but I don't see any examples of
>> how one would go about making those components work in concert with each other.
>> 
>> Does anyone have any guidance when it comes to writing automated developer end-to-end
>> tests?
>> 
>> What kinds of testing strategies are people out there using for MR jobs?
>> 
>> Thanks,
>> Ariel Valentin
>> e-mail: ariel@arielvalentin.com
>> website: http://blog.arielvalentin.com
>> skype: ariel.s.valentin
>> twitter: arielvalentin
>> linkedin: http://www.linkedin.com/profile/view?id=8996534
>> ---------------------------------------
>> *simplicity *communication
>> *feedback *courage *respect
> 
