accumulo-user mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: Testing Spark Job that uses the AccumuloInputFormat
Date Wed, 03 Aug 2016 15:42:09 GMT
MockAccumulo is also on its way out. It predates MiniAccumuloCluster 
and, for a while, was a very useful tool for running simple tests 
against Accumulo. However, there are numerous edge cases where 
MockAccumulo doesn't actually act like "real" Accumulo (whereas 
MiniAccumuloCluster does -- because it's the same exact code).

I would recommend focusing on MiniAccumuloCluster (MAC). Let us know how
we can help further -- if you have examples for us to try that exhibit
the problems you're facing, that would be very helpful.

Keith Turner wrote:
> Mario
>
> A little bit of background: MiniAccumuloCluster launches external
> Accumulo processes.  To launch these processes it needs a classpath
> for the JVM.  It tries to obtain that from the current classloader,
> but that's failing.  The following code is causing your problem.  We
> should open a bug about it not working with sbt.
>
> https://github.com/apache/accumulo/blob/rel/1.7.2/minicluster/src/main/java/org/apache/accumulo/minicluster/impl/MiniAccumuloClusterImpl.java#L250
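
To illustrate the failure mode described above, here is a minimal sketch. It is not the actual MiniAccumuloClusterImpl code; the names ClasspathSketch, buildClasspath, loaderFor, and OpaqueLoader are hypothetical and exist only for illustration. The sketch assumes the classpath is rebuilt by walking the classloader chain and reading URLs from each loader, so any loader that does not expose URLs (such as sbt's NullLoader in the report below) cannot be handled.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not Accumulo's implementation.
class ClasspathSketch {

    // Stand-in for a loader that does not expose its URLs (assumption:
    // this is roughly how sbt's NullLoader looks to the calling code).
    static class OpaqueLoader extends ClassLoader {
        OpaqueLoader() {
            super((ClassLoader) null);
        }
    }

    // Helper that builds a URLClassLoader with no parent from file paths.
    static URLClassLoader loaderFor(String... paths) {
        URL[] urls = new URL[paths.length];
        for (int i = 0; i < paths.length; i++) {
            try {
                urls[i] = URI.create("file:" + paths[i]).toURL();
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException(e);
            }
        }
        return new URLClassLoader(urls, null);
    }

    // Walks the loader chain and joins every URL path into one classpath
    // string that could be passed to a child JVM.
    static String buildClasspath(ClassLoader start) {
        List<ClassLoader> chain = new ArrayList<>();
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            chain.add(cl);
        }
        StringBuilder classpath = new StringBuilder();
        // Visit parents first so their entries appear first.
        for (int i = chain.size() - 1; i >= 0; i--) {
            ClassLoader cl = chain.get(i);
            if (!(cl instanceof URLClassLoader)) {
                // No way to ask this loader where its classes come from.
                throw new IllegalArgumentException(
                        "Unknown classloader type : " + cl.getClass().getName());
            }
            for (URL url : ((URLClassLoader) cl).getURLs()) {
                if (classpath.length() > 0) {
                    classpath.append(File.pathSeparator);
                }
                classpath.append(url.getPath());
            }
        }
        return classpath.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildClasspath(loaderFor("/tmp/a.jar", "/tmp/b.jar")));
        try {
            buildClasspath(new OpaqueLoader());
        } catch (IllegalArgumentException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

The point of the sketch is that this approach only works when every loader in the chain is a type the code knows how to inspect. A build tool that installs its own loader type breaks that assumption, which is consistent with the error reported under sbt.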
>
> One possible workaround is to use the Accumulo Maven plugin.  This
> will launch mini accumulo outside of your test, then your test just
> uses that launched instance.  Do you think this might work for you?  If
> not, maybe we can come up with another workaround.
>
> http://accumulo.apache.org/release_notes/1.6.0#maven-plugin
>
> As for the MiniDFSCluster issue, that should be ok.  We use mini
> accumulo cluster to test Accumulo itself.  Some of this ends up
> bringing in a dependency on MiniDFSCluster, even though the public
> API for mini cluster does not support using it.  We need to fix this,
> so that there is no dependency on MiniDFSCluster.
>
>
> Keith
>
> On Wed, Aug 3, 2016 at 6:51 AM, Mario Pastorelli
> <mario.pastorelli@teralytics.ch>  wrote:
>> I'm trying to test a spark job that uses the AccumuloInputFormat but I'm
>> having many issues with both MockInstance and MiniAccumuloCluster.
>>
>> 1) MockInstance doesn't work with Spark jobs in my environment because it
>> looks like every task has a different instance of the MockInstance in
>> memory; if I add records from the driver, the executors can't find this
>> data. Is there a way to fix this?
>>
>> 2) MiniAccumuloCluster keeps giving strange errors. Two of them I can't
>> really fix:
>>    a. using sbt to run the tests throws "IllegalArgumentException: Unknown
>> classloader type : sbt.classpath.NullLoader" when MiniAccumuloCluster is
>> instantiated. Does anybody know how to fix this? It's basically preventing
>> me from using Spark and Accumulo together.
>>    b. there is a warning that MiniDFSCluster is not found and a stub is
>> used. I have all the dependencies needed, including the HDFS test
>> artifact. Is this warning ok?
>>
>> Thanks for the help,
>> Mario
