accumulo-user mailing list archives

From "John R. Frank" <>
Subject Re: Optimal # proxy servers
Date Tue, 12 Aug 2014 18:46:22 GMT
> There is also which was a start at 
> a C++ client to Accumulo. I'm not sure the state of it.


> In short, it's possible, but like Eric said, the Java client does quite 
> a bit more than just RPC to other processes.

Yes, understood.

Looks like the client also has the necessary information for figuring out 
which tablets are near the compute worker, which would enable bringing the 
computational task to the data.

> Regarding python+proxy, extra RPC is definitely a concern, but I'm not 
> sure how much of the performance decrease is the use of an 
> interpreted/dynamic language and how much is using the Proxy. I haven't 
> ever benchmarked the two to get a good understanding of where the extra 
> time is really spent.

Yes, it's easy to waste a lot of cycles on trivial things like Python 
object creation.  After figuring things out, it's often fruitful to 
profile a system on real data and migrate select pieces to native 
implementations in C/C++.  This makes python the easily refactored glue 
between optimized components.
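
As a rough illustration of that profiling step, here is a minimal sketch using the standard library's cProfile. The Cell class and both total_* functions are hypothetical stand-ins for real per-entry work, not anything taken from Accumulo or its proxy; the point is only to show how per-object overhead shows up in a profile.

```python
import cProfile
import io
import pstats


class Cell(object):
    """Hypothetical per-entry wrapper, standing in for real client objects."""

    def __init__(self, row, value):
        self.row = row
        self.value = value


def total_with_objects(pairs):
    # Creates one Python object per entry before summing.
    return sum(Cell(row, value).value for row, value in pairs)


def total_plain(pairs):
    # Same result without the per-entry object creation.
    return sum(value for _, value in pairs)


def profile(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(5)
    return result, out.getvalue()


# Made-up data; a real run would use entries read from the system under test.
pairs = [("row%06d" % i, i) for i in range(100000)]
result_a, report_a = profile(total_with_objects, pairs)
result_b, report_b = profile(total_plain, pairs)
print(report_a)
print(report_b)
```

Comparing the two reports gives a first hint at which pieces might be worth moving to C/C++; real data will likely shift the picture.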

For now, we're experimenting with running more proxies closer to the 
compute workers.  We are interested in a C++ client for Accumulo, which 
could be made to expose python interfaces.
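
One way to sketch "proxies closer to the compute workers" is to have each worker prefer a proxy on its own host and otherwise spread across the rest. The choose_proxy function, the host names, and the port below are all hypothetical, and this is only a guess at a reasonable policy, not a description of our actual setup.

```python
import socket
import zlib


def choose_proxy(proxies, worker_host=None, worker_id=""):
    """Pick a (host, port) pair from proxies for one worker.

    Prefers a proxy on the worker's own host; otherwise falls back to a
    stable hash of worker_id so workers spread across the proxies.
    """
    if not proxies:
        raise ValueError("no proxies configured")
    worker_host = worker_host or socket.gethostname()
    local = [p for p in proxies if p[0] == worker_host]
    if local:
        return local[0]
    index = zlib.crc32(worker_id.encode("utf-8")) % len(proxies)
    return proxies[index]


# Hypothetical host names and port, for illustration only.
proxies = [("node1", 42424), ("node2", 42424), ("node3", 42424)]
print(choose_proxy(proxies, worker_host="node2", worker_id="w7"))
print(choose_proxy(proxies, worker_host="elsewhere", worker_id="w7"))
```

Whether a co-located proxy actually reduces the extra RPC cost would still need to be measured.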

