accumulo-dev mailing list archives

From: Keith Turner
Subject: Re: Python client lib for Accumulo?
Date: Fri, 27 Jul 2012 16:43:37 GMT
Does anyone know anything about Py4J?

I have never used it, but I am wondering if it would fit the bill?

On Thu, Jul 26, 2012 at 11:15 PM, Edmon Begoli wrote:
> Hi folks,
> I have just joined the list to volunteer ideas, design, and
> development (and whatever else the lifecycle needs) for a Python
> client for Accumulo.
> I have developed several RESTful clients and libraries before using
> and I am about to write another in Tornado
> (
> I think that we could have a very nice, scalable and fast RESTful API
> for Accumulo through Tornado.
> I would also like to develop a pure Python library for Accumulo similar
> to HappyBase for HBase (
> I work at Oak Ridge National Lab as a software engineer and tech. lead
> on "big data" projects.
> I can devote time, possibly bring more team members and I would be
> happy to collaborate. Collaborations are welcome.
> I could certainly start a small wiki outlining the ideas and open them
> for discussion.
> Regards and please advise,
> Edmon
> On Wed, May 2, 2012 at 11:31 AM, Jason Trost <> wrote:
>> I noticed that there are no JIRAs for a python client
>> interface/lib/API for Accumulo.  How involved would it be to develop
>> AND maintain a python client for Accumulo?
>> I realize that Jython can be used, but I am interested in a native
>> python lib that can be used more broadly with systems that don't work
>> with Jython.
>> In order to do this, it seems like we would need to:
>> 1. generate the python thrift bindings code (this is trivial)
>> 2. develop and maintain the python glue code to use the thrift code
>> and python zookeeper code to interact with the various accumulo
>> components.  The current Java "glue" code looks quite long.  How often
>> does this code change (in terms of new features or changes in
>> protocol, not bug fixes)?
> I would advise against rewriting the accumulo client code in python.
> The code that finds tablets, retries in case of failure, parallelizes
> reads and writes, etc. is fairly complex.  I think the proxy option is
> best.  David and Eric mentioned REST and Thrift proxies.
> If we were to go down the route of writing the client code in
> another language, I think C++ with a C API would be the best option
> because many languages can easily bind to a C API.
>> Ideally the python API would be very similar to the Java interface
>> (Connector, Instance, Scanner, BatchScanner, BatchWriter, Key, Value,
>> Mutation, etc).
>> I guess what I am trying to get at is, does the Accumulo dev community
>> think it's worth the time and effort to develop and maintain a python
>> API?  I personally think it is in order to help with adoption and
>> integration with other systems (Django is the primary system I want to
>> be able to use with it).  I have some time to help this along, but I
>> don't think I have enough time to take this on alone.  Is anyone else
>> interested in working together on this?
>> Thanks,
>> --Jason
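
For illustration only, here is a rough sketch of how the Python data model Jason describes might look if it mirrored the Java Key and Mutation classes. Every class, field, and method name below is hypothetical and does not come from any existing Accumulo client. The sketch covers only the in-memory objects. It does not model Accumulo's key sort order, and it does not talk to a server. A real client would still need a transport, such as the Thrift or REST proxy discussed above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Key:
    """Hypothetical analogue of the Java Key: row, column, visibility, timestamp."""
    row: bytes
    family: bytes = b""
    qualifier: bytes = b""
    visibility: bytes = b""
    timestamp: int = 0


@dataclass
class Mutation:
    """Hypothetical analogue of the Java Mutation: a set of updates to one row."""
    row: bytes
    updates: List[Tuple[Key, bytes]] = field(default_factory=list)

    def put(self, family: bytes, qualifier: bytes, value: bytes,
            visibility: bytes = b"", timestamp: int = 0) -> "Mutation":
        # Every update in a mutation shares the mutation's row.
        key = Key(self.row, family, qualifier, visibility, timestamp)
        self.updates.append((key, value))
        return self  # returning self allows chained put() calls


# Build one mutation holding two column updates for the same row.
m = Mutation(b"row1").put(b"cf", b"cq1", b"v1").put(b"cf", b"cq2", b"v2")
```

Keeping these objects free of any network code means the same classes could sit in front of either kind of proxy, which fits the advice above not to reimplement the tablet-location and retry logic in Python.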
