accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-1804) Integrate RStudio to work with data residing in Accumulo
Date Wed, 23 Oct 2013 16:21:41 GMT


Josh Elser commented on ACCUMULO-1804:

[~aarongmldt], thanks for the code! Digging through some of the source, it looks like you
used the C++ interface for the Thrift proxy to connect R with Accumulo. Given that and
some of the API calls, it looks like raccumulo requires at least Accumulo 1.5.0 and a running
instance of the Thrift proxy. Is that correct?

It's been quite a while since I've used R; where do you see this fitting into the Accumulo
community? Generally speaking, we have two avenues into which integration code like this falls:
# Inclusion in core Accumulo codebase
# A "contrib" project of Accumulo

For #1, this would typically require a committer to sign up to ensure that the code is well-maintained
as Accumulo itself grows. The code is held to a certain level of testing and can reasonably be
expected to work, since it would be released with Accumulo itself. For #2, a contrib
project is a means for Accumulo to keep related, developed code near Accumulo. These projects
typically follow their own schedule and aren't crucial to a release of Accumulo itself.

To me, it seems like a contrib project is the best location for it at the moment. What do
you think? Other committers? Do you intend to maintain and add additional functionality to
raccumulo as people use it and find bugs or improvements?

Thanks again. It's awesome to see contributions like this!

> Integrate RStudio to work with data residing in Accumulo
> --------------------------------------------------------
>                 Key: ACCUMULO-1804
>                 URL:
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Aaron Glahe
>            Priority: Minor
>         Attachments: raccumulo-release.tar.gz
> Need to be able to support users who utilize RStudio to conduct analysis of data residing
> in the Accumulo data space instead of moving data from one repository to a standalone system
> to have the analytic run in memory.  RStudio should be able to make calls directly to the
> data space and provide the output within the RStudio interface.

This message was sent by Atlassian JIRA
