accumulo-user mailing list archives

From Christopher <>
Subject Re: Accumulo on Google Cloud Storage
Date Wed, 20 Jun 2018 18:20:32 GMT
For what it's worth, this is an Apache project, not a Sqrrl project. Amazon
is free to contribute to Accumulo to improve its support of their platform,
just as anybody is free to do. Amazon may start contributing more as a
result of their acquisition... or they may not. There is no reason to
expect that their acquisition will have any impact whatsoever on the
platforms Accumulo supports, because Accumulo is not, and has not ever
been, a Sqrrl project (although some Sqrrl employees have contributed), and
thus will not become an Amazon project. It has been, and will remain, a
vendor-neutral Apache project. Regardless, we welcome contributions from
anybody which would improve Accumulo's support of any additional platform
alternatives to HDFS, whether it be GCS, S3, or something else.

As for the WAL closing issue on GCS, I recall a previous thread about
that... I think a simple patch might be possible to solve that issue, but
to date, nobody has contributed a fix. If somebody is interested in using
Accumulo on GCS, I'd like to encourage them to submit any bugs they
encounter, and any patches (if they are able) which resolve those bugs. If
they need help submitting a fix, please ask on the dev@ list.

On Wed, Jun 20, 2018 at 8:21 AM Geoffry Roberts <> wrote:

> Maxim,
> Interesting that you were able to run A on GCS.  I never thought of
> that--good to know.
> Since I am now an AWS guy (at least for the time being), in light of the
> fact that Amazon purchased Sqrrl,  I am interested to see what develops.
> On Wed, Jun 20, 2018 at 5:15 AM, Maxim Kolchin <>
> wrote:
>> Hi Geoffry,
>> Thank you for the feedback!
>> Thanks to [1, 2], I was able to run an Accumulo cluster on Google VMs
>> with GCS instead of HDFS, and I used Google Dataproc to run Hadoop jobs on
>> Accumulo. Almost everything worked well until I ran into connection
>> issues with GCS. Quite often, the connection to GCS breaks while writing or
>> closing WALs.
>> To all,
>> Does Accumulo have a specific write pattern that a file system might not
>> support? Are there Accumulo properties that I can adjust to change
>> the write pattern?
>> [1]:
>> [2]:
>> Thank you!
>> Maxim
>> On Tue, Jun 19, 2018 at 10:31 PM Geoffry Roberts <>
>> wrote:
>>> I tried running Accumulo on Google.  I first tried running it on
>>> Google's pre-made Hadoop.  I found the various file paths one must contend
>>> with are different on Google than on a straight download from Apache.  It
>>> seems they moved things around.  To counter this, I installed my own Hadoop
>>> along with Zookeeper and Accumulo on a Google node.  All went well until
>>> one fine day when I could no longer log in.  It seems Google had pushed out
>>> some changes overnight that broke my client-side Google Cloud
>>> installation.  Google referred the affected to a lengthy,
>>> easy-to-make-a-mistake procedure for resolving the issue.
>>> I decided life was too short for this kind of thing and switched to
>>> Amazon.
>>> On Tue, Jun 19, 2018 at 7:34 AM, Maxim Kolchin <>
>>> wrote:
>>>> Hi all,
>>>> Does anyone have experience running Accumulo on top of Google Cloud
>>>> Storage instead of HDFS? See [1] for details if you have never heard
>>>> of this feature.
>>>> I see some discussion (see [2], [3]) around this topic, but it seems to
>>>> me that it isn't as popular as I believe it should be.
>>>> [1]:
>>>> [2]:
>>>> [3]:
>>>> Best regards,
>>>> Maxim
>>> --
>>> There are ways and there are ways,
>>> Geoffry Roberts
> --
> There are ways and there are ways,
> Geoffry Roberts
