accumulo-user mailing list archives

From Maxim Kolchin <kolchin...@gmail.com>
Subject Re: Accumulo on Google Cloud Storage
Date Tue, 15 Jan 2019 15:02:17 GMT
Hi,

I just wanted to leave intermediate feedback on the topic.

So far, Accumulo works pretty well on top of Google Cloud Storage. The
aforementioned issue still exists, but it doesn't break anything. However,
I can't give you any useful performance numbers at the moment.

The cluster:

 - 1 master (with ZooKeeper) (n1-standard-1) + 2 tservers (n1-standard-4)
 - 32+ billion entries
 - 5 tables (excluding system tables)

Some averaged numbers from two use cases:

 - batch write into pre-split tables with 40 client machines + 4
tservers (n1-standard-4) - max speed 1.5M entries/sec
 - sequential read with 2 scan iterators (one filters by key, the other
by timestamp), with 5 client machines + 2 tservers (n1-standard-4) and
fewer than 60k entries returned - max speed 1M+ entries/sec
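
For anyone trying to reproduce the batch-write numbers, the setup can be
sketched in the Accumulo shell roughly as follows (the table name and split
points are hypothetical; the durability setting trades WAL sync cost for
throughput, and defaults may differ by version):

```
createtable ingest_table
addsplits -t ingest_table 01 02 03 04 05 06 07
config -t ingest_table -s table.split.threshold=1G
config -t ingest_table -s table.durability=flush
```

Pre-splitting up front is what lets 40 client machines spread writes across
all tservers from the first mutation instead of hammering a single initial
tablet.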

Maxim

On Mon, Jun 25, 2018 at 12:57 AM Christopher <ctubbsii@apache.org> wrote:

> Ah, ok. One of the comments on the issue led me to believe that it was the
> same issue as the missing custom log closer.
>
> On Sat, Jun 23, 2018, 01:10 Stephen Meyles <smeyles@gmail.com> wrote:
>
>> > I'm not convinced this is a write pattern issue, though. I commented
>> on..
>>
>> The note there suggests the need for a LogCloser implementation; in my
>> (ADLS) case I've written one and have it configured - the exception I'm
>> seeing involves failures during writes, not during recovery (though it then
>> leads to a need for recovery).
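>>
>> For readers hitting the same thing: a minimal sketch of what a LogCloser
>> can look like against the 1.9-era server API (class and package names
>> from memory, worth verifying against the Accumulo source; the no-op
>> behaviour shown is an assumption about object stores, not my actual
>> implementation):
>>
>> ```java
>> import java.io.IOException;
>>
>> import org.apache.accumulo.core.conf.AccumuloConfiguration;
>> import org.apache.accumulo.server.fs.VolumeManager;
>> import org.apache.accumulo.server.master.recovery.LogCloser;
>> import org.apache.hadoop.fs.Path;
>>
>> // Hypothetical closer for a store with no HDFS-style lease recovery:
>> // there is nothing to recover, so report the log as already closed.
>> public class NoLeaseLogCloser implements LogCloser {
>>   @Override
>>   public long close(AccumuloConfiguration conf, VolumeManager fs, Path path)
>>       throws IOException {
>>     // HadoopLogCloser would call recoverLease() here and may return a
>>     // positive delay to ask the master to retry later; returning 0 says
>>     // "closed, safe to begin log recovery".
>>     return 0;
>>   }
>> }
>> ```
>>
>> If memory serves, it is wired in with the master.walog.closer.implementation
>> property.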
>>
>> S.
>>
>> On Fri, Jun 22, 2018 at 4:33 PM, Christopher <ctubbsii@apache.org> wrote:
>>
>>> Unfortunately, that feature wasn't added until 2.0, which hasn't yet
>>> been released, but I'm hoping it will be later this year.
>>>
>>> I'm not convinced this is a write pattern issue, though. I
>>> commented on
>>> https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103#issuecomment-399608543
>>>
>>> On Fri, Jun 22, 2018 at 1:50 PM Stephen Meyles <smeyles@gmail.com>
>>> wrote:
>>>
>>>> Knowing that HBase has been run successfully on ADLS, I went looking
>>>> there (as they have the same WAL write pattern). This is informative:
>>>>
>>>>
>>>> https://www.cloudera.com/documentation/enterprise/5-12-x/topics/admin_using_adls_storage_with_hbase.html
>>>>
>>>> which suggests a need to split the WALs off on HDFS proper versus ADLS
>>>> (or presumably GCS) barring changes in the underlying semantics of each.
>>>> AFAICT you can't currently configure Accumulo to send WAL logs to a
>>>> separate cluster - is this correct?
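>>>>
>>>> A hedged sketch of how this looks in Accumulo 2.0's accumulo.properties,
>>>> where multiple volumes plus the preferred-volume chooser can keep WALs
>>>> on HDFS (the bucket and namenode names are hypothetical, and the
>>>> property names should be checked against the 2.0 docs):
>>>>
>>>> ```properties
>>>> # Both volumes are available to Accumulo.
>>>> instance.volumes=hdfs://namenode:8020/accumulo,gs://example-bucket/accumulo
>>>> # Choose volumes per scope via the preferred-volume chooser.
>>>> general.volume.chooser=org.apache.accumulo.server.fs.PreferredVolumeChooser
>>>> # Table data may land on either volume...
>>>> general.custom.volume.preferred.default=hdfs://namenode:8020/accumulo,gs://example-bucket/accumulo
>>>> # ...but write-ahead logs stay on HDFS, which supports hflush/hsync.
>>>> general.custom.volume.preferred.logger=hdfs://namenode:8020/accumulo
>>>> ```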
>>>>
>>>> S.
>>>>
>>>>
>>>> On Fri, Jun 22, 2018 at 9:07 AM, Stephen Meyles <smeyles@gmail.com>
>>>> wrote:
>>>>
>>>>> > Did you try to adjust any Accumulo properties to do bigger writes
>>>>> less frequently or something like that?
>>>>>
>>>>> We're using BatchWriters and sending reasonably large batches of
>>>>> Mutations. Given the stack traces in both our cases are related to WAL
>>>>> writes, it seems like batch size would be the only tweak available here
>>>>> (though, without reading the code carefully, it's not even clear to me
>>>>> that is impactful), but if others have suggestions I'd be happy to try.
>>>>>
>>>>> Given we have this working well and stable in other clusters atop
>>>>> traditional HDFS, I'm currently pursuing this further with MS to
>>>>> understand the variance on ADLS. Depending on what emerges from that,
>>>>> I may circle back with more details and a bug report, and start digging
>>>>> more deeply into the relevant code in Accumulo.
>>>>>
>>>>> S.
>>>>>
>>>>>
>>>>> On Fri, Jun 22, 2018 at 6:09 AM, Maxim Kolchin <kolchinmax@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> > If somebody is interested in using Accumulo on GCS, I'd like to
>>>>>> encourage them to submit any bugs they encounter, and any patches
>>>>>> (if they are able) which resolve those bugs.
>>>>>>
>>>>>> I'd like to contribute a fix, but I don't know where to start. We
>>>>>> tried to get help from Google Support about [1] over email, but they
>>>>>> just say that GCS doesn't support such a write pattern. In the end,
>>>>>> we can only guess how to adjust Accumulo's behaviour to minimise
>>>>>> broken connections to GCS.
>>>>>>
>>>>>> BTW, although we observe this exception, the tablet server doesn't
>>>>>> fail, which means that after some retries it is able to write WALs
>>>>>> to GCS.
>>>>>>
>>>>>> @Stephen,
>>>>>>
>>>>>> > as discussions with MS engineers have suggested, similar to the
>>>>>> GCS thread, that small writes at high volume are, at best, suboptimal
>>>>>> for ADLS.
>>>>>>
>>>>>> Did you try to adjust any Accumulo properties to do bigger writes
>>>>>> less frequently or something like that?
>>>>>>
>>>>>> [1]: https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103
>>>>>>
>>>>>> Maxim
>>>>>>
>>>>>> On Thu, Jun 21, 2018 at 7:17 AM Stephen Meyles <smeyles@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I think we're seeing something similar, but in our case we're trying
>>>>>>> to run Accumulo atop ADLS. When we generate sufficient write load,
>>>>>>> we start to see stack traces like the following:
>>>>>>>
>>>>>>> [log.DfsLogger] ERROR: Failed to write log entries
>>>>>>> java.io.IOException: attempting to write to a closed stream;
>>>>>>> at
>>>>>>> com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:88)
>>>>>>> at
>>>>>>> com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:77)
>>>>>>> at
>>>>>>> org.apache.hadoop.fs.adl.AdlFsOutputStream.write(AdlFsOutputStream.java:57)
>>>>>>> at
>>>>>>> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:48)
>>>>>>> at java.io.DataOutputStream.write(DataOutputStream.java:88)
>>>>>>> at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
>>>>>>> at
>>>>>>> org.apache.accumulo.tserver.logger.LogFileKey.write(LogFileKey.java:87)
>>>>>>> at
>>>>>>> org.apache.accumulo.tserver.log.DfsLogger.write(DfsLogger.java:537)
>>>>>>>
>>>>>>> We have developed a rudimentary LogCloser implementation that allows
>>>>>>> us to recover from this, but overall performance is significantly
>>>>>>> impacted.
>>>>>>>
>>>>>>> > As for the WAL closing issue on GCS, I recall a previous thread
>>>>>>> about that
>>>>>>>
>>>>>>> I searched more for this but wasn't able to find anything, nor
>>>>>>> anything similar re: ADLS. I am also curious about the earlier
>>>>>>> question:
>>>>>>>
>>>>>>> >> Does Accumulo have a specific write pattern [to WALs], so that a
>>>>>>> file system may not support it?
>>>>>>>
>>>>>>> as discussions with MS engineers have suggested, similar to the GCS
>>>>>>> thread, that small writes at high volume are, at best, suboptimal
>>>>>>> for ADLS.
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> Stephen
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 20, 2018 at 11:20 AM, Christopher <ctubbsii@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> For what it's worth, this is an Apache project, not a Sqrrl
>>>>>>>> project. Amazon is free to contribute to Accumulo to improve its
>>>>>>>> support of their platform, just as anybody is free to do. Amazon
>>>>>>>> may start contributing more as a result of their acquisition... or
>>>>>>>> they may not. There is no reason to expect that their acquisition
>>>>>>>> will have any impact whatsoever on the platforms Accumulo supports,
>>>>>>>> because Accumulo is not, and has not ever been, a Sqrrl project
>>>>>>>> (although some Sqrrl employees have contributed), and thus will not
>>>>>>>> become an Amazon project. It has been, and will remain, a
>>>>>>>> vendor-neutral Apache project. Regardless, we welcome contributions
>>>>>>>> from anybody which would improve Accumulo's support of any
>>>>>>>> additional platform alternatives to HDFS, whether it be GCS, S3,
>>>>>>>> or something else.
>>>>>>>>
>>>>>>>> As for the WAL closing issue on GCS, I recall a previous thread
>>>>>>>> about that... I think a simple patch might be possible to solve
>>>>>>>> that issue, but to date, nobody has contributed a fix. If somebody
>>>>>>>> is interested in using Accumulo on GCS, I'd like to encourage them
>>>>>>>> to submit any bugs they encounter, and any patches (if they are
>>>>>>>> able) which resolve those bugs. If they need help submitting a fix,
>>>>>>>> please ask on the dev@ list.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jun 20, 2018 at 8:21 AM Geoffry Roberts <threadedblue@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Maxim,
>>>>>>>>>
>>>>>>>>> Interesting that you were able to run Accumulo on GCS. I never
>>>>>>>>> thought of that--good to know.
>>>>>>>>>
>>>>>>>>> Since I am now an AWS guy (at least for the time being), in light
>>>>>>>>> of the fact that Amazon purchased Sqrrl, I am interested to see
>>>>>>>>> what develops.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jun 20, 2018 at 5:15 AM, Maxim Kolchin <kolchinmax@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Geoffry,
>>>>>>>>>>
>>>>>>>>>> Thank you for the feedback!
>>>>>>>>>>
>>>>>>>>>> Thanks to [1, 2], I was able to run an Accumulo cluster on Google
>>>>>>>>>> VMs with GCS instead of HDFS. And I used Google Dataproc to run
>>>>>>>>>> Hadoop jobs on Accumulo. Almost everything was fine until I faced
>>>>>>>>>> some connection issues with GCS. Quite often, the connection to
>>>>>>>>>> GCS breaks on writing or closing WALs.
>>>>>>>>>>
>>>>>>>>>> To all,
>>>>>>>>>>
>>>>>>>>>> Does Accumulo have a specific write pattern, such that a file
>>>>>>>>>> system may not support it? Are there Accumulo properties which I
>>>>>>>>>> can play with to adjust the write pattern?
>>>>>>>>>>
>>>>>>>>>> [1]: https://github.com/cybermaggedon/accumulo-gs
>>>>>>>>>> [2]: https://github.com/cybermaggedon/accumulo-docker
>>>>>>>>>>
>>>>>>>>>> Thank you!
>>>>>>>>>> Maxim
>>>>>>>>>>
>>>>>>>>>> On Tue, Jun 19, 2018 at 10:31 PM Geoffry Roberts <threadedblue@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I tried running Accumulo on Google. I first tried running it on
>>>>>>>>>>> Google's pre-made Hadoop. I found the various file paths one
>>>>>>>>>>> must contend with are different on Google than on a straight
>>>>>>>>>>> download from Apache. It seems they moved things around. To
>>>>>>>>>>> counter this, I installed my own Hadoop along with Zookeeper and
>>>>>>>>>>> Accumulo on a Google node. All went well until one fine day when
>>>>>>>>>>> I could no longer log in. It seems Google had pushed out some
>>>>>>>>>>> changes overnight that broke my client-side Google Cloud
>>>>>>>>>>> installation. Google referred the affected to a lengthy,
>>>>>>>>>>> easy-to-make-a-mistake procedure for resolving the issue.
>>>>>>>>>>>
>>>>>>>>>>> I decided life was too short for this kind of thing and switched
>>>>>>>>>>> to Amazon.
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jun 19, 2018 at 7:34 AM, Maxim Kolchin <kolchinmax@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> Does anyone have experience running Accumulo on top of Google
>>>>>>>>>>>> Cloud Storage instead of HDFS? In [1] you can see some details
>>>>>>>>>>>> if you've never heard of this feature.
>>>>>>>>>>>>
>>>>>>>>>>>> I see some discussion (see [2], [3]) around this topic, but it
>>>>>>>>>>>> looks to me that it isn't as popular as, I believe, it should
>>>>>>>>>>>> be.
>>>>>>>>>>>>
>>>>>>>>>>>> [1]:
>>>>>>>>>>>> https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage
>>>>>>>>>>>> [2]: https://github.com/apache/accumulo/issues/428
>>>>>>>>>>>> [3]:
>>>>>>>>>>>> https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Maxim
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> There are ways and there are ways,
>>>>>>>>>>>
>>>>>>>>>>> Geoffry Roberts
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> There are ways and there are ways,
>>>>>>>>>
>>>>>>>>> Geoffry Roberts
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
