accumulo-user mailing list archives

From Josh Elser <els...@apache.org>
Subject Re: Accumulo on Google Cloud Storage
Date Thu, 17 Jan 2019 18:50:09 GMT
Thanks for sharing, Maxim.

What kind of failure/recovery testing did you do as a part of this? If 
you haven't done any yet, are you planning to do some such testing?

- Josh

On 1/15/19 10:02 AM, Maxim Kolchin wrote:
> Hi,
> 
> I just wanted to leave intermediate feedback on the topic.
> 
> So far, Accumulo works pretty well on top of Google Storage. The 
> aforementioned issue still exists, but it doesn't break anything. 
> However, I can't give you any useful performance numbers at the moment.
> 
> The cluster:
> 
>   - master (with zookeeper) (n1-standard-1) + 2 tservers (n1-standard-4)
>   - 32+ billion entries
>   - 5 tables (excluding system tables)
> 
> Some averaged numbers from two use cases:
> 
>   - batch write into pre-split tables with 40 client machines + 4 
> tservers (n1-standard-4) - max speed 1.5M entries/sec.
>   - sequential read with 2 client iterators (1 - filters by key, 2 - 
> filters by timestamp), with 5 client machines + 2 tservers 
> (n1-standard-4) and less than 60k entries returned - max speed 1M+ 
> entries/sec.
> 
> Maxim
> 
> On Mon, Jun 25, 2018 at 12:57 AM Christopher <ctubbsii@apache.org 
> <mailto:ctubbsii@apache.org>> wrote:
> 
>     Ah, ok. One of the comments on the issue led me to believe that it
>     was the same issue as the missing custom log closer.
> 
>     On Sat, Jun 23, 2018, 01:10 Stephen Meyles <smeyles@gmail.com
>     <mailto:smeyles@gmail.com>> wrote:
> 
>          > I'm not convinced this is a write pattern issue, though. I
>         commented on..
> 
>         The note there suggests the need for a LogCloser implementation;
>         in my (ADLS) case I've written one and have it configured - the
>         exception I'm seeing involves failures during writes, not during
>         recovery (though it then leads to a need for recovery).
> 
>         S.
> 
>         On Fri, Jun 22, 2018 at 4:33 PM, Christopher
>         <ctubbsii@apache.org <mailto:ctubbsii@apache.org>> wrote:
> 
>             Unfortunately, that feature wasn't added until 2.0, which
>             hasn't yet been released, but I'm hoping it will be later
>             this year.
> 
>             However, I'm not convinced this is a write pattern issue,
>             though. I commented on
>             https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103#issuecomment-399608543
> 
>             On Fri, Jun 22, 2018 at 1:50 PM Stephen Meyles
>             <smeyles@gmail.com <mailto:smeyles@gmail.com>> wrote:
> 
>                 Knowing that HBase has been run successfully on ADLS,
>                 went looking there (as they have the same WAL write
>                 pattern). This is informative:
> 
>                 https://www.cloudera.com/documentation/enterprise/5-12-x/topics/admin_using_adls_storage_with_hbase.html
> 
>                 which suggests a need to split the WALs off on HDFS
>                 proper versus ADLS (or presumably GCS) barring changes
>                 in the underlying semantics of each. AFAICT you can't
>                 currently configure Accumulo to send WAL logs to a
>                 separate cluster - is this correct?
> 
>                 S.
> 
> 
>                 On Fri, Jun 22, 2018 at 9:07 AM, Stephen Meyles
>                 <smeyles@gmail.com <mailto:smeyles@gmail.com>> wrote:
> 
>                     > Did you try to adjust any Accumulo properties to do
>                     bigger writes less frequently or something like that?
> 
>                     We're using BatchWriters and sending reasonably
>                     large batches of Mutations. Given the stack traces
>                     in both our cases are related to WAL writes it seems
>                     like batch size would be the only tweak available
>                     here (though, without reading the code carefully
>                     it's not even clear to me that is impactful) but if
>                     others have suggestions I'd be happy to try.
> 
>                     Given we have this working well and stable in other
>                     clusters atop traditional HDFS I'm currently
>                     pursuing this further with the MS to understand the
>                     variance to ADLS. Depending what emerges from that I
>                     may circle back with more details and a bug report
>                     and start digging in more deeply to the relevant
>                     code in Accumulo.
> 
>                     S.
> 
> 
>                     On Fri, Jun 22, 2018 at 6:09 AM, Maxim Kolchin
>                     <kolchinmax@gmail.com <mailto:kolchinmax@gmail.com>>
>                     wrote:
> 
>                         > If somebody is interested in using Accumulo on GCS, I'd
>                         like to encourage them to submit any bugs they encounter,
>                         and any patches (if they are able) which resolve those bugs.
> 
>                         I'd like to contribute a fix, but I don't know
>                         where to start. We tried to get help from
>                         Google Support about [1] over email, but
>                         they just say that GCS doesn't support such
>                         a write pattern. In the end, we can only guess how
>                         to adjust the Accumulo behaviour to minimise
>                         broken connections to GCS.
> 
>                         BTW although we observe this exception, the
>                         tablet server doesn't fail, so it means that
>                         after some retries it is able to write WALs to GCS.
> 
>                         @Stephen,
> 
>                         > as discussions with MS engineers have suggested,
>                         similar to the GCS thread, that small writes at
>                         high volume are, at best, suboptimal for ADLS.
> 
>                         Did you try to adjust any Accumulo properties to
>                         do bigger writes less frequently or something
>                         like that?
> 
>                         [1]:
>                         https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103
> 
>                         Maxim
> 
>                         On Thu, Jun 21, 2018 at 7:17 AM Stephen Meyles
>                         <smeyles@gmail.com <mailto:smeyles@gmail.com>>
>                         wrote:
> 
>                             I think we're seeing something similar but
>                             in our case we're trying to run Accumulo
>                             atop ADLS. When we generate sufficient write
>                             load we start to see stack traces like the
>                             following:
> 
>                             [log.DfsLogger] ERROR: Failed to write log
>                             entries
>                             java.io.IOException: attempting to write to
>                             a closed stream;
>                             at
>                             com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:88)
>                             at
>                             com.microsoft.azure.datalake.store.ADLFileOutputStream.write(ADLFileOutputStream.java:77)
>                             at
>                             org.apache.hadoop.fs.adl.AdlFsOutputStream.write(AdlFsOutputStream.java:57)
>                             at
>                             org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:48)
>                             at
>                             java.io.DataOutputStream.write(DataOutputStream.java:88)
>                             at
>                             java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
>                             at
>                             org.apache.accumulo.tserver.logger.LogFileKey.write(LogFileKey.java:87)
>                             at
>                             org.apache.accumulo.tserver.log.DfsLogger.write(DfsLogger.java:537)
> 
>                             We have developed a rudimentary LogCloser
>                             implementation that allows us to recover
>                             from this but overall performance is
>                             significantly impacted by this.
> 
>                              > As for the WAL closing issue on GCS, I
>                             recall a previous thread about that
> 
>                             I searched more for this but wasn't able to
>                             find anything, nor similar re: ADL. I am
>                             also curious about the earlier question:
> 
>                             >> Does Accumulo have a specific write pattern [to
>                             WALs], so that file system may not support it?
> 
>                             as discussions with MS engineers have
>                             suggested, similar to the GCS thread, that
>                             small writes at high volume are, at best,
>                             suboptimal for ADLS.
> 
>                             Regards
> 
>                             Stephen
> 
>                             On Wed, Jun 20, 2018 at 11:20 AM,
>                             Christopher <ctubbsii@apache.org
>                             <mailto:ctubbsii@apache.org>> wrote:
> 
>                                 For what it's worth, this is an Apache
>                                 project, not a Sqrrl project. Amazon is
>                                 free to contribute to Accumulo to
>                                 improve its support of their platform,
>                                 just as anybody is free to do. Amazon
>                                 may start contributing more as a result
>                                 of their acquisition... or they may not.
>                                 There is no reason to expect that their
>                                 acquisition will have any impact
>                                 whatsoever on the platforms Accumulo
>                                 supports, because Accumulo is not, and
>                                 has not ever been, a Sqrrl project
>                                 (although some Sqrrl employees have
>                                 contributed), and thus will not become
>                                 an Amazon project. It has been, and will
>                                 remain, a vendor-neutral Apache project.
>                                 Regardless, we welcome contributions
>                                 from anybody which would improve
>                                 Accumulo's support of any additional
>                                 platform alternatives to HDFS, whether
>                                 it be GCS, S3, or something else.
> 
>                                 As for the WAL closing issue on GCS, I
>                                 recall a previous thread about that... I
>                                 think a simple patch might be possible
>                                 to solve that issue, but to date, nobody
>                                 has contributed a fix. If somebody is
>                                 interested in using Accumulo on GCS, I'd
>                                 like to encourage them to submit any
>                                 bugs they encounter, and any patches (if
>                                 they are able) which resolve those bugs.
>                                 If they need help submitting a fix,
>                                 please ask on the dev@ list.
> 
> 
> 
>                                 On Wed, Jun 20, 2018 at 8:21 AM Geoffry
>                                 Roberts <threadedblue@gmail.com
>                                 <mailto:threadedblue@gmail.com>> wrote:
> 
>                                     Maxim,
> 
>                                     Interesting that you were able to
>                                     run A on GCS.  I never thought of
>                                     that--good to know.
> 
>                                     Since I am now an AWS guy (at least
>                                     for the time being), in light of the
>                                     fact that Amazon purchased Sqrrl,  I
>                                     am interested to see what develops.
> 
> 
>                                     On Wed, Jun 20, 2018 at 5:15 AM,
>                                     Maxim Kolchin <kolchinmax@gmail.com
>                                     <mailto:kolchinmax@gmail.com>> wrote:
> 
>                                         Hi Geoffry,
> 
>                                         Thank you for the feedback!
> 
>                                         Thanks to [1, 2], I was able to
>                                         run an Accumulo cluster on Google
>                                         VMs and with GCS instead of
>                                         HDFS. And I used Google Dataproc
>                                         to run Hadoop jobs on Accumulo.
>                                         Almost everything was good until
>                                         I faced some connection
>                                         issues with GCS. Quite often,
>                                         the connection to GCS breaks on
>                                         writing or closing WALs.
> 
>                                         To all,
> 
>                                         Does Accumulo have a specific
>                                         write pattern, so that file
>                                         system may not support it? Are
>                                         there Accumulo properties which
>                                         I can play with to adjust the
>                                         write pattern?
> 
>                                         [1]:
>                                         https://github.com/cybermaggedon/accumulo-gs
>                                         [2]:
>                                         https://github.com/cybermaggedon/accumulo-docker
> 
>                                         Thank you!
>                                         Maxim
> 
>                                         On Tue, Jun 19, 2018 at 10:31 PM
>                                         Geoffry Roberts
>                                         <threadedblue@gmail.com
>                                         <mailto:threadedblue@gmail.com>>
>                                         wrote:
> 
>                                             I tried running Accumulo on
>                                             Google.  I first tried
>                                             running it on Google's
>                                             pre-made Hadoop.  I found
>                                             the various file paths one
>                                             must contend with are
>                                             different on Google than on
>                                             a straight download from
>                                             Apache.  It seems they moved
>                                             things around.  To counter
>                                             this, I installed my own
>                                             Hadoop along with Zookeeper
>                                             and Accumulo on a
>                                             Google node.  All went well
>                                             until one fine day when I
>                                             could no longer log in.  It
>                                             seems Google had pushed out
>                                             some changes overnight that
>                                             broke my client-side Google
>                                             Cloud installation. 
>                                             Google referred the affected
>                                             to a lengthy,
>                                             easy-to-make-a-mistake
>                                             procedure for resolving the
>                                             issue.
> 
>                                             I decided life was too short
>                                             for this kind of thing and
>                                             switched to Amazon.
> 
>                                             On Tue, Jun 19, 2018 at 7:34
>                                             AM, Maxim Kolchin
>                                             <kolchinmax@gmail.com
>                                             <mailto:kolchinmax@gmail.com>>
>                                             wrote:
> 
>                                                 Hi all,
> 
>                                                 Does anyone have
>                                                 experience running
>                                                 Accumulo on top of
>                                                 Google Cloud Storage
>                                                 instead of HDFS? In [1]
>                                                 you can see some details
>                                                 if you never heard about
>                                                 this feature.
> 
>                                                 I see some discussion
>                                                 (see [2], [3]) around
>                                                 this topic, but it looks
>                                                 to me that this isn't as
>                                                 popular as, I believe,
>                                                 it should be.
> 
>                                                 [1]:
>                                                 https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage
>                                                 [2]:
>                                                 https://github.com/apache/accumulo/issues/428
>                                                 [3]:
>                                                 https://github.com/GoogleCloudPlatform/bigdata-interop/issues/103
> 
>                                                 Best regards,
>                                                 Maxim
> 
> 
> 
> 
>                                             -- 
>                                             There are ways and there are
>                                             ways,
> 
>                                             Geoffry Roberts
> 
> 
> 
> 
>                                     -- 
>                                     There are ways and there are ways,
> 
>                                     Geoffry Roberts
> 
> 
> 
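
For a rough sense of scale, the peak rates reported in the January 2019 message above can be broken down per machine. This is only a back-of-the-envelope sketch: it assumes the load was spread evenly across client machines and tablet servers, which the thread does not state, and the helper name `per_node_rate` is made up for illustration. The read figure is given as "1M+", so it is treated here as a lower bound.

```python
def per_node_rate(total_entries_per_sec, nodes):
    """Evenly divide an aggregate rate across a number of machines."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return total_entries_per_sec / nodes


# Batch write case: 1.5M entries/sec peak, 40 clients, 4 tservers.
write_peak = 1_500_000
write_per_client = per_node_rate(write_peak, 40)
write_per_tserver = per_node_rate(write_peak, 4)

# Sequential read case: at least 1M entries/sec peak, 5 clients, 2 tservers.
read_floor = 1_000_000
read_per_client = per_node_rate(read_floor, 5)
read_per_tserver = per_node_rate(read_floor, 2)

print(f"write: {write_per_client:,.0f}/client, {write_per_tserver:,.0f}/tserver")
print(f"read:  {read_per_client:,.0f}/client, {read_per_tserver:,.0f}/tserver")
```

Under the even-spread assumption, each n1-standard-4 tablet server would be absorbing several hundred thousand entries per second at peak; real distributions are likely skewed by split points and client scheduling.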
