accumulo-user mailing list archives

From James Hughes <jn...@virginia.edu>
Subject Re: Accumulo on Azure / WebHDFS
Date Mon, 17 Apr 2017 21:01:33 GMT
Thanks.  Can you say whether the performance is on par with a cloud you
might otherwise spin up?

In terms of the drop-in bits, is it as easy as setting 'instance.volumes'
to point at the new URL?

Thanks!
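
To make the 'instance.volumes' question concrete, here is a minimal Java sketch of what such a value might look like. 'instance.volumes' takes a comma-separated list of volume URIs; the container name, account name, and path below are hypothetical placeholders, and the class and method names are made up for illustration. The sketch only checks that each entry is a URI with a scheme. It does not show that this one property is sufficient: the thread leaves that open, and the connector jars and account credentials would presumably still need to be configured on the Hadoop side.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class VolumeConfigSketch {

    // Build a comma-separated 'instance.volumes' value from candidate
    // volume URIs, rejecting entries that carry no scheme (hdfs, wasb, adl, ...).
    static String volumesValue(List<String> volumes) {
        List<String> checked = new ArrayList<>();
        for (String v : volumes) {
            URI uri = URI.create(v); // throws IllegalArgumentException on malformed input
            if (uri.getScheme() == null) {
                throw new IllegalArgumentException("volume needs a scheme: " + v);
            }
            checked.add(uri.toString());
        }
        return String.join(",", checked);
    }

    public static void main(String[] args) {
        // Hypothetical WASB location: container "mycontainer" in account "myaccount".
        String value = volumesValue(Arrays.asList(
                "wasb://mycontainer@myaccount.blob.core.windows.net/accumulo"));
        System.out.println("instance.volumes=" + value);
    }
}
```

An ADL volume would follow the same pattern with an adl:// URI in place of the wasb:// one.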

On Mon, Apr 17, 2017 at 4:57 PM, Josh Elser <josh.elser@gmail.com> wrote:

> I don't have any performance numbers handy. I'm not sure if
> Microsoft/Azure-team publishes them.
>
> In general, my understanding is that each of them is intended to be a
> "drop-in replacement". There might be some implementation-specific
> configuration (e.g. account/billing), but that's it.
>
> James Hughes wrote:
>
>> Hi Josh,
>>
>> Thanks again!
>>
>> As a follow-up, is any of the information about Accumulo on WASB or ADL
>> public?  I suppose I'm curious about configuration (is it just
>> plug-and-play?) and performance.
>>
>> Thanks in advance,
>>
>> Jim
>>
>> On Sat, Apr 15, 2017 at 2:25 PM, Josh Elser <josh.elser@gmail.com> wrote:
>>
>>     As I understand it, S3 is currently still a non-starter.
>>
>>     Long term, Amazon may provide some more features to fix the sync
>>     issue. Or, someone can modify Accumulo to support putting rfiles on
>>     s3 exclusively.
>>
>>     Happy to expand on this further if you're curious.
>>
>>
>>     On Apr 14, 2017 15:16, "James Hughes" <jnh5y@virginia.edu> wrote:
>>
>>         Hi Josh,
>>
>>         Thanks!  Sounds like Azure's offerings provide better
>>         performance and sync() behavior than S3?  (I.e., is S3 still a
>>         no-go for Accumulo?)
>>
>>         Your description of WebHDFS makes total sense.  I figured
>>         there may be an outside chance that WebHDFS handled or worked
>>         around limitations from S3, etc.
>>
>>         Cheers,
>>
>>         Jim
>>
>>         On Fri, Apr 14, 2017 at 12:47 PM, Josh Elser <josh.elser@gmail.com> wrote:
>>
>>             Hi Jim,
>>
>>             I can say that Accumulo will work on Azure's blob store and
>>             their data lake store. These are a result of testing I'm
>>             involved with at Hortonworks (dayjob). I know that these
>>             filesystems are tested to an appropriate degree, proving
>>             that they do provide the things that Accumulo needs.
>>
>>             As a refresher, the things we need from a filesystem are:
>>             performance (Accumulo's write performance is pretty
>>             dominated by I/O) and durability guarantees (when we call
>>             sync() on a file, the data we just wrote better be there).
>>
>>             For WebHDFS, I think you would hurt for performance, and I
>>             would be surprised if it actually provided the durability
>>             guarantees. My understanding is that WebHDFS is more meant
>>             to allow non-Java clients easy access to HDFS (as a
>>             one-off) rather than act as a fully-fledged access layer.
>>
>>             - Josh
>>
>>             On Fri, Apr 14, 2017 at 10:16 AM, James Hughes <jnh5y@virginia.edu> wrote:
>>              > Hi all,
>>              >
>>              > I know folks have asked about Accumulo on S3 before (1).
>>              >
>>              > Has anyone tried running Accumulo on Azure's blob storage or data lake
>>              > solutions (2)?  (Or perhaps more generally, has anyone tried Accumulo on
>>              > WebHDFS?)
>>              >
>>              > As more background, I have deployed Accumulo on HDP clouds in Azure, and
>>              > that works great.  I'm interested in using the blob / data lake storage for
>>              > benefits with scaling, etc.
>>              >
>>              > Thanks in advance,
>>              >
>>              > Jim
>>              >
>>              > 1. http://apache-accumulo.1065345.n5.nabble.com/Accumulo-on-s3-td16737.html
>>              > 2. https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-integrate-with-other-services
>>
>>
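
The durability requirement described earlier in the thread (once sync() returns, the data just written had better be there) can be illustrated with a local-filesystem analogy. This is a minimal sketch using only the Java standard library, not Accumulo's write-ahead log code; the class and method names are hypothetical. On a Hadoop filesystem the comparable calls would be hflush()/hsync() on the output stream, and whether a given store honors them is exactly what needs to be tested.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SyncSketch {

    // Append one record and do not return until the OS reports that the
    // bytes have been pushed to the underlying storage device.
    static void appendDurably(Path log, String record) throws IOException {
        try (FileOutputStream out = new FileOutputStream(log.toFile(), true)) {
            out.write((record + "\n").getBytes(StandardCharsets.UTF_8));
            out.flush();        // hand any buffered bytes to the OS
            out.getFD().sync(); // ask the OS to force them to the device
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("sync-sketch", ".log");
        appendDurably(log, "mutation-1");
        appendDurably(log, "mutation-2");
        // Reading back in the same process shows the write path works;
        // it does not by itself prove durability across a crash.
        System.out.println(Files.readAllLines(log));
        Files.delete(log);
    }
}
```

A store that acknowledges the sync call without actually persisting the bytes would pass this happy-path run and still lose data on failure, which is why the thread treats tested sync() behavior as a hard requirement.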
