accumulo-dev mailing list archives

From chutium <teng...@gmail.com>
Subject Re: Accumulo on s3
Date Tue, 26 Apr 2016 09:12:33 GMT
Hi Josh,

About the guarantees of S3, according to this doc from Amazon:
https://docs.aws.amazon.com/ElasticMapReduce/latest/ReleaseGuide/emr-plan-consistent-view.html

> Amazon S3 buckets in xxx, xxx regions provide read-after-write consistency
> for put requests of new objects and eventual consistency for overwrite put
> and delete requests.

So maybe Accumulo would run into consistency problems only during major
compactions, right? It seems no other operation overwrites or deletes
files on HDFS.
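
To make the concern concrete, here is a toy sketch in plain Python of the guarantee quoted above. It is not real S3 or Accumulo code: the class, its methods, and the file names are all made up for illustration, and the "settle" step is just a stand-in for whatever delay S3 needs before overwrites and deletes become visible. It only shows why a delete issued after a compaction might not be seen right away, not how Accumulo actually tracks its files.

```python
_DELETED = object()  # sentinel marking a pending delete (hypothetical helper)


class ToyObjectStore:
    """Toy model of the quoted guarantee: new objects are readable
    immediately, overwrites and deletes only after settle() is called."""

    def __init__(self):
        self._visible = {}  # what readers currently observe
        self._pending = []  # overwrites/deletes that are not yet visible

    def put(self, key, value):
        if key in self._visible:
            # Overwrite put: eventually consistent, so delay it.
            self._pending.append((key, value))
        else:
            # Put of a new object: read-after-write consistent.
            self._visible[key] = value

    def delete(self, key):
        # Delete: eventually consistent, so delay it.
        self._pending.append((key, _DELETED))

    def list_keys(self):
        return sorted(self._visible)

    def settle(self):
        """Apply all delayed operations, as if enough time had passed."""
        for key, value in self._pending:
            if value is _DELETED:
                self._visible.pop(key, None)
            else:
                self._visible[key] = value
        self._pending.clear()


store = ToyObjectStore()
store.put("A.rf", "old data 1")
store.put("B.rf", "old data 2")

# Something like a major compaction: write one merged file, delete the inputs.
store.put("C.rf", "merged data")
store.delete("A.rf")
store.delete("B.rf")

stale_listing = store.list_keys()  # the deleted inputs are still listed
store.settle()
fresh_listing = store.list_keys()  # only the merged file remains
```

In this model the new merged file is visible at once, while the deleted input files linger in the listing until the store settles. Whether that window matters in practice depends on whether any reader relies on the store's own view of overwritten or deleted files.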

Let me describe our usage of Accumulo on S3. Basically, we want to combine
the unlimited storage of S3 with the fine-grained access control
provided by Accumulo.

We are using "Accumulo on S3" as secured storage behind a data processing
engine (Spark). Data is ingested into Accumulo regularly but not in real time
(no single puts, just a batch ingestion every X hours), and most data access
use cases are batch processing, so there are no real-time reads or writes.

Given that, will consistency or sync still be a problem?

I added some thoughts of mine in that Stack Overflow thread:
http://stackoverflow.com/a/36845743/5630352 . What I really want to know is
whether it is possible to solve the S3 problem for our use case, because so
far no other tool seems to provide an access control framework as powerful
as Accumulo's.

Thanks!



--
View this message in context: http://apache-accumulo.1065345.n5.nabble.com/Accumulo-on-s3-tp16737p16764.html
Sent from the Developers mailing list archive at Nabble.com.
