hadoop-common-user mailing list archives

From "Bryan A. P. Pendleton" ...@geekdom.net>
Subject Re: s3
Date Tue, 09 Jan 2007 21:28:18 GMT
On 1/9/07, Tom White <tom.e.white@gmail.com> wrote:
> > S3 has a lot of somewhat weird limits right now, which make some of this
> > tricky for the common case. Files can only be stored as a single s3 object
> > if they are less than 5gb, and not 2gb-4gb in size, for instance.
> Strange - is this a bug Amazon are fixing do you know?

I believe so, but they haven't promised when. I should note that I haven't
tested that this occurs; I have only read that it does in several threads on
the Amazon S3 forums.
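
To make the reported limits concrete, here is a minimal sketch of the size check a client might do before trying to store a file as a single object. The function name, the reading of "gb" as 2**30 bytes, and the inclusive treatment of the 2gb-4gb range are all assumptions for illustration; the limits themselves are only what was reported on the forums and are untested.

```python
GIB = 1024 ** 3  # assumption: "gb" in this thread means 2**30 bytes


def fits_single_s3_object(size_bytes: int) -> bool:
    """Return True if a file of this size could be stored as one object
    under the limits reported in this thread (unverified forum reports)."""
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    # Reported upper bound: single objects must be smaller than 5gb.
    if size_bytes >= 5 * GIB:
        return False
    # Reported problem range: sizes between 2gb and 4gb fail.
    # Treating both ends as inclusive is a conservative guess.
    if 2 * GIB <= size_bytes <= 4 * GIB:
        return False
    return True
```

A file that fails this check would have to be split across several objects, which is one argument for a block-based layout in the first place.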

> > Another thing that would be handy would
> > be naming the blocks as a variant on the inode name, so that it's possible
> > to "clean up" from erroneous conditions without having to read the full list
> > of files, and so that there's an implicit link between an inode's filename
> > and the blocks that it stored.
> I'm reluctant to name blocks as a variant of the file name, unless we
> want to not support renames. I think a fsck tool would meet your
> requirement to clean up from erroneous conditions.

A fsck tool would be sufficient, I think, though it wouldn't be possible to
"fix up" without enumerating a lot of the metadata of a collection. Seems
like a trade-off that's fine to make - as long as there's an fsck on the
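
As a rough illustration of why an fsck has to enumerate the metadata, here is a minimal sketch of the core check. It assumes, hypothetically, that each inode maps a path to a list of opaque block ids and that the set of stored block ids can be listed; neither the names nor the data layout come from the actual S3 filesystem code.

```python
from typing import Dict, List, Set, Tuple


def fsck(
    inodes: Dict[str, List[str]],
    stored_blocks: Set[str],
) -> Tuple[Set[str], Dict[str, List[str]]]:
    """Compare block references in inodes against the blocks actually stored.

    Returns (orphans, missing):
      orphans - stored block ids that no inode references
      missing - path -> block ids the inode references but that are not stored
    """
    referenced: Set[str] = set()
    missing: Dict[str, List[str]] = {}
    # Every inode must be read, because block ids carry no hint of
    # which file they belong to.
    for path, block_ids in inodes.items():
        for block_id in block_ids:
            referenced.add(block_id)
            if block_id not in stored_blocks:
                missing.setdefault(path, []).append(block_id)
    orphans = stored_blocks - referenced
    return orphans, missing
```

With opaque block ids, finding orphans requires the full pass over all inodes shown above, which is the cost being traded for cheap renames.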

Bryan A. P. Pendleton
Ph: (877) geek-1-bp
