hadoop-common-user mailing list archives

From "Bryan A. P. Pendleton" <...@geekdom.net>
Subject Re: s3
Date Mon, 08 Jan 2007 22:05:27 GMT
S3 has a lot of somewhat weird limits right now, which make some of this
tricky for the common case. For instance, files can only be stored as a
single S3 object if they are less than 5 GB and not between 2 GB and 4 GB
in size.
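The limits above can be sketched as a simple eligibility check. This is a minimal illustration, assuming the thresholds exactly as stated in this message; the class and method names are hypothetical, and the limits may not reflect later S3 behavior.

```java
// Hypothetical sketch: can a file of a given size be stored as a
// single S3 object, given the limits described above?
public class S3SizeCheck {
    static final long GB = 1L << 30;

    // True if the file fits in one object: under 5 GB, and not in
    // the problematic 2-4 GB range mentioned in the message.
    static boolean fitsSingleObject(long sizeBytes) {
        if (sizeBytes >= 5 * GB) return false;
        if (sizeBytes >= 2 * GB && sizeBytes <= 4 * GB) return false;
        return true;
    }

    public static void main(String[] args) {
        System.out.println(fitsSingleObject(1 * GB)); // true
        System.out.println(fitsSingleObject(3 * GB)); // false
    }
}
```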

In any case, I'd vote for not segmenting these cases, and instead using
something like the metadata on the uploaded object to distinguish between
"it's a full object" and "it's an inode, listing blocks". Another thing
that would be handy would be naming the blocks as a variant on the inode
name, so that it's possible to "clean up" after error conditions without
having to read the full list of files, and so that there's an implicit
link between an inode's filename and the blocks it stores.
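The scheme proposed above could look something like the following sketch. The metadata key ("fs-type"), its values, and the ".block-N" key suffix are all illustrative assumptions, not the actual Hadoop S3 implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the metadata/naming scheme discussed above:
// object metadata says whether a key is a raw file or an inode, and
// block keys are derived from the inode key itself.
public class S3NamingSketch {
    static final String TYPE_KEY = "fs-type"; // assumed metadata key
    static final String RAW = "file";         // whole file in one object
    static final String INODE = "inode";      // object holds a block list

    // Derive a block's key from its inode's key, so blocks can be
    // found (and cleaned up) from the inode name alone, without
    // scanning the full list of files.
    static String blockKey(String inodeKey, int blockIndex) {
        return inodeKey + ".block-" + blockIndex;
    }

    // Decide how to interpret an object from its metadata.
    static boolean isInode(Map<String, String> metadata) {
        return INODE.equals(metadata.get(TYPE_KEY));
    }

    public static void main(String[] args) {
        Map<String, String> meta = new HashMap<>();
        meta.put(TYPE_KEY, INODE);
        System.out.println(isInode(meta));                 // true
        System.out.println(blockKey("/user/data.seq", 0)); // /user/data.seq.block-0
    }
}
```

The design benefit is the implicit link: a crashed upload can be garbage-collected by listing keys with the inode's prefix, rather than reading every file's block list.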

On 1/8/07, Doug Cutting <cutting@apache.org> wrote:
> Tom White wrote:
> > This sounds like a good plan. I wonder whether the existing
> > block-based s3 scheme should be renamed (as s3block or similar) so s3
> > is the scheme that stores raw files as you describe?
> Perhaps "s3fs" would be best for the full FileSystem implementation, and
> simply "s3" for direct HTTP access?
> Doug

Bryan A. P. Pendleton
Ph: (877) geek-1-bp
