hadoop-general mailing list archives

From Jagane Sundar <jag...@apache.org>
Subject Re: Which proposed distro of Hadoop, 0.20.206 or 0.22, will be better for HBase?
Date Thu, 06 Oct 2011 05:40:53 GMT
On Wed, Oct 5, 2011 at 10:09 PM, Konstantin Boudnik <cos@apache.org> wrote:

> On Wed, Oct 05, 2011 at 07:00PM, Jagane Sundar wrote:
> > approaches you are familiar with. Chef/Puppet et al. are not interesting to
> Is this a technical lack of interest, as in these solutions do not perform as
> you expect them to, or is this a policy thing of some kind?

No policy or anything of that sort. It's a personal preference. Chef,
Puppet, etc. are not full feedback systems. They keep doing the same thing
over and over again, trying to get the system into a 'desired' state. A
state-machine-driven full feedback system works better: when things go
wrong, that information can be acted upon.
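
To make the distinction concrete, here is a minimal sketch of what I mean by
a state-machine-driven feedback loop. The states, events, and the Node class
are hypothetical names made up for illustration; this is not how any
particular tool is implemented. The point is only that a failure is an
explicit event that moves the node into a state where something can react to
it, rather than the same recipe being re-run blindly.

```python
from enum import Enum, auto


class State(Enum):
    PENDING = auto()
    INSTALLING = auto()
    RUNNING = auto()
    FAILED = auto()


# Hypothetical transition table: (current state, event) -> next state.
TRANSITIONS = {
    (State.PENDING, "start"): State.INSTALLING,
    (State.INSTALLING, "ok"): State.RUNNING,
    (State.INSTALLING, "error"): State.FAILED,
    (State.FAILED, "retry"): State.INSTALLING,
}


class Node:
    """One managed machine, tracked as an explicit state machine."""

    def __init__(self, name):
        self.name = name
        self.state = State.PENDING
        self.history = []  # (event, detail) pairs, i.e. the feedback received

    def handle(self, event, detail=None):
        """Apply an event; reject events that make no sense in this state."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(
                f"event {event!r} not valid in state {self.state.name}"
            )
        self.state = TRANSITIONS[key]
        self.history.append((event, detail))
        return self.state
```

A controller would call handle("error", "disk full") when an install step
reports a failure, look at the detail, and decide whether to issue "retry" or
escalate, instead of simply converging again on the next run.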

> > turned out to be slow as sh**, they seem to have hacked the HDFS layer some
> > more, in order to actually have a NameNode for metadata, but to use S3 for
> > storing blocks. They have a protocol s3 to access this. Both of these
> > approaches have one severe failing - they do not support Append and Hflush.
> > ergo - no HBase on EMR. I am sure they are working furiously to address this
> I wonder if you can delve into these details: is it an inherent problem of the
> s3 protocol or something irrelevant to the technicalities?

I don't know nearly enough. I would speculate that it is because of S3's
roots as an HTTP-based system. It was mostly REST and SOAP APIs that S3
published. I know that people have built full-blown FUSE filesystems using
S3 as the backend, but these tend to be used as a replacement for scp and
ftp, not necessarily for running applications that need full POSIX.
Internally, there are probably other APIs that are available to EMR, but
still, it feels like they may be stressing S3 in ways that are not natural
to it.
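
One way to picture the speculation above: if a store only offers whole-object
PUT and GET, an append has to be emulated by reading the object back and
writing the whole thing again. The sketch below uses a toy in-memory class
(ToyObjectStore and emulated_append are hypothetical names, not the S3 API,
and I am not claiming this is what EMR does internally). It only shows why a
write-ahead-log style workload, which appends small records constantly, fits
such a store poorly.

```python
class ToyObjectStore:
    """In-memory stand-in for a whole-object PUT/GET store (not the S3 API)."""

    def __init__(self):
        self._objects = {}
        self.bytes_uploaded = 0  # total bytes sent across all PUTs

    def put(self, key, data):
        # A PUT replaces the entire object; there is no partial write.
        self._objects[key] = bytes(data)
        self.bytes_uploaded += len(data)

    def get(self, key):
        return self._objects[key]


def emulated_append(store, key, extra):
    """Append by read-modify-write: fetch everything, re-upload everything."""
    try:
        current = store.get(key)
    except KeyError:
        current = b""
    store.put(key, current + extra)
```

Every emulated append re-uploads the full object, so the cost of each small
write grows with the size of the file, and there is nothing here resembling
an hflush that makes a partial write visible to readers.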


