hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject RE: Using HBase on other file systems
Date Sat, 15 May 2010 21:30:49 GMT
No, Todd was not specifying some kind of minimum. The point was that the more spindles, the
better for an I/O-parallel architecture like HDFS and BigTable. Have you read the BigTable paper?

   - Andy
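
The spindle arithmetic behind this exchange can be sketched in a few lines. This is a rough illustration only: the figure of about 90 random IOPS per 7200 RPM disk is an assumption, taken from the "4 disks = approx 360 IOPS" estimate quoted below, and the function name is hypothetical. It treats the disks as independent (no RAID), so the aggregate simply scales with the number of spindles.

```python
# Assumption: roughly 90 random IOPS per 7200 RPM SATA disk. This is the
# per-disk figure implied by the quoted "4 disks = approx 360 IOPS" estimate,
# not a measured or HBase-specified value.
ASSUMED_IOPS_PER_7200RPM_DISK = 90


def aggregate_random_iops(spindles, iops_per_disk=ASSUMED_IOPS_PER_7200RPM_DISK):
    """Rough upper bound on random IOPS for a node with independent disks."""
    if spindles < 0:
        raise ValueError("spindles must be non-negative")
    return spindles * iops_per_disk


# The range of disk counts mentioned in the thread for typical clusters.
for spindles in (4, 8, 12):
    print(spindles, "disks ->", aggregate_random_iops(spindles), "IOPS")
```

Under that assumption the number is not a floor for running HBase; it only shows why adding spindles raises the random-read capacity of each node.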

> From: Gibbon, Robert, VF-Group
> Subject: RE: Using HBase on other file systems
> 
> Todd, thanks for replying. So 4x 7200 RPM spindles and no RAID =
> approx. 360 IOPS to/from the backend storage, as a per-node
> minimum to run an HBase cluster.
> 
> Right?
> 
> cheers
> Robert
> 
> -----Original Message-----
> From: Todd Lipcon [mailto:todd@cloudera.com]
> Sent: Sat 5/15/2010 3:51 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Using HBase on other file systems
>  
> On Fri, May 14, 2010 at 2:15 PM, Gibbon, Robert, VF-Group
> <Robert.Gibbon@vodafone.com> wrote:
> 
> > Hmm. What level of IOPS does HBase need in order to support a reasonably
> > responsive level of service? How much latency in transfer times is
> > acceptable before the nodes start to fail? Do you use asynchronous IO
> > queueing? Write-through caching? Prefetching?
> >
> >
> Hi Robert. Have you read the Bigtable paper? It's a good description of the
> general IO architecture of BigTable. You can also read the original paper on
> Log-structured merge tree storage from back in the 90s.
> 
> To answer your questions in brief:
> - Typical clusters run on between 4 and 12x 7200RPM SATA disks. Some people
>   run on 10k disks to get more random reads per second, but that is not
>   necessary.
> - Latency in transfer times is a matter of what your application needs, not
>   a matter of what HBase needs.
> - No, we do not asynchronously queue reads. AIO support is lacking in Java 6,
>   and even in the current previews of Java 7 it is a thin wrapper around
>   threadpools and synchronous IO APIs.
> - HBase uses log-structured storage, which is somewhat the same as
>   write-through caching in a way. We never do random writes (in fact they're
>   impossible in HDFS).
> 
> -Todd
> 
> 
> >
> > On Fri, May 14, 2010 at 12:02 PM, Gibbon, Robert, VF-Group
> > <Robert.Gibbon@vodafone.com> wrote:
> >
> > >
> > > My thinking is around separation of concerns - at an OU level not just at a
> > > system integration level. Walrus gives me a consistent, usable abstraction
> > > layer to transparently substitute the storage implementation - for example
> > > from symmetrix <--> isilon or anything in between. Walrus is storage
> > > subsystem agnostic, so it need not be configured for inconsistency like the
> > > Amazon service it emulates.
> > >
> > > Tight coupling for lock-in is a great commercial technique often seen with
> > > suppliers. But it is a bad one. Very bad.
> > >
> >
> > However, reasonably tight coupling between a database (HBase) and its
> > storage layer (HDFS) is IMHO absolutely necessary to achieve a certain level
> > of correctness and performance. In HBase's case we use the Hadoop FileSystem
> > interface, so in theory it will work on any filesystem that implements said
> > interface, but I wouldn't run a production instance on anything but HDFS.
> >
> > It's worth noting that most commercial databases operate on direct block
> > devices rather than on top of filesystems, so that they don't have to deal
> > with varying semantics/performance between ext3, ext4, xfs, ufs, and the
> > myriad other single-node filesystems that exist.
> >
> > -Todd
> >
> >
> > >
> > >
> > > -----Original Message-----
> > > From: Andrew Purtell [mailto:apurtell@apache.org]
> > > Sent: Thu 5/13/2010 11:54 PM
> > > To: hbase-user@hadoop.apache.org
> > > Subject: RE: Using HBase on other file systems
> > >
> > > You really want to run HBase backed by Eucalyptus' Walrus? What do you have
> > > behind that?
> > >
> > > > From: Gibbon, Robert, VF-Group
> > > > Subject: RE: Using HBase on other file systems
> > > [...]
> > > > NB. I checked out running HBase over Walrus (an AWS S3
> > > > clone): bork - you want me to file a Jira on that?
> > >
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
> >
> >
> 
> 
> -- 
> Todd Lipcon
> Software Engineer, Cloudera
> 
> 
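
Todd's remark in the quoted thread, that HBase uses log-structured storage and never does random writes, can be illustrated with a toy sketch. This is not HBase's implementation: the class, its method names, and the flush threshold are all hypothetical, and everything lives in memory. It only shows the general pattern of appending to a log, buffering recent writes, flushing them as immutable sorted segments, and merging segments later instead of updating data in place.

```python
class ToyLogStructuredStore:
    """Toy illustration only; names and structure are hypothetical, not HBase's."""

    def __init__(self, flush_threshold=2):
        self.log = []          # append-only log of every write
        self.memtable = {}     # recent writes, buffered in memory
        self.segments = []     # immutable sorted runs, oldest first
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        # A write is a sequential append plus an in-memory update;
        # no existing segment is ever modified in place.
        self.log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the buffered entries out as one new immutable sorted segment.
        if self.memtable:
            self.segments.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}

    def get(self, key):
        # Newest data wins: check the memtable, then segments newest first.
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):
            for k, v in segment:
                if k == key:
                    return v
        return None

    def compact(self):
        # Merge all segments into one; iterating oldest first lets newer
        # values overwrite older ones for the same key.
        merged = {}
        for segment in self.segments:
            merged.update(segment)
        self.segments = [tuple(sorted(merged.items()))] if merged else []
```

For example, after `put("a", 1)`, `put("b", 2)`, and `put("a", 3)` with the default threshold, `get("a")` returns the newer value while the older one still sits untouched in the first segment until `compact()` merges it away.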


      

