hbase-user mailing list archives

From "Jim R. Wilson" <wilson.ji...@gmail.com>
Subject Re: Best practices for HBase in EC2?
Date Sat, 04 Jun 2011 18:49:23 GMT
Thanks Sean,

That's helpful.  I probably should have added some contextual info.  In my
case, I'm interested in providing instructions on how one can fire up an
HBase cluster in EC2 in order to experiment with it: that is, load data,
practice administration, etc.  In that context, it's unlikely that the
person following the instructions would start more than 5 nodes or keep
them running longer than an hour.

I saw archived email threads where people recommended not running on EC2 for
any length of time since you can get better performance-per-cost
characteristics from dedicated hardware (for example from Rackspace).

So I guess my real question is this: What is the easiest possible way to
start a 5-node HBase 0.90.x cluster in EC2?  I'm thinking that S3 is better
for storage, but I'm open to whatever is genuinely the easiest thing to do.

Thanks again,

-- Jim

On Sat, Jun 4, 2011 at 2:40 PM, Sean Bigdatafun
<sean.bigdatafun@gmail.com> wrote:

> Here are my thoughts:
>
> If your data store is for long-term use, consider putting the HDFS
> storage directories on EBS volumes rather than local disk (and the
> NameNode storage directory on EBS as well).  For this setup, I think we
> should consider dfs.replication=2 (or even 1), because EBS itself
> already provides enough reliability.
>
> If your data store is for ephemeral use (say, an EMR computation), you
> may consider just using the EC2-provided ephemeral disks.
>
>
>
>
> On Sat, Jun 4, 2011 at 11:27 AM, Jim R. Wilson <wilson.jim.r@gmail.com>
> wrote:
>
> > Hi HBase community,
> >
> > What are the current best practices with respect to starting up an HBase
> > cluster in EC2?  I don't see any public AMIs newer than 0.89.xxx, and
> > starting up that one, it's clear that it's not configured for HDFS or
> > clustering (empty hbase-site.xml).
> >
> > Do people generally keep data in S3 or HDFS?  If the latter, is it
> > persisted via EBS?  Do the Hadoop nodes have more than one EBS volume
> > attached, to separate HDFS storage from the OS?
> >
> > Any help is much appreciated.  Thanks in advance!
> >
> > -- Jim R. Wilson (jimbojw)
> >
>
>
>
> --
> --Sean
>
