hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: hbase and hadoop (for "normal" hdfs) cluster together?
Date Thu, 31 Jul 2014 16:08:44 GMT
What's the read/write mix in your workload?

Have you looked at HBASE-10070 'HBase read high-availability using
timeline-consistent region replicas' (phase 1 has been merged for the
upcoming 1.0 release) ?
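
With region replicas, a table carries extra read-only copies of each region and clients opt in to timeline-consistent reads per request. A minimal sketch, assuming HBase 1.0+ (the table and family names here are made up for illustration):

```
# HBase shell sketch: host a second, read-only replica of every region
create 'filemeta', 'meta', {REGION_REPLICATION => 2}
```

On the client side, a read that can tolerate possibly stale data sets Consistency.TIMELINE on the Get or Scan and can check Result.isStale(); the default, Consistency.STRONG, behaves as before.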


On Thu, Jul 31, 2014 at 8:17 AM, Wilm Schumacher <wilm.schumacher@cawoom.com
> wrote:

> Hi,
> I have a conceptual question and would appreciate hints.
> My task is to save files to HDFS, maintain some information about
> them in an HBase database, and then serve both to the application.
> Per file I have around 50 rows with 10 columns (in 2 column families) in
> the tables, which have string values of length around 100.
> The files are of ordinary size (perhaps from a few kB to 100 MB or so).
> By this estimate the number of files is far smaller than the
> number of rows (times columns), but the files take far more disk space
> than the HBase data does. I would further estimate that
> for every get on a file there are around hundreds of getRows on
> the HBase side.
> For the files I want to run a Hadoop cluster (obviously). The question
> now arises: should I run HBase on the same Hadoop cluster?
> The pro of running them together is obvious: I would only have to run
> one Hadoop cluster, which would save time, money and nerves.
> On the other hand, it wouldn't be possible to tune the cluster
> specifically for one task or the other. E.g. if I want to
> make HBase more "distributed" by raising its replication factor (to,
> let's say, 6), I would have to double the disk usage for the
> "normal" files, too.
> So: what should I do?
> Do you have any comments or hints on this question?
> Best wishes,
> wilm
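
One point worth noting on the replication concern above: HDFS replication is a per-file attribute, not a cluster-wide constant, so HBase's store files and the "normal" files do not have to share a factor even on one cluster. A sketch using standard HDFS shell commands (the paths assume the default hbase.rootdir of /hbase and a hypothetical user directory):

```
# Raise replication to 6 for the existing HBase store files only
hdfs dfs -setrep -w 6 /hbase

# Files stored elsewhere keep their own replication (typically 3)
hdfs dfs -ls /user/wilm/files
```

Note that setrep only changes files that already exist; files written later take the writer's configured dfs.replication, so for new store files the same effect would come from setting dfs.replication in HBase's own configuration.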
