hbase-user mailing list archives

From "Puri, Aseem" <Aseem.P...@Honeywell.com>
Subject RE: Some HBase FAQ
Date Tue, 14 Apr 2009 06:50:36 GMT
Hi Ryan,

So that means a RegionServer holds only the index files of its regions,
and the actual data stays on HDFS.

Thanks & Regards
Aseem Puri

-----Original Message-----
From: Ryan Rawson [mailto:ryanobjc@gmail.com] 
Sent: Tuesday, April 14, 2009 12:16 PM
To: hbase-user@hadoop.apache.org
Subject: Re: Some HBase FAQ

HBase loads the index of the files on start-up. If you ran out of memory
for those indexes (which are a fraction of the data size), you'd crash
with an OOME.

The index is supposed to be a smallish fraction of the total data size.

I wouldn't run with less than -Xmx2000m
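
As a back-of-the-envelope illustration of why the index tends to be a small fraction of the data, here is a minimal sketch. It assumes a sparse index that keeps one entry for every `index_interval` rows and that each entry costs roughly the key size plus a fixed overhead. The function names, the interval, and the overhead figure are all illustrative assumptions, not measured HBase values.

```python
def estimate_index_bytes(num_rows, avg_key_bytes, index_interval,
                         per_entry_overhead=16):
    """Rough heap estimate for a sparse index over sorted rows.

    per_entry_overhead is an assumed fixed cost per index entry
    (file offset, object headers); it is not an HBase constant.
    """
    if index_interval <= 0:
        raise ValueError("index_interval must be positive")
    # Ceiling division: one index entry per started interval of rows.
    index_entries = -(-num_rows // index_interval)
    return index_entries * (avg_key_bytes + per_entry_overhead)


def index_fraction(num_rows, avg_key_bytes, avg_value_bytes, index_interval):
    """Estimated index size divided by estimated raw data size."""
    data_bytes = num_rows * (avg_key_bytes + avg_value_bytes)
    if data_bytes <= 0:
        raise ValueError("data size must be positive")
    index_bytes = estimate_index_bytes(num_rows, avg_key_bytes, index_interval)
    return index_bytes / data_bytes
```

Under these assumptions the fraction shrinks as values get larger or the interval gets wider, which is why the heap needed for indexes grows much more slowly than the data on HDFS.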

On Mon, Apr 13, 2009 at 10:48 PM, Puri, Aseem
<Aseem.Puri@honeywell.com> wrote:

>
> -----Original Message-----
> From: Erik Holstad [mailto:erikholstad@gmail.com]
> Sent: Monday, April 13, 2009 9:47 PM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Some HBase FAQ
>
> On Mon, Apr 13, 2009 at 7:12 AM, Puri, Aseem
> <Aseem.Puri@honeywell.com> wrote:
>
> > Hi,
> >
> > I am a new HBase user and have some doubts regarding how HBase
> > works. I am working with HBase and things are going fine, but I am
> > not clear on how things work. Please help me by answering these
> > questions.
> >
> >
> >
> > 1.      I am inserting data into an HBase table, and all regions get
> > balanced across the various RegionServers. But what will happen when
> > the data grows and there is not enough space on the RegionServers to
> > accommodate all regions? Will some regions stay on a RegionServer
> > while others live only on HDFS, or will the RegionServers stop taking
> > new data?
> >
> Not really sure what you mean here, but if you are asking what to do
> when you are running out of disk space on the regionservers, the answer
> is to add another machine or two.
>
> --- What I want to ask is this: the HBase RegionServers store region
> data on HDFS, so when the HBase master starts, it loads all region data
> from HDFS onto the RegionServers. What happens if there is not enough
> space on the RegionServers to accommodate new data? Are some regions
> swapped out of a RegionServer to make room for new regions, and swapped
> back in from HDFS when needed? Or does something else happen?
>
> >
> >
> >
> > 2.      When I insert data into an HBase table, 3 to 4 mapfiles are
> > generated for one category, but after some time all the mapfiles are
> > combined into one file. Is this what we call minor compaction?
> >
> When all current mapfiles and the memcache are combined into one file,
> this is called major compaction; see the BigTable paper for more details.
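
The merge described above can be sketched as a toy model. This is not HBase code: it assumes each mapfile is a list of `(row_key, timestamp, value)` tuples already sorted by row key (newest first within a key), and it assumes that only the newest version of each row is kept, which is a simplification. The function name `major_compact` is hypothetical.

```python
import heapq


def _order(cell):
    # Row key ascending, then timestamp descending (newest first).
    return (cell[0], -cell[1])


def major_compact(mapfiles, memcache):
    """Merge every sorted mapfile plus the memcache into one sorted run.

    Toy model only: keeps the newest version of each row key, which is
    an assumed simplification of real version handling.
    """
    runs = [list(f) for f in mapfiles]
    # The in-memory cells are unsorted here, so sort them into a run.
    runs.append(sorted(memcache, key=_order))
    compacted = []
    for row_key, timestamp, value in heapq.merge(*runs, key=_order):
        if compacted and compacted[-1][0] == row_key:
            continue  # older version of a row already emitted
        compacted.append((row_key, timestamp, value))
    return compacted
```

Because every input run is already sorted, the merge is a single streaming pass, so the cost is dominated by reading and rewriting the data rather than by sorting it.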
>
> >
> >
> >
> > 3.      In my application, a table in HBase will be updated
> > frequently. Should I use some other database, such as MySQL, as an
> > intermediate store to hold data temporarily and then do bulk updates
> > on HBase, or should I update HBase directly? Please tell me which
> > technique is better optimized for HBase.
> >
> HBase is fast for reads, which has so far been the main focus of the
> development; with 0.20 we can hopefully add even fast random reading to
> it to make it a more well-rounded system. Is HBase too slow for you
> today when writing to it, and what are your requirements?
>
> ---- Basically, my question is about write operations. I have no
> complex requirements. I want your suggestion on which technique I
> should follow for writes:
>
> a. When there is an update, store the data temporarily in MySQL and
> then do a bulk update on HBase.
>
> b. When there is an update, write it directly to HBase instead of
> writing it to MySQL and doing a bulk update on HBase later.
>
> Which approach do you think is more optimized?
>
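
To make the trade-off between options (a) and (b) concrete, here is a minimal sketch of a middle ground: buffering updates in the client and sending them in batches, without a separate staging database. The thread does not recommend this; it is an illustration only. `BufferedWriter` and the `sink` callable are hypothetical names, and `sink` stands in for whatever call actually writes a batch to the table.

```python
class BufferedWriter:
    """Collects updates in memory and hands them to `sink` in batches."""

    def __init__(self, sink, max_buffered=100):
        if max_buffered <= 0:
            raise ValueError("max_buffered must be positive")
        self._sink = sink          # hypothetical: writes one batch to the store
        self._max = max_buffered   # flush threshold, an assumed tuning knob
        self._buffer = []

    def put(self, row_key, column, value):
        """Queue one update; flush automatically when the buffer is full."""
        self._buffer.append((row_key, column, value))
        if len(self._buffer) >= self._max:
            self.flush()

    def flush(self):
        """Send any queued updates as a single batch."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._sink(batch)
```

The cost of this design is that updates still sitting in the buffer are lost if the client crashes before `flush()` runs, whereas a staging database as in option (a) would persist them.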
