Mailing-List: contact user-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hbase.apache.org
Received-SPF: pass (athena.apache.org: domain of amits@infolinks.com
 designates 207.126.144.137 as permitted sender)
MIME-Version: 1.0
In-Reply-To: <CBFB5AD3.2E352%doug.meil@explorysmedical.com>
References: 
 <CAAMYKhqPskZJh0Jhkj6bOaRJkDsWOa0J6CCioqaqdXiqyYSo3A@mail.gmail.com>
	<CBFB5AD3.2E352%doug.meil@explorysmedical.com>
Date: Sat, 16 Jun 2012 17:17:50 +0300
Message-ID: 
 <CAAMYKhr_sGEQUGc7m8k4Mx0oX0oytKVLDvRj1=f3ED8YV1PJTw@mail.gmail.com>
Subject: Re: The write process in the Region Server
From: Amit Sela <amits@infolinks.com>
To: user@hbase.apache.org
Content-Type: multipart/alternative; boundary=14dae93405d7125b1204c2979985

--14dae93405d7125b1204c2979985
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Thanks Doug, I read the regions section from the book like you recommended
but I still have some questions left.

When running a massive write job, the regionserver log shows the memsize
that is flushed. The problem is that most of the time the memsize is either
much smaller than the configured memstore.flush.size (resulting in more
files being written, which leads to frequent compactions) or bigger
than memstore.flush.size * memstore.block.multiplier (resulting in
"Blocking updates for 'IPC Server handler # on <port>' ..." messages).
In some cases I also see HBaseServer throwing a ClosedChannelException:
"WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler <handler #> on
<port #> caught: java.nio.channels.ClosedChannelException"
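
To make the two thresholds concrete, here is a minimal sketch of how I read
them. The function name and the sample sizes are hypothetical; the two
parameters stand for hbase.hregion.memstore.flush.size and
hbase.hregion.memstore.block.multiplier, and the values used are
illustrative, not necessarily the defaults of any HBase version.

```python
MB = 1024 * 1024

def classify_flush(memsize_bytes, flush_size_bytes, block_multiplier):
    """Classify a flushed memsize against the two configured thresholds.

    This is only arithmetic on the configured values, not HBase code.
    """
    # Assumption: updates are blocked once the memstore grows beyond
    # flush.size * block.multiplier.
    blocking_limit = flush_size_bytes * block_multiplier
    if memsize_bytes > blocking_limit:
        return "blocked"   # writers see "Blocking updates for ..."
    if memsize_bytes < flush_size_bytes:
        return "early"     # flushed before reaching flush.size: small file
    return "normal"        # between flush.size and the blocking limit

# Illustrative values: flush.size = 128 MB, block.multiplier = 2
for memsize in (10 * MB, 150 * MB, 300 * MB):
    print(memsize // MB, "MB ->", classify_flush(memsize, 128 * MB, 2))
```

With these values, what I see in the log falls mostly into the "early" and
"blocked" cases and only rarely into "normal".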

I guess these problems are also the cause of the long pauses (a few
minutes) and, in extreme cases, full GCs during the write jobs.

Any ideas, anyone?

In general, I did some digging and couldn't find much about the write
process in HBase from a "memory usage" point of view, besides the
descriptions of the configuration properties. Maybe it's worth adding to
the book.

Thank you for all your help,

Amit.


On Mon, Jun 11, 2012 at 3:22 PM, Doug Meil
<doug.meil@explorysmedical.com> wrote:

>
> Hi there-
>
> Your understanding is on track.
>
>
> You probably want to read this section..
>
> http://hbase.apache.org/book.html#regions.arch
>
> ... as it covers those topics in more detail.
>
>
>
>
> On 6/10/12 1:02 PM, "Amit Sela" <amits@infolinks.com> wrote:
>
> >Hi all,
> >
> >I'm trying to better understand what's going on in the region server
> >during a write to HBase.
> >
> >As I understand the process:
> >
> >1. Data is written to memstore.
> >2. Once the memstore has reached hbase.hregion.memstore.flush.size, the
> >memstore is flushed and a new StoreFile is written.
> >3. The number of StoreFiles increases until a compaction is triggered.
> >
> >To my understanding, the compaction is triggered after a compaction
> >check is done, either by the CheckCompaction thread running in the
> >background or by the memstore flush being executed.
> >The compaction triggered will be a minor compaction, BUT it could be
> >promoted to major if it includes all store files.
> >When will it NOT include all store files? Say I set compactionThreshold
> >to 3; then when the 3rd (or 4th) flush is executed, a compaction will
> >be triggered and will be promoted to major since it includes all store
> >files.
> >
> >Is this right? Can anyone elaborate?
>
>
>
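
On my quoted question of when a compaction does NOT include all store
files, here is a rough sketch of my current understanding. It is a
simplified model, not HBase's actual selection code: I assume that an old
file is skipped when it is larger than a ratio times the combined size of
the newer files, that at least min_files must remain for a compaction to
run, and that a selection covering every store file is promoted to major.
The function and parameter names are hypothetical (min_files plays the role
of compactionThreshold).

```python
def select_for_compaction(sizes, min_files=3, max_files=10, ratio=1.2):
    """Pick store files for a minor compaction (simplified model).

    sizes: store file sizes, ordered oldest to newest.
    Returns (selected_sizes, promoted_to_major).
    """
    start = 0
    # Skip older files that are too big compared to all newer files combined.
    while start < len(sizes) and sizes[start] > ratio * sum(sizes[start + 1:]):
        start += 1
    # Cap the number of files taken in one compaction.
    selected = sizes[start:start + max_files]
    if len(selected) < min_files:
        return [], False  # not enough candidates, no compaction
    # A selection that covers every store file is treated as major.
    return selected, len(selected) == len(sizes)

# Three similar flushes: everything is selected, so it becomes major.
print(select_for_compaction([100, 100, 100]))
# One large old file plus three small flushes: the old file is left out.
print(select_for_compaction([1000, 10, 10, 10]))
```

If this model is roughly right, the promotion to major in my example only
happens because all three files are fresh flushes of similar size; once a
large compacted file exists, later minor compactions would leave it out.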

--14dae93405d7125b1204c2979985--