From: "Jean-Daniel Cryans" <jdcryans@gmail.com>
To: hbase-user@hadoop.apache.org
Date: Fri, 19 Dec 2008 15:46:18 -0500
Subject: Re: HBase behaviour at startup (compression)

I think we should cut a 0.18.2; it also contains a backport of 1046.

Jean-Adrien, regarding your problem, you also have to take
hbase.regionserver.thread.splitcompact.check.frequency into account. When
a region opens, as we know, a compaction is requested for it. The class
responsible for checking that is CompactSplitThread and, by default, it
starts a compaction every 20 seconds. So even if a compaction takes 0
seconds, you still lose 20 seconds for each and every one of them. In your
particular situation, given that you have 250 regions per region server,
it's easy to understand why startup takes so long.

To fix that, you could run a third region server on the machine that only
has a datanode. You could also lower the value of the config I wrote about
to maybe 10 seconds.

J-D
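To put numbers on it: with about 250 regions per region server and a
20-second check interval, the check frequency alone accounts for up to
roughly 250 x 20 s, i.e. about 83 minutes per region server; at 10 seconds
that drops to about 42 minutes. A minimal hbase-site.xml override along the
lines of the suggestion above might look like the following; the property
name is the one cited in this thread and the millisecond unit is an
assumption, so both should be checked against the 0.18 hbase-default.xml:

  <property>
    <name>hbase.regionserver.thread.splitcompact.check.frequency</name>
    <value>10000</value>
    <description>Assumed to be how often the region server checks for
      queued compactions/splits, in milliseconds. 10000 ms matches the
      10-second suggestion above; the default discussed here is 20
      seconds (20000 ms).</description>
  </property>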
On Fri, Dec 19, 2008 at 3:05 PM, stack wrote:

> Jean-Adrien wrote:
>
>> Hello,
>>
>> Andrew and St.Ack, thanks for your answers, and excuse the confusion
>> between compression and compaction...
>>
>> I reviewed the concept of major/minor compaction in the wiki and I
>> looked at both JIRA cases, HBASE-938 and HBASE-1062. Since I'm running
>> HBase version 0.18.0, I certainly have the HBASE-938 problem. If I
>> understand the problem well, it is that at startup all opened regions
>> that need compaction do a major compaction, since the timestamp of the
>> latest major compaction is not stored anywhere; the (in-memory) counter
>> is reset to the startup time, and the next major compaction will take
>> place (with the default config) one day later.
>>
>
> I say in HBASE-938 that a major compaction runs on every restart, but I
> was incorrect. Later in the issue, having studied the code, I recant
> (the 'last' major compaction timestamp is that of the oldest file in
> the filesystem).
>
> Later in HBASE-938, we home in on the fact that even in the case where
> the last compaction was a major compaction, if the major compaction
> interval elapses we'd run a new major compaction. Essentially we'd
> rewrite data in HBase on a period (as you 'prove' later in this message
> with your replication check).
>
> Can you tell what is running on restart? Is it a major compaction? Or
> add logs of the startup to an issue and I'll take a look. In 0.18.x,
> there is the below if it is a 'major':
>
>   LOG.debug("Major compaction triggered on store: " + this.storeNameStr +
>     ". Time since last major compaction: " +
>     ((System.currentTimeMillis() - lowTimestamp)/1000) + " seconds");
>
> The thing I'm not clear on is why all the compacting on restart. Why is
> a 'major' compaction triggered if we're looking at the timestamp of the
> oldest file in the filesystem? Perhaps you can add some debug emissions
> to figure it out, Jean-Adrien?
> ...
>
>> Here is where my problem during major compaction may be:
>> I think (I'm not sure, I have to find a better tool to monitor my
>> network) that with my light configuration (see above for details) the
>> problem is this: even if the compaction process is quick (for example,
>> a single modification in a cell leads to a major compaction rewriting
>> the whole file), my regionservers run on the same machines as the
>> datanodes, so they communicate directly (fast) when a RS asks a DN to
>> store a mapfile. The datanode will then place replicas of the blocks on
>> the two other datanodes over the slow 100 Mbit/s network. At HBase
>> startup time, if Hadoop asks the network to transfer about 200 Gb, the
>> bandwidth might be saturated. The lease expires and the RSs shut
>> themselves down. That could also explain the problem of the max
>> Xcievers limit sometimes being reached in the datanodes, which we
>> discussed in a previous post.
>>
>
> Above sounds plausible.
>
> Should we cut a 0.18.2 with HBASE-938 backported (it includes other good
> fixes too -- HBASE-998, etc.)?
>
> St.Ack
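For illustration, here is a minimal, self-contained sketch of the check
described above, where the "last major compaction" time is taken to be the
timestamp of the oldest store file and a major compaction becomes due once
the (default one-day) interval has elapsed. It is an illustration only, not
the actual 0.18 Store code; the class and method names are made up, while
storeNameStr and lowTimestamp echo the LOG.debug line quoted in the message.

  // Hypothetical illustration of the major-compaction-age check discussed
  // in this thread; not HBase source code.
  public class MajorCompactionCheck {

    // Default major compaction interval mentioned in the thread: one day.
    static final long MAJOR_COMPACTION_INTERVAL_MS = 24L * 60 * 60 * 1000;

    static boolean isMajorCompactionDue(String storeNameStr,
        long oldestFileTimestamp) {
      long lowTimestamp = oldestFileTimestamp; // oldest file in the filesystem
      long now = System.currentTimeMillis();
      if (lowTimestamp > 0L && lowTimestamp < (now - MAJOR_COMPACTION_INTERVAL_MS)) {
        // Same message as the snippet quoted above, printed instead of logged.
        System.out.println("Major compaction triggered on store: " + storeNameStr +
            ". Time since last major compaction: " +
            ((now - lowTimestamp) / 1000) + " seconds");
        return true;
      }
      return false;
    }

    public static void main(String[] args) {
      // A store whose oldest file is two days old is due for a major
      // compaction; one whose oldest file is an hour old is not.
      long now = System.currentTimeMillis();
      System.out.println(isMajorCompactionDue("info",
          now - 2L * 24 * 60 * 60 * 1000)); // true
      System.out.println(isMajorCompactionDue("info",
          now - 60L * 60 * 1000));          // false
    }
  }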