Mailing-List: contact user-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hbase.apache.org
MIME-Version: 1.0
In-Reply-To: <CAPd2BNk=Ye3GDFdS3aumkfPcTOS2CvSEF5K56aESEt3aymTRAQ@mail.gmail.com>
References: <CAPd2BNn7Q9XRfjR4bCZHU0SCLe+Q=LheB+JjA+xVYM1LL__5Jg@mail.gmail.com>
 <CAM7-19LvZFktx91W3XCjmFRb2s27TtgHbNAxVcHES8_ZY0DipQ@mail.gmail.com> <CAPd2BNk=Ye3GDFdS3aumkfPcTOS2CvSEF5K56aESEt3aymTRAQ@mail.gmail.com>
From: Anoop John <anoop.hbase@gmail.com>
Date: Tue, 6 Jun 2017 19:58:54 +0530
Message-ID: <CAOtJ30rwi7tkeXrwQ3jCMmmi3b98VSdjSxR3Ptc13S0RFf7HtA@mail.gmail.com>
Subject: Re: Any Repercussions of using Multiwal
To: "user@hbase.apache.org" <user@hbase.apache.org>
Cc: Raghavendra Pandey <raghavendra.pandey@gmail.com>
Content-Type: text/plain; charset="UTF-8"
archived-at: Tue, 06 Jun 2017 14:29:00 -0000

You can configure this maximum number of WALs (as Yu said, via
hbase.regionserver.maxlogs). When the total count of un-archived WAL
files exceeds this value, we force flushes and so release some of the
WALs. As Yu mentioned, when we use multiwal and have, say, 2 WAL groups,
the effective WAL limit becomes 32 * 2 = 64. But you can configure it to
a lower value than the default of 32.
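
A rough sketch of that arithmetic, in plain Python rather than HBase
code. The function names are made up for illustration, and the per-group
check simply follows the description in this thread; it is not taken from
the HBase source.

```python
# Hypothetical helpers, not part of HBase: they only model the WAL cap
# described in this thread.
DEFAULT_MAX_LOGS = 32  # default of hbase.regionserver.maxlogs, per this thread


def effective_max_wals(max_logs=DEFAULT_MAX_LOGS, wal_groups=1):
    """Upper bound on un-archived WAL files for one region server."""
    if max_logs < 1 or wal_groups < 1:
        raise ValueError("max_logs and wal_groups must be positive")
    return max_logs * wal_groups


def needs_forced_flush(unarchived_wals_in_group, max_logs=DEFAULT_MAX_LOGS):
    """True once a group's un-archived WAL count exceeds the limit."""
    return unarchived_wals_in_group > max_logs


print(effective_max_wals())                            # single WAL: 32
print(effective_max_wals(wal_groups=2))                # multiwal, 2 groups: 64
print(effective_max_wals(max_logs=16, wal_groups=2))   # lowered limit: 32
```

So halving maxlogs with 2 groups would bring the region server back to
roughly the single-WAL ceiling.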

-Anoop-

On Tue, Jun 6, 2017 at 6:12 PM, Sachin Jain <sachinjain024@gmail.com> wrote:
> Thanks Yu!
> It was certainly helpful.
>
>> Regarding the issue you met, what's the setting of
>> hbase.regionserver.maxlogs in your env? By default it's 32, which means
>> that for each RS the number of un-archived WALs shouldn't exceed 32.
>> However, when multiwal is enabled, it allows 32 logs for each group,
>> thus allowing 64 WALs for a single RS.
>
> I used the default configuration for this. As I understand it, multiwal
> means there is a different WAL per region. Can you please explain how you
> got 64 WALs for a region server?
>
>> when multiwal is enabled, it allows 32 logs for each group, thus
>> allowing 64 WALs for a single RS.
>
> I thought one of the side effects of having multiwal enabled is that there
> will be *a large amount of data waiting in un-archived WALs.*
> So if a region server fails, it would take more time to replay the WAL
> files, and hence it could *compromise availability.*
>
> What do you think?
>
> Thanks
> -Sachin
>
>
> On Tue, Jun 6, 2017 at 2:04 PM, Yu Li <carp84@gmail.com> wrote:
>
>> Hi Sachin,
>>
>> We have been using multiwal in production here at Alibaba for over 2 years
>> and have seen no problems. Facebook is also running multiwal online.
>> Please refer to HBASE-14457
>> <https://issues.apache.org/jira/browse/HBASE-14457> for more details.
>>
>> There's also a JIRA, HBASE-15131
>> <https://issues.apache.org/jira/browse/HBASE-15131>, proposing to turn on
>> multiwal by default. It is still under discussion, so please feel free to
>> add your voice there.
>>
>> Regarding the issue you met, what's the setting of
>> hbase.regionserver.maxlogs in your env? By default it's 32, which means
>> that for each RS the number of un-archived WALs shouldn't exceed 32.
>> However, when multiwal is enabled, it allows 32 logs for each group,
>> thus allowing 64 WALs for a single RS.
>>
>> Let me further explain how this leads to RegionTooBusyException:
>> 1. If the number of un-archived WALs exceeds the setting, the region
>> server checks the oldest WAL and flushes all regions involved in it.
>> 2. If the data ingestion rate is high and the WAL keeps rolling, many
>> small HFiles get flushed out, faster than compaction can keep up with.
>> 3. When the number of HFiles in one store exceeds the setting of
>> hbase.hstore.blockingStoreFiles (10 by default), the flush is delayed
>> for hbase.hstore.blockingWaitTime (90s by default).
>> 4. When data ingestion continues while the flush is delayed, the memstore
>> size may exceed its upper limit, and RegionTooBusyException is thrown.
>>
>> Hope this information helps.
>>
>> Best Regards,
>> Yu
>>
>> On 6 June 2017 at 13:39, Sachin Jain <sachinjain024@gmail.com> wrote:
>>
>> > Hi,
>> >
>> > I ran into a situation where I was getting
>> > *RegionTooBusyException* with a log message something like:
>> >
>> >     *Above Memstore limit, regionName = X ... memstore size = Y and
>> > blockingMemstoreSize = Z*
>> >
>> > This pointed me towards possible *hotspotting* of a particular region.
>> > So I fixed my keyspace partitioning to get a more uniform distribution
>> > per region. It did not completely fix the problem but definitely
>> > delayed it a bit.
>> >
>> > Next, I enabled *multiWal*. As I remember, there is a configuration
>> > which leads to flushing of memstores when the WAL threshold is reached.
>> > After doing this, the problem seems to have gone away.
>> >
>> > But this raises a couple of questions:
>> >
>> > 1. Are there any repercussions of using *multiWal* in a production
>> > environment?
>> > 2. If there are no repercussions and only benefits of using *multiWal*,
>> > why is it not turned on by default? Consumers could then turn it off in
>> > certain (whatever) scenarios.
>> >
>> > PS: *HBase Configuration*
>> > Single Node (Local Setup) v1.3.1 Ubuntu 16 Core machine.
>> >
>> > Thanks
>> > -Sachin
>> >
>>