hbase-dev mailing list archives

From Vladimir Rodionov <vladrodio...@gmail.com>
Subject Re: Region is out of bounds
Date Mon, 08 Dec 2014 19:33:32 GMT
Forgot to mention:

A low (default) value for *hbase.hstore.compaction.max* combined with prolonged write
activity without throttling will get you into this *region is out of bounds*
situation, because when compaction can't keep up with writes, the number of
store files eventually exceeds *hbase.hstore.compaction.max*. Here is what
happens in 0.94:

0.94 does not enforce the major compaction flag on selections with reference
files; it applies the selection algorithm and then applies the limit by removing
the first *K = N - max* files from the candidate list, where N is the number of
files in the selection list and *max* is the *hbase.hstore.compaction.max* value.


0.98:

It marks the selection as major, but after that it applies the limit and removes
the first *K = N - max* files from the candidate list.

In both cases, if the number of reference files is greater than *max*, some
reference files will be excluded.
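
A minimal sketch of the trimming described above (not the actual HBase selection
code; the Candidate type and the values in main are illustrative). Candidates are
assumed to be ordered oldest-first by sequence id, so reference (half) files, which
carry the parent region's older sequence ids, sit at the head of the list and are
the first to be dropped when the limit is applied:

import java.util.ArrayList;
import java.util.List;

public class CompactionLimitSketch {

    // Hypothetical stand-in for a store file entry; "reference" marks a half file
    // created by a region split.
    static class Candidate {
        final String name;
        final long sequenceId;
        final boolean reference;
        Candidate(String name, long sequenceId, boolean reference) {
            this.name = name; this.sequenceId = sequenceId; this.reference = reference;
        }
    }

    // Apply the limit the way the thread describes: drop the first K = N - max
    // entries of a list that is already sorted by ascending sequence id.
    static List<Candidate> applyLimit(List<Candidate> sortedOldestFirst, int max) {
        int n = sortedOldestFirst.size();
        if (n <= max) {
            return sortedOldestFirst;
        }
        int k = n - max; // number of oldest files dropped from the selection
        return new ArrayList<Candidate>(sortedOldestFirst.subList(k, n));
    }

    public static void main(String[] args) {
        List<Candidate> candidates = new ArrayList<Candidate>();
        candidates.add(new Candidate("parent-half-a", 10, true));  // reference file
        candidates.add(new Candidate("parent-half-b", 11, true));  // reference file
        for (int i = 0; i < 8; i++) {
            candidates.add(new Candidate("flush-" + i, 100 + i, false)); // new flushes
        }
        // With max = 5, the two reference files fall into the dropped prefix and
        // never make it into the compaction.
        for (Candidate c : applyLimit(candidates, 5)) {
            System.out.println(c.name + " reference=" + c.reference);
        }
    }
}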

What happens later depends on what we have in the compaction file list for
this Store (HStore). If the pending files list has at least one non-reference
file, all reference files will be excluded from the above selection.

What I have to say here: it seems that the only way to compact all
reference files is to enforce a major compaction immediately after the region
split. If we fail to do this, then with very high probability the reference files
will keep being pushed out of compactions until the write load decreases
substantially and the number of store files drops below
*hbase.hstore.compaction.max*.
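
As an operational stopgap (not a fix to the selection logic itself), that
post-split major compaction can be requested from the client; a minimal sketch
against the 0.94/0.98-era client API, where "mytable" is a placeholder table name:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class MajorCompactAfterSplit {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
            // Ask the region servers to major-compact; a region name can be passed
            // instead to target just the freshly split daughters. This only queues
            // the request - it does not change how candidates are selected.
            admin.majorCompact("mytable");
        } finally {
            admin.close();
        }
    }
}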

-Vlad

On Mon, Dec 8, 2014 at 11:14 AM, Vladimir Rodionov <vladrodionov@gmail.com>
wrote:

> Yes, we have a patch in house. Just need time to verify it. As for seq #s,
> the lowest seq # is the reason that reference files are constantly getting
> removed from compaction selections when there are newer files in a compaction
> queue. Just check the code. This is what happens under high load: when there
> are too many minor compaction requests in the queue, reference files do not
> get a chance to be compacted.
>
> Interestingly, the current 0.94 and 0.98 code have different issues here
> and require different patches.
>
> 0.94 does not treat a compaction request with reference files as a major one,
> but it ignores *hbase.hstore.compaction.max* for major compactions;
> 0.98 considers a compaction of reference files a major one, but it consults
> *hbase.hstore.compaction.max* and downgrades the request when the number of
> files exceeds this limit.
>
> -Vlad
>
>
> On Mon, Dec 8, 2014 at 10:41 AM, lars hofhansl <larsh@apache.org> wrote:
>
>> Did you get anywhere?
>> Happy to collaborate, this is important to fix.
>>
>> Thinking about my comment again. The half-store files would have an
>> earlier sequence number than any new files, so they would naturally sort
>> first. This needs a bit more investigation.
>>
>> -- Lars
>>
>>   ------------------------------
>>  *From:* Vladimir Rodionov <vladrodionov@gmail.com>
>> *To:* "dev@hbase.apache.org" <dev@hbase.apache.org>
>> *Cc:* lars hofhansl <larsh@apache.org>
>> *Sent:* Friday, December 5, 2014 10:40 PM
>>
>> *Subject:* Re: Region is out of bounds
>>
>> Under heavy load, the only window for compaction of reference files is
>> the first compaction request after a split (when the Store's filesInCompactions
>> is empty and a major compaction is possible). If this major compaction request
>> fails or is downgraded to a minor one (i.e. the number of store files exceeds
>> the max files per compaction), there is a very high probability that the region
>> will never be split until the load decreases substantially. Once the load
>> decreases, the compaction queue (the list of files being compacted per Store)
>> will be empty most of the time, and a major compaction (including all reference
>> files) will get a chance.
>>
>> If a region has reference files, it is not splittable.
>>
>> I have a very simple patch I am going to submit this weekend.
>>
>> -Vladimir Rodionov
>>
>>
>>
>> On Fri, Dec 5, 2014 at 7:12 PM, <tianq01@gmail.com> wrote:
>>
>> Good points Lars. I thought a bit about how to debug/find more
>> clues... Sorting them first is a good idea (currently we sort by sequenceID,
>> size, etc.)
>> Thanks
>>
>> Sent from my iPad
>>
>> On 2014-12-6, at 9:07, Andrew Purtell <apurtell@apache.org> wrote:
>>
>> >> Seems to me we should sort reference files first _always_, to compact
>> >> them away first and allow them to be split further. Thoughts? File a jira?
>> >
>> > Sounds reasonable as an enhancement issue
>> >
>> >
>> > On Fri, Dec 5, 2014 at 5:02 PM, lars hofhansl <larsh@apache.org> wrote:
>> >
>> >> Digging in the (0.98) code a bit I find this:
>> >>
>> >> HRegionServer.postOpenDeployTasks(): requests a compaction either when
>> >> we're past the minimum number of files or when there is any reference file.
>> >> Good, that will trigger
>> >> RatioBasedCompactionPolicy.selectCompaction(): turns any compaction into a
>> >> major one if there are reference files involved in the set of files already
>> >> selected. Also cool, after a split all files of a daughter will be
>> >> reference files.
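
A minimal sketch of the flow described above, using simplified local types rather
than the actual HRegionServer/RatioBasedCompactionPolicy classes (the method names
inside the sketch are illustrative):

import java.util.List;

public class PostOpenCompactionSketch {

    // Hypothetical stand-in for a store file; "reference" marks a half file left
    // over from a region split.
    static class FileInfo {
        final boolean reference;
        FileInfo(boolean reference) { this.reference = reference; }
    }

    // postOpenDeployTasks()-style check: ask for a compaction when we are past
    // the minimum number of files, or when any reference file exists in the store.
    static boolean shouldRequestCompaction(List<FileInfo> storeFiles, int minFilesToCompact) {
        boolean hasReference = storeFiles.stream().anyMatch(f -> f.reference);
        return storeFiles.size() >= minFilesToCompact || hasReference;
    }

    // selectCompaction()-style check: promote the request to a major compaction
    // if the already-selected set contains any reference file. Note the gap
    // discussed next: nothing here forces a reference file INTO the selected set.
    static boolean isMajor(List<FileInfo> selectedFiles) {
        return selectedFiles.stream().anyMatch(f -> f.reference);
    }
}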
>> >>
>> >> But... I do not see any code where it would make sure at least one
>> >> reference file is selected. So in theory the initial compaction started by
>> >> postOpenDeployTasks could have failed for some reason. Now more data is
>> >> written, and the following compaction selections won't pick up any reference
>> >> files, as there are many small, new files written.
>> >> So the reference files could in theory just linger until a selection just
>> >> happens to come across one, all the while the daughters are (a)
>> >> unsplittable and (b) cannot migrate to another region server.
>> >> That is unless I missed something... Maybe somebody could have a look too?
>> >> Seems to me we should sort reference files first _always_, to compact them
>> >> away first and allow them to be split further. Thoughts? File a jira?
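
A minimal illustration of the ordering being proposed, again with a hypothetical
Candidate type rather than HBase's internal StoreFile: reference (half) files sort
ahead of everything else, then files sort by ascending sequence id as before. For
this to help, the limit discussed earlier in the thread would also have to stop
dropping the head of the list:

import java.util.Comparator;

public class ReferencesFirstComparator implements Comparator<ReferencesFirstComparator.Candidate> {

    // Hypothetical stand-in for a store file entry.
    public static class Candidate {
        final long sequenceId;
        final boolean reference;
        public Candidate(long sequenceId, boolean reference) {
            this.sequenceId = sequenceId;
            this.reference = reference;
        }
    }

    @Override
    public int compare(Candidate a, Candidate b) {
        if (a.reference != b.reference) {
            return a.reference ? -1 : 1;                 // references always first
        }
        return Long.compare(a.sequenceId, b.sequenceId); // then oldest first
    }
}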
>> >>
>> >> -- Lars
>> >>
>> >>      From: lars hofhansl <larsh@apache.org>
>> >> To: "dev@hbase.apache.org" <dev@hbase.apache.org>
>> >> Sent: Friday, December 5, 2014 1:59 PM
>> >> Subject: Re: Region is out of bounds
>> >>
>> >> We've run into something like this as well (probably).
>> >> Will be looking at this as well over the next days/weeks. Under heavy load
>> >> HBase seems to just not be able to get the necessary compactions in, and
>> >> until that happens it cannot further split a region.
>> >>
>> >> I wonder whether HBASE-12411 would help here (it optionally allows
>> >> compactions to use private readers); I doubt it, though.
>> >> The details are probably tricky. I thought HBase would compact split
>> >> regions with higher priority (placing those first in the compaction
>> >> queue)... Need to actually check the code.
>> >>
>> >> -- Lars
>> >>      From: Qiang Tian <tianq01@gmail.com>
>> >>
>> >>
>> >> To: "dev@hbase.apache.org" <dev@hbase.apache.org>
>> >> Sent: Thursday, December 4, 2014 7:26 PM
>> >> Subject: Re: Region is out of bounds
>> >>
>> >> ---- My attempt to add reference files forcefully to the compaction list in
>> >> Store.requestCompaction() when the region exceeds the recommended maximum size
>> >> did not work out well - some weird results in our test cases (but the HBase
>> >> tests are OK: small, medium and large).
>> >>
>> >> Interesting... perhaps they were filtered out in
>> >> RatioBasedCompactionPolicy#selectCompaction?
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On Fri, Dec 5, 2014 at 5:20 AM, Andrew Purtell <apurtell@apache.org>
>> >> wrote:
>> >>
>> >>> Most versions of 0.98 since 0.98.1, but I haven't run a punishing
>> >>> high-scale bulk ingest for its own sake; high-ish rate ingest and a setting
>> >>> of blockingStoreFiles to 200 have been in service of getting data in for
>> >>> subsequent testing.
>> >>>
>> >>>
>> >>> On Thu, Dec 4, 2014 at 12:43 PM, Vladimir Rodionov <vladrodionov@gmail.com>
>> >>> wrote:
>> >>>
>> >>>> Andrew,
>> >>>>
>> >>>> What HBase version have you run your test on?
>> >>>>
>> >>>> This issue probably does not exist anymore in the latest Apache releases,
>> >>>> but it still exists in not-so-latest, but still actively used, versions of
>> >>>> CDH, HDP, etc. We discovered it during large data set loading (100s of GB)
>> >>>> in our cluster (4 nodes).
>> >>>>
>> >>>> -Vladimir
>> >>>>
>> >>>> On Thu, Dec 4, 2014 at 10:23 AM, Andrew Purtell <apurtell@apache.org>
>> >>>> wrote:
>> >>>>
>> >>>>> Actually I have set hbase.hstore.blockingStoreFiles to 200 in testing
>> >>>>> exactly :-), but must not have generated sufficient load to encounter the
>> >>>>> issue you are seeing. Maybe it would be possible to adapt one of the ingest
>> >>>>> integration tests to trigger this problem? Set blockingStoreFiles to 200 or
>> >>>>> more. Tune down the region size to 128K or similar. If
>> >>>>> it's reproducible like that please open a JIRA.
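
A minimal sketch of the reproduction settings suggested above, expressed as
programmatic configuration for an ingest test; the property names are the standard
ones, and the values are just those mentioned in this thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class ReproConfig {
    public static Configuration create() {
        Configuration conf = HBaseConfiguration.create();
        // Let store files pile up far beyond the default before writes block.
        conf.setInt("hbase.hstore.blockingStoreFiles", 200);
        // Tiny max region size so splits (and thus reference files) happen constantly.
        conf.setLong("hbase.hregion.max.filesize", 128 * 1024L); // 128K
        return conf;
    }
}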
>> >>>>>
>> >>>>> On Wed, Dec 3, 2014 at 9:07 AM, Vladimir Rodionov <vladrodionov@gmail.com>
>> >>>>> wrote:
>> >>>>>
>> >>>>>> Kevin,
>> >>>>>>
>> >>>>>> Thank you for your response. This is not a question about how to
>> >>>>>> correctly configure an HBase cluster for write-heavy workloads. This is an
>> >>>>>> internal HBase issue - something is wrong in the default logic of the
>> >>>>>> compaction selection algorithm in 0.94-0.98. It seems that nobody has ever
>> >>>>>> tested importing data with a very high hbase.hstore.blockingStoreFiles
>> >>>>>> value (200 in our case).
>> >>>>>>
>> >>>>>> -Vladimir Rodionov
>> >>>>>>
>> >>>>>> On Wed, Dec 3, 2014 at 6:38 AM, Kevin O'dell <kevin.odell@cloudera.com>
>> >>>>>> wrote:
>> >>>>>>
>> >>>>>>> Vladimir,
>> >>>>>>>
>> >>>>>>> I know you said, "do not ask me why", but I am going to have to ask you
>> >>>>>>> why.  The fact you are doing this (this being blocking store files > 200)
>> >>>>>>> tells me there is something or multiple somethings wrong with your cluster
>> >>>>>>> setup.  A couple of things come to mind:
>> >>>>>>>
>> >>>>>>> * During this heavy write period, could we use bulk loads?  If so, this
>> >>>>>>> should solve almost all of your problems.
>> >>>>>>>
>> >>>>>>> * A 1GB region size is WAY too small, and if you are pushing the volume of
>> >>>>>>> data you are talking about I would recommend 10 - 20GB region sizes; this
>> >>>>>>> should help keep your region count smaller as well, which will result in
>> >>>>>>> more optimal writes.
>> >>>>>>>
>> >>>>>>> * Your cluster may be undersized; if you are setting the blocking to be
>> >>>>>>> that high, you may be pushing too much data for your cluster overall.
>> >>>>>>>
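
A minimal sketch of the region-size recommendation above, expressed as
programmatic configuration; hbase.hregion.max.filesize is the standard property,
and 10GB is simply the low end of the range suggested here:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class RegionSizeConfig {
    public static Configuration create() {
        Configuration conf = HBaseConfiguration.create();
        // Larger regions keep the region count per server down under heavy write load.
        conf.setLong("hbase.hregion.max.filesize", 10L * 1024 * 1024 * 1024); // 10GB
        return conf;
    }
}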
>> >>>>>>> Would you be so kind as to pass me a few pieces of information?
>> >>>>>>>
>> >>>>>>> 1.) Cluster size
>> >>>>>>> 2.) Average region count per RS
>> >>>>>>> 3.) Heap size, Memstore global settings, and block cache settings
>> >>>>>>> 4.) a RS log to pastebin and a time frame of "high writes"
>> >>>>>>>
>> >>>>>>> I can probably make some solid suggestions for you based on the above
>> >>>>>>> data.
>> >>>>>>>
>> >>>>>>> On Wed, Dec 3, 2014 at 1:04 AM, Vladimir Rodionov <vladrodionov@gmail.com>
>> >>>>>>> wrote:
>> >>>>>>>
>> >>>>>>>> This is what we observed in our environment(s)
>> >>>>>>>>
>> >>>>>>>> The issue exists in CDH4.5, 5.1, HDP2.1, Mapr4
>> >>>>>>>>
>> >>>>>>>> If someone sets the # of blocking store files way above the default
>> >>>>>>>> value, say to 200, to avoid write stalls during intensive data loading
>> >>>>>>>> (do not ask me why we do this), then one of the regions grows indefinitely
>> >>>>>>>> and takes up more than 99% of the overall table.
>> >>>>>>>>
>> >>>>>>>> It can't be split because it still has orphaned reference files. Some of
>> >>>>>>>> the reference files are able to avoid compactions for a long time, obviously.
>> >>>>>>>>
>> >>>>>>>> The split policy is IncreasingToUpperBound, and the max region size is 1G.
>> >>>>>>>> I do my tests on CDH4.5 mostly, but all other distros seem to have the same
>> >>>>>>>> issue.
>> >>>>>>>>
>> >>>>>>>> My attempt to add reference files forcefully to the compaction list in
>> >>>>>>>> Store.requestCompaction() when the region exceeds the recommended maximum
>> >>>>>>>> size did not work out well - some weird results in our test cases (but the
>> >>>>>>>> HBase tests are OK: small, medium and large).
>> >>>>>>>>
>> >>>>>>>> What is so special about these reference files? Any ideas on what can be
>> >>>>>>>> done here to fix the issue?
>> >>>>>>>>
>> >>>>>>>> -Vladimir Rodionov
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Kevin O'Dell
>> >>>>>>> Systems Engineer, Cloudera
>> >>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> --
>> >>>>> Best regards,
>> >>>>>
>> >>>>>   - Andy
>> >>>>>
>> >>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> >>>>> (via Tom White)
>> >>>>>
>> >>>>
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Best regards,
>> >>>
>> >>>   - Andy
>> >>>
>> >>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> >>> (via Tom White)
>> >>>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >
>> >
>> >
>> > --
>> > Best regards,
>> >
>> >   - Andy
>> >
>> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> > (via Tom White)
>>
>>
>>
>>
>>
>
