hbase-user mailing list archives

From Timo Schaepe <t...@timoschaepe.de>
Subject Re: Problems with hbase.hregion.max.filesize
Date Fri, 20 Dec 2013 02:48:14 GMT
Hey Enis,

thanks for the hint. I checked the logs and all flushes just before the splitting were successful.
Also, all compactions work fine.

I made another interesting observation: when I disable a table and then enable it again, HBase
starts to split the big regions automatically.
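
For anyone who wants to try the same thing, that cycle is just two hbase shell commands; a minimal sketch, using the data_1091 table name that comes up later in the thread:

  disable 'data_1091'   # closes and offlines every region of the table
  enable 'data_1091'    # reopens the regions; in this case the oversized
                        # regions were then split automatically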

bye,

	Timo


On 19.12.2013 at 18:33, Enis Söztutar <enis@apache.org> wrote:

> If the split takes too long (longer than 30 secs), I would say you may have
> too many store files in the region. Split has to write two tiny files per
> store file. The other thing may be the region has to be closed before
> split. Thus it has to do a flush. If it cannot complete the flush in time,
> it might cancel the split as well. Did you check that? Are your
> compactions working as intended?
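
Both checks can be done from the hbase shell; a sketch (commands only, with the data_1091 table name taken from later in the thread):

  status 'detailed'          # per-region output includes the number of storefiles,
                             # so a region with very many store files stands out

  flush 'data_1091'          # force a flush of the table's memstores
  major_compact 'data_1091'  # compact the store files down before retrying the split
                             # (major compactions can be expensive on a busy cluster)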
> 
> Enis
> 
> 
> On Wed, Dec 18, 2013 at 10:06 AM, Timo Schaepe <timo@timoschaepe.de> wrote:
> 
>> @Ted Yu:
>> Yep, nevertheless thanks a lot!
>> 
>> 
>> On 18.12.2013 at 10:03, Ted Yu <yuzhihong@gmail.com> wrote:
>> 
>>> Timo:
>>> I went through the namenode log and didn't find much of a clue.
>>> 
>>> Cheers
>>> 
>>> 
>>> On Tue, Dec 17, 2013 at 9:37 PM, Timo Schaepe <timo@timoschaepe.de>
>> wrote:
>>> 
>>>> Hey Ted Yu,
>>>> 
>>>> I've been digging through the namenode log and so far I've found nothing
>>>> special. No Exception, FATAL or ERROR message, nor any other peculiarities.
>>>> I only see a lot of messages like this:
>>>> 
>>>> 2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: Removing lease on
>>>> /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7
>>>> from client DFSClient_hb_rs_baur-hbase7.baur.boreus.de,60020,1386712527761_1295065721_26
>>>> 2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile:
>>>> /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7
>>>> is closed by DFSClient_hb_rs_baur-hbase7.baur.boreus.de,60020,1386712527761_1295065721_26
>>>> 
>>>> But maybe that is normal. If you wanna have a look, you can find the log
>>>> snippet at
>>>> 
>>>> https://www.dropbox.com/s/8sls714knn4yqp3/hadoop-hadoop-namenode-baur-hbase1.log.2013-12-12.snip
>>>> 
>>>> Thanks,
>>>> 
>>>>       Timo
>>>> 
>>>> 
>>>> 
>>>> On 14.12.2013 at 09:12, Ted Yu <yuzhihong@gmail.com> wrote:
>>>> 
>>>>> Timo:
>>>>> Other than two occurrences of 'Took too long to split the files'
>>>>> @ 13:54:20,194 and 13:55:10,533, I don't find much of a clue in the posted log.
>>>>> 
>>>>> If you have time, would you mind checking the namenode log for the 1 minute
>>>>> interval leading up to 13:54:20,194 and 13:55:10,533, respectively?
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> 
>>>>> On Sat, Dec 14, 2013 at 5:21 AM, Timo Schaepe <timo@timoschaepe.de>
>>>> wrote:
>>>>> 
>>>>>> Hey,
>>>>>> 
>>>>>> @JM: Thanks for the hint with hbase.regionserver.fileSplitTimeout. At the
>>>>>> moment (the import is currently running), and after I split the specific
>>>>>> regions manually, we do not have growing regions anymore.
>>>>>> 
>>>>>> hbase hbck says everything is fine:
>>>>>> 0 inconsistencies detected.
>>>>>> Status: OK
>>>>>> 
>>>>>> @Ted Yu: Sure, have a look here: http://pastebin.com/2ANFVZEU
>>>>>> The relevant table name is data_1091.
>>>>>> 
>>>>>> Thanks for your time.
>>>>>> 
>>>>>>      Timo
>>>>>> 
>>>>>> On 13.12.2013 at 20:18, Ted Yu <yuzhihong@gmail.com> wrote:
>>>>>> 
>>>>>>> Timo:
>>>>>>> Can you pastebin the regionserver log around 2013-12-12 13:54:20 so that
>>>>>>> we can see what happened?
>>>>>>> 
>>>>>>> Thanks
>>>>>>> 
>>>>>>> 
>>>>>>> On Fri, Dec 13, 2013 at 11:02 AM, Jean-Marc Spaggiari <
>>>>>>> jean-marc@spaggiari.org> wrote:
>>>>>>> 
>>>>>>>> Try to increase hbase.regionserver.fileSplitTimeout, but put it back to
>>>>>>>> its default value afterwards.
>>>>>>>> 
>>>>>>>> The default value is 30 seconds. I think it's not normal for a split to
>>>>>>>> take more than that.
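
For completeness, that timeout is set in hbase-site.xml just like the max filesize property quoted further down; a sketch with an illustrative value of 2 minutes (the concrete number is not from this thread):

  <property>
    <name>hbase.regionserver.fileSplitTimeout</name>
    <!-- illustrative: 120000 ms = 2 minutes; the default is 30000 ms (30 seconds) -->
    <value>120000</value>
  </property>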
>>>>>>>> 
>>>>>>>> What is your hardware configuration?
>>>>>>>> 
>>>>>>>> Have you run hbck to see if everything is correct?
>>>>>>>> 
>>>>>>>> JM
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 2013/12/13 Timo Schaepe <timo@timoschaepe.de>
>>>>>>>> 
>>>>>>>>> Hello again,
>>>>>>>>> 
>>>>>>>>> digging in the logs of the specific regionserver shows me this:
>>>>>>>>> 
>>>>>>>>> 2013-12-12 13:54:20,194 INFO org.apache.hadoop.hbase.regionserver.SplitRequest:
>>>>>>>>> Running rollback/cleanup of failed split of
>>>>>>>>> data,OR\x83\xCF\x02\x82\xAE\xF3U,1386851456415.d04cadb1b2252dafc476c138e9651ca7.;
>>>>>>>>> Took too long to split the files and create the references, aborting split
>>>>>>>>> 
>>>>>>>>> This message appears twice, so it seems that HBase tried to split the
>>>>>>>>> region but failed. I don't know why. How does HBase behave if a region
>>>>>>>>> split fails? Does it try to split the region again? I didn't find any
>>>>>>>>> new attempts in the log. For now I split the big regions manually and
>>>>>>>>> this works. It also seems that HBase splits the new regions again to
>>>>>>>>> crunch them down to the given limit.
>>>>>>>>> 
>>>>>>>>> But it is also a mystery to me why Hannibal shows a split size of
>>>>>>>>> 10 GB while in hbase-site.xml I put 2 GB…
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> 
>>>>>>>>>     Timo
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 13.12.2013 at 10:22, Timo Schaepe <timo@timoschaepe.de> wrote:
>>>>>>>>> 
>>>>>>>>>> Hello,
>>>>>>>>>> 
>>>>>>>>>> while loading data into our cluster I noticed some strange behavior
>>>>>>>>>> of some regions that I don't understand.
>>>>>>>>>> 
>>>>>>>>>> Scenario:
>>>>>>>>>> We convert data from a MySQL database to HBase. The data is inserted
>>>>>>>>>> with a put into the specific HBase table. The row key is a timestamp.
>>>>>>>>>> I know the problem with timestamp keys, but for our requirements it
>>>>>>>>>> works quite well. The problem now is that there are some regions
>>>>>>>>>> which are growing and growing.
>>>>>>>>>> 
>>>>>>>>>> Take for example the table in the picture [1]. At first, all data was
>>>>>>>>>> distributed over regions and nodes. Now the data is written into only
>>>>>>>>>> one region, which keeps growing, and I can see no splitting at all.
>>>>>>>>>> Currently the size of the big region is nearly 60 GB.
>>>>>>>>>> 
>>>>>>>>>> The HBase version is 0.94.11. I cannot understand why the splitting is
>>>>>>>>>> not happening. In hbase-site.xml I limit hbase.hregion.max.filesize to
>>>>>>>>>> 2 GB, and HBase accepted this value.
>>>>>>>>>> 
>>>>>>>>>> <property>
>>>>>>>>>>   <!--Loaded from hbase-site.xml-->
>>>>>>>>>>   <name>hbase.hregion.max.filesize</name>
>>>>>>>>>>   <value>2147483648</value>
>>>>>>>>>> </property>
>>>>>>>>>> 
>>>>>>>>>> First mystery: Hannibal shows me a split size of 10 GB (see screenshot).
>>>>>>>>>> Second mystery: HBase is not splitting some regions, neither at 2 GB
>>>>>>>>>> nor at 10 GB.
>>>>>>>>>> 
>>>>>>>>>> Any ideas? Could the timestamp rowkey be causing this problem?
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> 
>>>>>>>>>>   Timo
>>>>>>>>>> 
>>>>>>>>>> [1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 

