hbase-user mailing list archives

From Timo Schaepe <t...@timoschaepe.de>
Subject Re: Problems with hbase.hregion.max.filesize
Date Sat, 14 Dec 2013 13:21:16 GMT
Hey,

@JM: Thanks for the hint about hbase.regionserver.fileSplitTimeout. At the moment (the import
is currently running), and after I split the affected regions manually, we no longer have growing
regions.
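
For reference, a minimal sketch of what such a temporary override could look like in
hbase-site.xml. The value 120000 is only an illustrative assumption, not the value used on
this cluster; the property is assumed to take milliseconds, in line with the 30-second
default JM mentions in the quoted mail below.

```xml
<property>
     <!-- Illustrative value only (assumed milliseconds); default is 30 seconds -->
     <name>hbase.regionserver.fileSplitTimeout</name>
     <value>120000</value>
</property>
```

As JM suggests, the setting should go back to its default once the oversized regions are split.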

hbase hbck says everything is fine:
0 inconsistencies detected.
Status: OK

@Ted Yu: Sure, have a look here: http://pastebin.com/2ANFVZEU
The relevant table name is data_1091.

Thanks for your time.

	Timo

On 13.12.2013 at 20:18, Ted Yu <yuzhihong@gmail.com> wrote:

> Timo:
> Can you pastebin regionserver log around 2013-12-12 13:54:20 so that we can
> see what happened ?
> 
> Thanks
> 
> 
> On Fri, Dec 13, 2013 at 11:02 AM, Jean-Marc Spaggiari <
> jean-marc@spaggiari.org> wrote:
> 
>> Try to increase hbase.regionserver.fileSplitTimeout but put it back to its
>> default value after.
>> 
>> Default value is 30 seconds. I think it's not normal for a split to take
>> more than that.
>> 
>> What is your hardware configuration?
>> 
>> Have you run hbck to see if everything is correct?
>> 
>> JM
>> 
>> 
>> 2013/12/13 Timo Schaepe <timo@timoschaepe.de>
>> 
>>> Hello again,
>>> 
>>> digging in the logs of the specific regionserver shows me this:
>>> 
>>> 2013-12-12 13:54:20,194 INFO
>>> org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup
>>> of failed split of
>>> data,OR\x83\xCF\x02\x82\xAE\xF3U,1386851456415.d04cadb1b2252dafc476c138e9651ca7.;
>>> Took too long to split the files and create the references, aborting split
>>> 
>>> This message appears twice, so it seems that HBase tried to split the
>>> region but failed. I don't know why. How does HBase behave if a
>>> region split fails? Does it try to split the region again? I
>>> didn't find any further attempts in the log. I have now split the big
>>> regions manually, and this works. It also seems that HBase is splitting
>>> the new regions again to bring them down to the configured limit.
>>> 
>>> It is also still a mystery to me why the split size in Hannibal shows
>>> 10 GB when I set 2 GB in hbase-site.xml…
>>> 
>>> Thanks,
>>> 
>>>        Timo
>>> 
>>> 
>>> On 13.12.2013 at 10:22, Timo Schaepe <timo@timoschaepe.de> wrote:
>>> 
>>>> Hello,
>>>> 
>>>> while loading data into our cluster I noticed some strange
>>>> behavior in some regions that I don't understand.
>>>> 
>>>> Scenario:
>>>> We convert data from a MySQL database to HBase. The data is inserted
>>>> with a put to the specific HBase table. The row key is a timestamp. I
>>>> know the problem with timestamp keys, but for our requirements it works
>>>> quite well. The problem now is that some regions keep growing
>>>> and growing.
>>>> 
>>>> For example, the table in the picture [1]. At first, all data was
>>>> distributed over regions and nodes. Now the data is written into only
>>>> one region, which keeps growing, and I can see no splitting at all.
>>>> Currently the size of the big region is nearly 60 GB.
>>>> 
>>>> HBase version is 0.94.11. I cannot understand why the splitting is not
>>>> happening. In hbase-site.xml I limit hbase.hregion.max.filesize to 2 GB,
>>>> and HBase accepted this value.
>>>> 
>>>> <property>
>>>>      <!--Loaded from hbase-site.xml-->
>>>>      <name>hbase.hregion.max.filesize</name>
>>>>      <value>2147483648</value>
>>>> </property>
>>>> 
>>>> First mystery: Hannibal shows me the split size is 10 GB (see
>>>> screenshot).
>>>> Second mystery: HBase is not splitting some regions at either 2 GB or
>>>> 10 GB.
>>>> 
>>>> Any ideas? Could the timestamp row key be causing this problem?
>>>> 
>>>> Thanks,
>>>> 
>>>>      Timo
>>>> 
>>>> [1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png
>>> 
>>> 
>> 

