hbase-user mailing list archives

From Enis Söztutar <e...@apache.org>
Subject Re: Problems with hbase.hregion.max.filesize
Date Fri, 20 Dec 2013 02:33:40 GMT
If the split takes too long (longer than 30 secs), I would say you may have
too many store files in the region. Split has to write two tiny files per
store file. The other thing may be the region has to be closed before
split. Thus it has to do a flush. If it cannot complete the flush in time,
it might cancel the split as well. Did you check that? Are your
compactions working as intended?
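
If more headroom is needed while investigating, a minimal sketch of the
override mentioned further down this thread could go into hbase-site.xml as
shown here. The 120000 value is only an illustrative assumption, not a
recommendation. As far as I know the setting is in milliseconds, so the
30-second default corresponds to 30000.

```xml
<property>
  <!-- Assumed example value (2 minutes); revert to the default once splits succeed. -->
  <name>hbase.regionserver.fileSplitTimeout</name>
  <value>120000</value>
</property>
```

Raising the timeout only buys time. If the region has too many store files or
the flush is slow, those causes still need to be checked.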

Enis


On Wed, Dec 18, 2013 at 10:06 AM, Timo Schaepe <timo@timoschaepe.de> wrote:

> @Ted Yu:
> Yep, nevertheless thanks a lot!
>
>
> On 18.12.2013 at 10:03, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > Timo:
> > I went through the namenode log and didn't find many clues.
> >
> > Cheers
> >
> >
> > On Tue, Dec 17, 2013 at 9:37 PM, Timo Schaepe <timo@timoschaepe.de> wrote:
> >
> >> Hey Ted Yu,
> >>
> >> I have been digging through the namenode log and so far have found nothing
> >> special: no Exception, FATAL or ERROR messages, nor any other
> >> peculiarities. I only see a lot of messages like this:
> >>
> >> 2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: Removing lease on /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7 from client DFSClient_hb_rs_baur-hbase7.baur.boreus.de,60020,1386712527761_1295065721_26
> >> 2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7 is closed by DFSClient_hb_rs_baur-hbase7.baur.boreus.de,60020,1386712527761_1295065721_26
> >>
> >> But maybe that is normal. If you want to have a look, you can find the log
> >> snippet at
> >> https://www.dropbox.com/s/8sls714knn4yqp3/hadoop-hadoop-namenode-baur-hbase1.log.2013-12-12.snip
> >>
> >> Thanks,
> >>
> >>        Timo
> >>
> >>
> >>
> >> On 14.12.2013 at 09:12, Ted Yu <yuzhihong@gmail.com> wrote:
> >>
> >>> Timo:
> >>> Other than two occurrences of 'Took too long to split the files'
> >>> at 13:54:20,194 and 13:55:10,533, I don't find many clues in the posted
> >>> log.
> >>>
> >>> If you have time, would you mind checking the namenode log for the 1-minute
> >>> interval leading up to 13:54:20,194 and 13:55:10,533, respectively?
> >>>
> >>> Thanks
> >>>
> >>>
> >>> On Sat, Dec 14, 2013 at 5:21 AM, Timo Schaepe <timo@timoschaepe.de> wrote:
> >>>
> >>>> Hey,
> >>>>
> >>>> @JM: Thanks for the hint about hbase.regionserver.fileSplitTimeout. At the
> >>>> moment (the import is currently running), after I split the affected
> >>>> regions manually, we no longer have growing regions.
> >>>>
> >>>> hbase hbck reports that everything is fine:
> >>>> 0 inconsistencies detected.
> >>>> Status: OK
> >>>>
> >>>> @Ted Yu: Sure, have a look here: http://pastebin.com/2ANFVZEU
> >>>> The relevant table name is data_1091.
> >>>>
> >>>> Thanks for your time.
> >>>>
> >>>>       Timo
> >>>>
> >>>> On 13.12.2013 at 20:18, Ted Yu <yuzhihong@gmail.com> wrote:
> >>>>
> >>>>> Timo:
> >>>>> Can you pastebin the regionserver log around 2013-12-12 13:54:20 so that we
> >>>>> can see what happened?
> >>>>>
> >>>>> Thanks
> >>>>>
> >>>>>
> >>>>> On Fri, Dec 13, 2013 at 11:02 AM, Jean-Marc Spaggiari <
> >>>>> jean-marc@spaggiari.org> wrote:
> >>>>>
> >>>>>> Try increasing hbase.regionserver.fileSplitTimeout, but put it back to its
> >>>>>> default value afterwards.
> >>>>>>
> >>>>>> The default value is 30 seconds. I think it's not normal for a split to take
> >>>>>> more than that.
> >>>>>>
> >>>>>> What is your hardware configuration?
> >>>>>>
> >>>>>> Have you run hbck to see if everything is correct?
> >>>>>>
> >>>>>> JM
> >>>>>>
> >>>>>>
> >>>>>> 2013/12/13 Timo Schaepe <timo@timoschaepe.de>
> >>>>>>
> >>>>>>> Hello again,
> >>>>>>>
> >>>>>>> Digging in the logs of the affected regionserver shows me this:
> >>>>>>>
> >>>>>>> 2013-12-12 13:54:20,194 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup of failed split of data,OR\x83\xCF\x02\x82\xAE\xF3U,1386851456415.d04cadb1b2252dafc476c138e9651ca7.; Took too long to split the files and create the references, aborting split
> >>>>>>>
> >>>>>>> This message appears twice, so it seems that HBase tried to split the
> >>>>>>> region but failed. I don't know why. How does HBase behave if a region
> >>>>>>> split fails? Does it retry splitting the region? I didn't find any
> >>>>>>> further attempts in the log. I have now split the big regions manually,
> >>>>>>> and this works. It also seems that HBase splits the new regions again
> >>>>>>> to shrink them down to the configured limit.
> >>>>>>>
> >>>>>>> But it is also a mystery to me why the split size in Hannibal shows
> >>>>>>> 10 GB when I set 2 GB in hbase-site.xml…
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>>      Timo
> >>>>>>>
> >>>>>>>
> >>>>>>> On 13.12.2013 at 10:22, Timo Schaepe <timo@timoschaepe.de> wrote:
> >>>>>>>
> >>>>>>>> Hello,
> >>>>>>>>
> >>>>>>>> While loading data into our cluster I noticed some strange behavior in
> >>>>>>>> some regions that I don't understand.
> >>>>>>>>
> >>>>>>>> Scenario:
> >>>>>>>> We are converting data from a MySQL database to HBase. The data is inserted
> >>>>>>>> with a put into the specific HBase table. The row key is a timestamp. I know
> >>>>>>>> the problem with timestamp keys, but for our requirements it works quite
> >>>>>>>> well. The problem now is that some regions keep growing and growing.
> >>>>>>>>
> >>>>>>>> Take, for example, the table in the picture [1]. At first, all data was
> >>>>>>>> distributed over regions and nodes. Now the data is written into only
> >>>>>>>> one region, which keeps growing, and I can see no splitting at all.
> >>>>>>>> Currently the size of the big region is nearly 60 GB.
> >>>>>>>>
> >>>>>>>> The HBase version is 0.94.11. I cannot understand why the splitting is not
> >>>>>>>> happening. In hbase-site.xml I limit hbase.hregion.max.filesize to 2 GB,
> >>>>>>>> and HBase accepted this value.
> >>>>>>>>
> >>>>>>>> <property>
> >>>>>>>>    <!--Loaded from hbase-site.xml-->
> >>>>>>>>    <name>hbase.hregion.max.filesize</name>
> >>>>>>>>    <value>2147483648</value>
> >>>>>>>> </property>
> >>>>>>>>
> >>>>>>>> First mystery: Hannibal shows me that the split size is 10 GB (see
> >>>>>>>> screenshot).
> >>>>>>>> Second mystery: HBase is not splitting some regions, neither at 2 GB nor
> >>>>>>>> at 10 GB.
> >>>>>>>>
> >>>>>>>> Any ideas? Could the timestamp row key be causing this problem?
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>>    Timo
> >>>>>>>>
> >>>>>>>> [1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >>
>
>
