hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: [Help] minor compact is continuously consuming the disk space until run out of space?
Date Sat, 26 Aug 2017 14:17:27 GMT
Even if you disable minor compaction during bulk load, wouldn't subsequent
compactions run into the same problem?

Please take a look at the 3rd paragraph under
http://hbase.apache.org/book.html#compaction .

You can also read
http://hbase.apache.org/book.html#compaction.file.selection.old to see how
different parameters are used for file selection.
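
To make the file selection concrete, here is a rough Python sketch of the
older ratio-based selection that section describes. It is a simplification
and not HBase's actual code: the function name and arguments are
hypothetical stand-ins for hbase.hstore.compaction.ratio, .min, .max and
.min.size, and it ignores hbase.hstore.compaction.max.size. Newer releases
may use a different selection policy by default.

```python
def select_for_minor_compaction(sizes, ratio=1.2, min_files=3,
                                max_files=10, min_size=0):
    """Pick store files for a minor compaction (simplified sketch).

    `sizes` lists store file sizes ordered oldest to newest.
    """
    start = 0
    while start < len(sizes):
        # Newer files that could be compacted together with this one.
        window = sizes[start + 1:start + max_files]
        threshold = max(min_size, sum(window) * ratio)
        if sizes[start] > threshold:
            start += 1  # too large relative to the newer files, skip it
        else:
            break
    selected = sizes[start:start + max_files]
    # Too few candidates: no compaction is selected.
    return selected if len(selected) >= min_files else []


# Basic example from the book: ratio 1.0, min 3, max 5, min.size 10.
print(select_for_minor_compaction([100, 50, 23, 12, 12], ratio=1.0,
                                  min_files=3, max_files=5, min_size=10))
```

With those parameters the 100 and 50 byte files are skipped because each is
larger than the sum of the newer files, so only the three newest are picked.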

By "controlling the number of hfiles" I mean reducing the amount of data for
each bulk load.
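
One way to reduce the data per bulk load is to split the input into
size-capped batches and load them one at a time. The sketch below only plans
the batches; it does not call importtsv or any HBase API. The function name,
the file names and the size cap are all hypothetical.

```python
def plan_batches(files, max_batch_bytes):
    """Group (name, size) pairs into batches of at most max_batch_bytes.

    A single file larger than the cap gets a batch of its own.
    """
    batches, current, current_bytes = [], [], 0
    for name, size in files:
        # Close the current batch if adding this file would exceed the cap.
        if current and current_bytes + size > max_batch_bytes:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        batches.append(current)
    return batches


# Hypothetical input files with sizes in arbitrary units.
inputs = [("a", 60), ("b", 50), ("c", 30), ("d", 90)]
for batch in plan_batches(inputs, max_batch_bytes=100):
    print(batch)
```

Each batch would then be loaded separately, giving compaction a chance to
catch up between loads.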

If the regions for this table are not evenly distributed, some region
server(s) may receive more data than the other servers.
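
A quick way to check for that kind of imbalance is to total the region sizes
per server and compare the largest total with the average. This is a minimal
sketch over made-up numbers; the (server, size) pairs are hypothetical and
would have to come from your own cluster metrics.

```python
from collections import defaultdict


def bytes_per_server(regions):
    """Sum region sizes per region server from (server, size) pairs."""
    totals = defaultdict(int)
    for server, size in regions:
        totals[server] += size
    return dict(totals)


def skew_ratio(totals):
    """Largest per-server total divided by the mean; 1.0 means even."""
    mean = sum(totals.values()) / len(totals)
    return max(totals.values()) / mean


# Hypothetical region placement: rs1 holds most of the data.
regions = [("rs1", 40), ("rs1", 40), ("rs2", 20), ("rs3", 20)]
totals = bytes_per_server(regions)
print(totals, skew_ratio(totals))
```

A ratio well above 1.0 suggests that one server will run out of disk long
before the others.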

Cheers

On Sat, Aug 26, 2017 at 7:03 AM, Liu, Ming (Ming) <ming.liu@esgyn.cn> wrote:

> Thanks Ted,
>
> I don't know how to control the number of hfiles; I need to check the
> importtsv tool. But is there any way we can disable minor compaction now?
> And why would minor compaction increase disk usage? The system is idle,
> with no other workload. Right after the data load, HBase starts minor
> compaction, and we see free disk space shrink steadily until it runs
> out.
> We thought minor compaction just combines files: say, if we have 10 hfiles
> using 100G of disk space, after minor compaction they should still take
> 100G, if not less. It is called compaction, isn't it? So we don't
> understand why it uses so much extra disk space. Is anything wrong in our
> system?
>
> thanks,
> Ming
>
> -----Original Message-----
> From: Ted Yu [mailto:yuzhihong@gmail.com]
> Sent: Saturday, August 26, 2017 9:54 PM
> To: user@hbase.apache.org
> Subject: Re: [Help] minor compact is continuously consuming the disk space
> until run out of space?
>
> bq. on each Region Server there are about 800 hfiles
>
> Is it possible to control the number of hfiles during each bulk load?
>
> For this big table, are the regions evenly spread across the servers? If
> so, consider increasing the capacity of your cluster.
>
> From the doc for hbase.hstore.compactionThreshold:
>
> Larger values delay compaction, but when compaction does occur, it takes
> longer to complete.
>
>
> On Sat, Aug 26, 2017 at 6:48 AM, Liu, Ming (Ming) <ming.liu@esgyn.cn>
> wrote:
>
> > hi, all,
> >
> > We have a system with 17 nodes and a big table about 28T in size. We use
> > the native HBase bulk loader (importtsv) to load data, and it generated a
> > lot of hfiles: on each Region Server there are about 800 hfiles. We turned
> > off major compaction, but minor compaction is running because there are so
> > many hfiles.
> > The problem is that after the initial load about 80% of disk space is
> > used, and while minor compaction is running we see free disk space
> > shrinking rapidly until all disks are full and HBase goes down.
> >
> > We tried changing hbase.hstore.compactionThreshold to 2000, but minor
> > compaction is still triggered.
> >
> > The system is CDH 5.7, HBase is 1.2.
> >
> > Could anyone give us some suggestions? We are really stuck. Thanks
> > in advance.
> >
> > Thanks,
> > Ming
> >
>