hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mikael Sitruk <mikael.sit...@gmail.com>
Subject Major Compaction Concerns
Date Sun, 08 Jan 2012 15:25:02 GMT

I have some concern regarding major compactions below...

   1. According to best practices from the mailing list and from the book,
   automatic major compaction should be disabled. This can be done by setting
   the property ‘hbase.hregion.majorcompaction’ to ‘0’. Neverhteless even
   after having doing this I STILL see “major compaction” messages in logs.
   therefore it is unclear how can I manage major compactions. (The system has
   heavy insert - uniformly on the cluster, and major compaction affect the
   performance of the system).
   If I'm not wrong it seems from the code that: even if not requested and
   even if the indicator is set to '0' (no automatic major compaction), major
   compaction can be triggered by the code in case all store files are
   candidate for a compaction (from Store.compact(final boolean forceMajor)).
   Shouldn't the code add a condition that automatic major compaction is

   2. I tried to check the parameter  ‘hbase.hregion.majorcompaction’  at
   runtime using several approaches - to validate that the server indeed
   loaded the parameter.

a. Using a connection created from local config

*conn = (HConnection) HConnectionManager.getConnection(m_hbConfig);*


returns the parameter from local config and not from cluster. Is it a bug?
If I set the property via the configuration shouldn’t all the cluster be
aware of? (supposing that the connection indeed connected to the cluster)

b.  fetching the property from the table descriptor

*HTableDescriptor hTableDescriptor =
conn.getHTableDescriptor(Bytes.toBytes("my table"));*


This will returns the default parameter value (1 day) not the parameter
from the configuration (on the cluster). It seems to be a bug, isn’t it?
(the parameter from the config, should be the default if not set at the
table level)

c. The only way I could set the parameter to 0 and really see it is via the
Admin API, updating the table descriptor or the column descriptor. Now I
could see the parameter on the web UI. So is it the only way to set
correctly the parameter? If setting the parameter via the configuration
file, shouldn’t the webUI show this on any table created?

d. I tried also to setup the parameter via hbase shell but setting such
properties is not supported. (do you plan to add such support via the

e. Generally is it possible to get via API the configuration used by the
servers? (at cluster/server level)

    3.  I ran both major compaction  requests from the shell or from API
but since both are async there is no progress indication. Neither the JMX
nor the Web will help here since you don’t know if a compaction task is
running. Tailling the logs is not an efficient way to do this neither. The
point is that I would like to automate the process and avoid compaction
storm. So I want to do that region, region, but if I don’t know when a
compaction started/ended I can’t automate it.

4.       In case there is no compaction files in queue (but still you have
more than 1 storefile per store e.g. minor compaction just finished) then
invoking major_compact will indeed decrease the number of store files, but
the compaction queue will remain to 0 during the compaction task (shouldn’t
the compaction queue increase by the number of file to compact and be
reduced when the task ended?)

5.       I saw already HBASE-3965 for getting status of major compaction,
nevertheless it has be removed from 0.92, is it possible to put it back?
Even sooner than 0.92?

6.       In case a compaction (major) is running it seems there is no way
to stop-it. Do you plan to add such feature?

7.       Do you plan to add functionality via JMX (starting/stopping
compaction, splitting....)

8.       Finally there were some request for allowing custom compaction,
part of this was given via the RegionObserver in HBASE-2001, nevertheless
do you consider adding support for custom compaction (providing real
pluggable compaction stategy not just observer)?


  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message