hbase-dev mailing list archives

From yuzhih...@gmail.com
Subject Re: Major Compaction Concerns
Date Sun, 08 Jan 2012 17:54:02 GMT
About 2b, have you tried getting the major compaction setting from the column descriptor?

For 2a, what you requested would require adding new methods to the HBaseConfiguration class.
Currently the configuration on the client classpath is used.
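For readers following the thread: the lookup order discussed under 2b (a value set on the column family overrides the site configuration, which overrides the hard-coded 24h default) can be sketched with plain maps standing in for Configuration and HColumnDescriptor. This is a simplified model for illustration, not the actual HBase code:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the lookup order in Store.getNextMajorCompactTime():
// a value set on the column family overrides the site configuration,
// which in turn overrides the hard-coded 24h default. Plain maps stand in
// for Configuration and HColumnDescriptor; this is a sketch only.
public class MajorCompactionLookup {
    static final String KEY = "hbase.hregion.majorcompaction";
    static final long DEFAULT_PERIOD = 1000L * 60 * 60 * 24; // 24 hours

    static long majorCompactionPeriod(Map<String, String> siteConfig,
                                      Map<String, String> familyValues) {
        long period = DEFAULT_PERIOD;
        if (siteConfig.containsKey(KEY)) {
            period = Long.parseLong(siteConfig.get(KEY));
        }
        // The per-family value, when present, wins over the site config.
        if (familyValues.containsKey(KEY)) {
            period = Long.parseLong(familyValues.get(KEY));
        }
        return period;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<String, String>();
        Map<String, String> family = new HashMap<String, String>();
        System.out.println(majorCompactionPeriod(conf, family)); // 86400000 (24h default)
        conf.put(KEY, "0");
        System.out.println(majorCompactionPeriod(conf, family)); // 0 (disabled via config)
        family.put(KEY, "3600000");
        System.out.println(majorCompactionPeriod(conf, family)); // 3600000 (family wins)
    }
}
```

This also explains Mikael's observation later in the thread: a value set only in the server-side config never shows up via HTableDescriptor.getValue(), because the descriptor holds only per-table/per-family overrides.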


On Jan 8, 2012, at 9:28 AM, Mikael Sitruk <mikael.sitruk@gmail.com> wrote:

> Ted hi
> First, thanks for answering; regarding the JIRAs, I will file them.
> Second, it seems that I did not explain myself correctly regarding 2.a. -
> Like you, I do not expect that a configuration set on my client will be
> propagated to the cluster, but I do expect that if I set a configuration on
> a server, then calling connection.getConfiguration() from a client will
> return the configuration from the cluster.
> Currently the configuration returned is the client config.
> So the problem is that you have no way to check the configuration of a
> cluster.
> I would expect some API to return the cluster config, or even a
> Map<serverInfo, config>, to make it easy to diagnose cluster problems from
> code.
> 2.b. I know this code, and I tried to validate it. I set
> "hbase.hregion.majorcompaction" to "0" in the server config, then started
> the cluster. Since this parameter is not visible at the cluster level from
> the UI or from JMX, I tried to get the value from the client (to see that
> the cluster is using it):
> *HTableDescriptor hTableDescriptor =
> conn.getHTableDescriptor(Bytes.toBytes("my table"));*
> *hTableDescriptor.getValue("hbase.hregion.majorcompaction")*
> but I still got 24h (and not the value set in the config)! That was my
> problem from the beginning! ==> Setting the config (on the server side) does
> not propagate into the table/column family.
> Mikael.S
> On Sun, Jan 8, 2012 at 7:13 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>> I am not an expert on the major compaction feature.
>> Let me try to answer the questions in #2.
>> 2.a
>>> If I set the property via the configuration shouldn’t all the cluster be
>>> aware of?
>> There are multiple clients connecting to one cluster. I wouldn't expect
>> values in the configuration (m_hbConfig) to propagate to the cluster.
>> 2.b
>> Store.getNextMajorCompactTime() shows that "hbase.hregion.majorcompaction"
>> can be specified per column family:
>> long getNextMajorCompactTime() {
>>   // default = 24hrs
>>   long ret = conf.getLong(HConstants.MAJOR_COMPACTION_PERIOD, 1000*60*60*24);
>>   if (family.getValue(HConstants.MAJOR_COMPACTION_PERIOD) != null) {
>> 2.d
>>> d. I also tried to set the parameter via the hbase shell, but setting such
>>> properties is not supported. (do you plan to add such support via the
>>> shell?)
>> This is a good idea. Please open a JIRA.
>> For #5, HBASE-3965 is an improvement and doesn't have a patch yet.
>> Allow me to quote Alan Kay: 'The best way to predict the future is to
>> invent it.'
>> Once we have a patch, we can always backport it to 0.92 after some people
>> have verified the improvement.
>>> 6. In case a (major) compaction is running, it seems there is no way to
>>> stop it. Do you plan to add such a feature?
>> Again, logging a JIRA would provide a good starting point for discussion.
>> Thanks for the verification work and suggestions, Mikael.
>> On Sun, Jan 8, 2012 at 7:27 AM, Mikael Sitruk <mikael.sitruk@gmail.com> wrote:
>>> I forgot to mention, I'm using HBase 0.90.1
>>> Regards,
>>> Mikael.S
>>> On Sun, Jan 8, 2012 at 5:25 PM, Mikael Sitruk <mikael.sitruk@gmail.com> wrote:
>>>> Hi
>>>> I have some concerns regarding major compactions, below...
>>>>   1. According to best practices from the mailing list and from the
>>>>   book, automatic major compaction should be disabled. This can be done
>>>>   by setting the property ‘hbase.hregion.majorcompaction’ to ‘0’.
>>>>   Nevertheless, even after doing this I STILL see “major compaction”
>>>>   messages in the logs, so it is unclear how I can manage major
>>>>   compactions. (The system has heavy inserts, uniform across the
>>>>   cluster, and major compaction affects the performance of the system.)
>>>>   If I'm not wrong, it seems from the code that even if not requested,
>>>>   and even if the indicator is set to '0' (no automatic major
>>>>   compaction), a major compaction can be triggered in case all store
>>>>   files are candidates for compaction (from Store.compact(final boolean
>>>>   forceMajor)). Shouldn't the code add a condition that automatic major
>>>>   compaction is disabled?
>>>>   2. I tried to check the parameter ‘hbase.hregion.majorcompaction’ at
>>>>   runtime using several approaches, to validate that the server indeed
>>>>   loaded the parameter.
>>>> a. Using a connection created from the local config:
>>>> *conn = (HConnection) HConnectionManager.getConnection(m_hbConfig);*
>>>> *conn.getConfiguration().get("hbase.hregion.majorcompaction")*
>>>> This returns the parameter from the local config and not from the cluster.
>>>> Is it a bug? If I set the property via the configuration, shouldn’t all
>>>> the cluster be aware of it? (supposing that the connection is indeed
>>>> connected to the cluster)
>>>> b. Fetching the property from the table descriptor:
>>>> *HTableDescriptor hTableDescriptor =
>>>> conn.getHTableDescriptor(Bytes.toBytes("my table"));*
>>>> *hTableDescriptor.getValue("hbase.hregion.majorcompaction")*
>>>> This returns the default parameter value (1 day), not the parameter from
>>>> the configuration (on the cluster). It seems to be a bug, doesn’t it?
>>>> (the parameter from the config should be the default if not set at the
>>>> table level)
>>>> c. The only way I could set the parameter to 0 and really see it is via
>>>> the Admin API, updating the table descriptor or the column descriptor.
>>>> Now I can see the parameter in the web UI. So is this the only way to
>>>> set the parameter correctly? If the parameter is set via the
>>>> configuration file, shouldn’t the web UI show it for any table created?
>>>> d. I also tried to set the parameter via the hbase shell, but setting
>>>> such properties is not supported. (do you plan to add such support via
>>>> the shell?)
>>>> e. Generally, is it possible to get the configuration used by the
>>>> servers via an API? (at cluster/server level)
>>>> 3. I ran major compaction requests both from the shell and from the API,
>>>> but since both are async there is no progress indication. Neither JMX
>>>> nor the web UI helps here, since you don’t know whether a compaction
>>>> task is running. Tailing the logs is not an efficient way to do this
>>>> either. The point is that I would like to automate the process and avoid
>>>> a compaction storm. So I want to do this region by region, but if I
>>>> don’t know when a compaction started/ended I can’t automate it.
>>>> 4. In case there are no compaction files in the queue (but you still
>>>> have more than 1 storefile per store, e.g. a minor compaction just
>>>> finished), then invoking major_compact will indeed decrease the number
>>>> of store files, but the compaction queue will remain at 0 during the
>>>> compaction task (shouldn’t the compaction queue increase by the number
>>>> of files to compact and be reduced when the task ends?)
>>>> 5. I already saw HBASE-3965 for getting the status of a major
>>>> compaction; nevertheless it has been removed from 0.92. Is it possible
>>>> to put it back, or even deliver it sooner than 0.92?
>>>> 6. In case a (major) compaction is running, it seems there is no way to
>>>> stop it. Do you plan to add such a feature?
>>>> 7. Do you plan to add functionality via JMX (starting/stopping
>>>> compaction, splitting, ...)?
>>>> 8. Finally, there were some requests for allowing custom compaction;
>>>> part of this was provided via the RegionObserver in HBASE-2001.
>>>> Nevertheless, do you consider adding support for custom compaction
>>>> (providing a real pluggable compaction strategy, not just an observer)?
>>>> Regards,
>>>> Mikael.S
>>> --
>>> Mikael.S
> -- 
> Mikael.S
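For completeness, the Admin-API route described in 2.c above (the only approach that actually took effect and showed up in the web UI) looks roughly like the sketch below on the 0.90-era API. The table name "my_table" and family "cf" are placeholders; this requires a running cluster with hbase-site.xml on the classpath, and method signatures should be checked against your version's HBaseAdmin javadoc:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class DisableMajorCompactionOnFamily {
    public static void main(String[] args) throws Exception {
        // Picks up hbase-site.xml from the classpath.
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        byte[] table = Bytes.toBytes("my_table"); // placeholder table name
        byte[] family = Bytes.toBytes("cf");      // placeholder family name

        // The table must be disabled before its schema can be modified.
        admin.disableTable(table);
        HTableDescriptor desc = admin.getTableDescriptor(table);
        HColumnDescriptor col = desc.getFamily(family);
        col.setValue("hbase.hregion.majorcompaction", "0");
        admin.modifyColumn(table, family, col); // 0.90-era three-argument form
        admin.enableTable(table);

        // The value should now appear in the web UI and be readable back via
        // admin.getTableDescriptor(table).getFamily(family)
        //      .getValue("hbase.hregion.majorcompaction")
    }
}
```

Unlike a server-side hbase-site.xml edit, this writes the override into the table schema itself, which is why it becomes visible through HTableDescriptor.getValue() and the web UI.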
