hbase-dev mailing list archives

From Mikael Sitruk <mikael.sit...@gmail.com>
Subject Re: Major Compaction Concerns
Date Sun, 08 Jan 2012 17:28:48 GMT
Hi Ted,
First, thanks for answering. Regarding the JIRAs, I will file them.
Second, it seems that I did not explain myself correctly regarding 2.a.
Like you, I do not expect a configuration set on my client to be
propagated to the cluster, but I do expect that if I set a configuration on
a server, then calling connection.getConfiguration() from a client will
return the configuration from the cluster.
Currently the configuration returned is the client's own config,
so the problem is that there is no way to check the configuration of a
cluster.
I would expect some API that returns the cluster config, or even
a map <serverInfo, config>, so that cluster problems are easy to check
from code.
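
To illustrate the kind of check I have in mind, here is a minimal sketch. It is
not an existing HBase API: the class ClusterConfigCheck, the method
findMismatches and the per-server map are all hypothetical, standing in for
whatever a "cluster config" call would return.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: the input map stands in for a <serverInfo, config>
// result that a cluster-config API could return. No such API is assumed to exist.
class ClusterConfigCheck {

    /**
     * Returns the servers whose value for the given key differs from the
     * expected one. A server that does not define the key is reported
     * with a null value.
     */
    static Map<String, String> findMismatches(
            Map<String, Map<String, String>> configByServer,
            String key, String expected) {
        Map<String, String> mismatches = new TreeMap<>();
        for (Map.Entry<String, Map<String, String>> e : configByServer.entrySet()) {
            String actual = e.getValue().get(key);
            if (!expected.equals(actual)) {
                mismatches.put(e.getKey(), actual);
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // Made-up server names and values, for illustration only.
        Map<String, String> rs1 = new HashMap<>();
        rs1.put("hbase.hregion.majorcompaction", "0");
        Map<String, String> rs2 = new HashMap<>();
        rs2.put("hbase.hregion.majorcompaction", "86400000");

        Map<String, Map<String, String>> configByServer = new HashMap<>();
        configByServer.put("rs1.example.com", rs1);
        configByServer.put("rs2.example.com", rs2);

        // Lists every server that did not pick up the expected value "0".
        System.out.println(findMismatches(
                configByServer, "hbase.hregion.majorcompaction", "0"));
    }
}
```

With something like this, a single call would show which region servers did not
load the value I put in the config file.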

2.b. I know this code, and I tried to validate it. I set
"hbase.hregion.majorcompaction" to "0" in the server config, then started
the server (cluster). Since this parameter is not visible at the cluster
level from the UI or from JMX, I tried to read the value from the client
(to check that the cluster is using it):

HTableDescriptor hTableDescriptor =
    conn.getHTableDescriptor(Bytes.toBytes("my table"));

hTableDescriptor.getValue("hbase.hregion.majorcompaction")
but I still got 24h (and not the value set in the config)! That was my
problem from the beginning: a value set in the config (on the server side)
does not propagate into the table/column family.
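
For reference, here is how I read the lookup order in the snippet you quoted,
written as a standalone sketch with plain maps instead of the HBase classes.
The quoted code is cut off after the "if", so the assumption that a value set
on the column family overrides the server config is mine; the class and method
names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Standalone model of the lookup order, not the HBase source.
class MajorCompactionPeriod {
    static final String KEY = "hbase.hregion.majorcompaction";
    static final long DEFAULT_MS = 1000L * 60 * 60 * 24; // 24 hours

    /**
     * Resolves the period in milliseconds: start from the server config
     * (or the 24h default when the key is absent), then let a value set
     * on the column family take precedence (assumed behaviour).
     */
    static long resolve(Map<String, String> serverConf,
                        Map<String, String> familyValues) {
        String serverValue = serverConf.get(KEY);
        long period = serverValue != null ? Long.parseLong(serverValue) : DEFAULT_MS;
        String familyValue = familyValues.get(KEY);
        if (familyValue != null) {
            period = Long.parseLong(familyValue);
        }
        return period;
    }

    public static void main(String[] args) {
        Map<String, String> serverConf = new HashMap<>();
        serverConf.put(KEY, "0");
        Map<String, String> familyValues = new HashMap<>(); // nothing set on the family

        // The server-side value is used only at resolution time;
        // nothing is written back into the family's own values.
        System.out.println(resolve(serverConf, familyValues));
        System.out.println(familyValues.containsKey(KEY));
    }
}
```

If this reading is right, the server config is only consulted when the store
computes the next compaction time, which would explain why the descriptor on
the client never shows it.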

Mikael.S

On Sun, Jan 8, 2012 at 7:13 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> I am not expert in major compaction feature.
> Let me try to answer questions in #2.
>
> 2.a
> > If I set the property via the configuration shouldn’t all the cluster be
> > aware of?
>
> There're multiple clients connecting to one cluster. I wouldn't expect
> values in the configuration (m_hbConfig) to propagate onto the cluster.
>
> 2.b
> Store.getNextMajorCompactTime() shows that "hbase.hregion.majorcompaction"
> can be specified per column family:
>
>  long getNextMajorCompactTime() {
>    // default = 24hrs
>    long ret = conf.getLong(HConstants.MAJOR_COMPACTION_PERIOD,
> 1000*60*60*24);
>    if (family.getValue(HConstants.MAJOR_COMPACTION_PERIOD) != null) {
>
> 2.d
> > d. I tried also to setup the parameter via hbase shell but setting such
> > properties is not supported. (do you plan to add such support via the
> > shell?)
>
> This is a good idea. Please open a JIRA.
>
> For #5, HBASE-3965 is an improvement and doesn't have a patch yet.
>
> Allow me to quote Alan Kay: 'The best way to predict the future is to
> invent it.'
>
> Once we have a patch, we can always backport it to 0.92 after some people
> have verified the improvement.
>
> > 6.       In case a compaction (major) is running it seems there is no way
> > to stop-it. Do you plan to add such feature?
>
> Again, logging a JIRA would provide a good starting point for discussion.
>
> Thanks for the verification work and suggestions, Mikael.
>
> On Sun, Jan 8, 2012 at 7:27 AM, Mikael Sitruk <mikael.sitruk@gmail.com> wrote:
>
> > I forgot to mention, I'm using HBase 0.90.1
> >
> > Regards,
> > Mikael.S
> >
> > On Sun, Jan 8, 2012 at 5:25 PM, Mikael Sitruk <mikael.sitruk@gmail.com> wrote:
> >
> > > Hi
> > >
> > >
> > >
> > > I have some concern regarding major compactions below...
> > >
> > >
> > >    1. According to best practices from the mailing list and from the
> > >    book, automatic major compaction should be disabled. This can be done
> > >    by setting the property ‘hbase.hregion.majorcompaction’ to ‘0’.
> > >    Nevertheless, even after doing this I STILL see “major compaction”
> > >    messages in the logs, so it is unclear how I can manage major
> > >    compactions. (The system has heavy inserts, spread uniformly across
> > >    the cluster, and major compaction affects the performance of the
> > >    system.)
> > >    If I'm not wrong, it seems from the code that even if not requested,
> > >    and even if the indicator is set to '0' (no automatic major
> > >    compaction), a major compaction can be triggered by the code when all
> > >    store files are candidates for a compaction (from
> > >    Store.compact(final boolean forceMajor)).
> > >    Shouldn't the code add a condition checking that automatic major
> > >    compaction is disabled?
> > >
> > >    2. I tried to check the parameter ‘hbase.hregion.majorcompaction’ at
> > >    runtime using several approaches, to validate that the server indeed
> > >    loaded the parameter.
> > >
> > > a. Using a connection created from local config
> > >
> > > conn = (HConnection) HConnectionManager.getConnection(m_hbConfig);
> > >
> > > conn.getConfiguration().get("hbase.hregion.majorcompaction")
> > >
> > > returns the parameter from the local config and not from the cluster.
> > > Is it a bug? If I set the property via the configuration, shouldn’t the
> > > whole cluster be aware of it? (supposing that the connection indeed
> > > connected to the cluster)
> > >
> > > b.  fetching the property from the table descriptor
> > >
> > > HTableDescriptor hTableDescriptor =
> > >     conn.getHTableDescriptor(Bytes.toBytes("my table"));
> > >
> > > hTableDescriptor.getValue("hbase.hregion.majorcompaction")
> > >
> > > This returns the default parameter value (1 day), not the parameter
> > > from the configuration (on the cluster). It seems to be a bug, isn’t it?
> > > (The parameter from the config should be the default if not set at the
> > > table level.)
> > >
> > > c. The only way I could set the parameter to 0 and really see it is via
> > > the Admin API, updating the table descriptor or the column descriptor.
> > > After that I could see the parameter on the web UI. So is this the only
> > > way to set the parameter correctly? If the parameter is set via the
> > > configuration file, shouldn’t the web UI show it on any table created?
> > >
> > > d. I tried also to setup the parameter via hbase shell but setting such
> > > properties is not supported. (do you plan to add such support via the
> > > shell?)
> > >
> > > e. Generally, is it possible to get via API the configuration used by
> > > the servers? (at cluster/server level)
> > >
> > >     3.  I ran major compaction requests both from the shell and from
> > > the API, but since both are asynchronous there is no progress
> > > indication. Neither JMX nor the web UI helps here, since you cannot tell
> > > whether a compaction task is running. Tailing the logs is not an
> > > efficient way to do this either. The point is that I would like to
> > > automate the process and avoid a compaction storm. So I want to do it
> > > region by region, but if I don’t know when a compaction started or
> > > ended, I can’t automate it.
> > >
> > > 4.       In case there are no files in the compaction queue (but you
> > > still have more than 1 storefile per store, e.g. a minor compaction just
> > > finished), invoking major_compact will indeed decrease the number of
> > > store files, but the compaction queue will remain at 0 during the
> > > compaction task. (Shouldn’t the compaction queue increase by the number
> > > of files to compact and be reduced when the task ends?)
> > >
> > >
> > > 5.       I already saw HBASE-3965 for getting the status of major
> > > compaction; nevertheless it has been removed from 0.92. Is it possible
> > > to put it back? Even sooner than 0.92?
> > >
> > > 6.       In case a compaction (major) is running, it seems there is no
> > > way to stop it. Do you plan to add such a feature?
> > >
> > > 7.       Do you plan to add functionality via JMX (starting/stopping
> > > compaction, splitting....)
> > >
> > > 8.       Finally, there were some requests for allowing custom
> > > compaction; part of this was provided via the RegionObserver in
> > > HBASE-2001. Nevertheless, do you consider adding support for custom
> > > compaction (providing a real pluggable compaction strategy, not just an
> > > observer)?
> > >
> > >
> > > Regards,
> > > Mikael.S
> > >
> > >
> >
> >
> > --
> > Mikael.S
> >
>



-- 
Mikael.S
