hbase-user mailing list archives

From Michael Segel <mse...@segel.com>
Subject Re: Upgrading a coprocessor
Date Thu, 30 Oct 2014 09:22:27 GMT
There are several major problems with the current design of how HBase handles server-side functionality.

This has been rehashed before. Putting aside the security issues, the other major issue
is that you can't unload classes in Java.
(Someone told me that they are working on this... but I would suspect it to be a couple of
years away.)

So you have a couple of options:
 
* Rolling restart... never a good idea in a production environment, but it works today.
* Redesign your CP... Depending on why you need to restart (e.g. a static table changes), you
can design around this.
* Don't use a CP. 
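
On the "static table changes" case: a minimal sketch, assuming the CP keeps its lookup data in memory and that something not shown here (a timer, an admin call) triggers the reload. ReloadableLookup and its methods are hypothetical names for illustration, not HBase APIs. The idea is to swap an immutable snapshot atomically, so changed data is picked up without reloading the class or restarting anything:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Small demo; declared first so that `java File.java` picks up this main method.
final class ReloadableLookupDemo {
    public static void main(String[] args) {
        ReloadableLookup lookup = new ReloadableLookup();

        Map<String, String> first = new HashMap<>();
        first.put("US", "United States");
        lookup.reload(first);
        System.out.println(lookup.get("US"));

        Map<String, String> second = new HashMap<>();
        second.put("US", "USA");
        lookup.reload(second); // the data changes; the class stays loaded
        System.out.println(lookup.get("US"));
    }
}

// Hypothetical holder for data a CP might otherwise keep in a static field.
final class ReloadableLookup {
    // The current snapshot; replaced as a whole, never mutated in place.
    private final AtomicReference<Map<String, String>> current =
            new AtomicReference<>(Collections.<String, String>emptyMap());

    // Readers see either the old snapshot or the new one, never a mix.
    String get(String key) {
        return current.get().get(key);
    }

    // Publish a defensive, read-only copy of the freshly loaded data.
    void reload(Map<String, String> fresh) {
        current.set(Collections.unmodifiableMap(new HashMap<>(fresh)));
    }
}
```

This only helps when the thing that changes is data. If the CP's code itself changes, you are back to the options in this list.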

There are two other options:
* Redesign how HBase manages and implements CPs. (This will be a bit of work by the committers
and will most likely cause some rework by people who rely on CPs.)
* Meet this halfway by writing a very complex set of CPs that do this for you.

In short you write a CP that's in a couple of parts. 

First part is the agent that is added to the RS like any other CP. 
It will capture the event and forward it on to the second part. 

Second part is a framework that manages sandboxes. Depending on the actual CP it would get
placed in a sandbox to perform its work. 
Message passing between agent and framework could be done a couple of ways, e.g. a shared
memory segment (C with Java wrappers), durable message queues, memory-mapped files... your
choice.
The framework manages not only the messages but also the individual CPs (load/unload, enable,
disable) and which sandbox each CP is placed in.

The third part is the CP itself. 

In theory, pretty straightforward.

In implementation... it can get a bit complex depending on what features you want to implement.


Now you can manage the CP in a separate JVM. 
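
A rough sketch of the agent/framework split, with heavy assumptions: everything runs in one JVM here, and an in-memory queue stands in for the real transport (shared memory, durable queue, memory-mapped file). In a real setup the framework would sit in its own JVM. Agent, Framework, and Event are hypothetical names, not HBase classes, and sandboxing is left out entirely:

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Function;

// Small demo; declared first so that `java File.java` picks up this main method.
final class AgentFrameworkDemo {
    public static void main(String[] args) {
        Queue<Event> channel = new ConcurrentLinkedQueue<>();
        Framework framework = new Framework(channel);
        Agent agent = new Agent(channel);

        framework.load("audit", payload -> "v1:" + payload);
        agent.forward("audit", "put row1");
        System.out.println(framework.processNext());

        // "Upgrade" the CP: only the framework side changes, the agent is untouched.
        framework.load("audit", payload -> "v2:" + payload);
        agent.forward("audit", "put row2");
        System.out.println(framework.processNext());
    }
}

// A captured event: which logical CP it is meant for, plus an opaque payload.
final class Event {
    final String cpName;
    final String payload;

    Event(String cpName, String payload) {
        this.cpName = cpName;
        this.payload = payload;
    }
}

// Part 1: the agent that would be loaded into the RS. It only forwards events.
final class Agent {
    private final Queue<Event> channel;

    Agent(Queue<Event> channel) {
        this.channel = channel;
    }

    void forward(String cpName, String payload) {
        channel.add(new Event(cpName, payload));
    }
}

// Part 2: the framework that owns the actual CP logic and can swap it at will.
final class Framework {
    private final Queue<Event> channel;
    private final Map<String, Function<String, String>> handlers = new ConcurrentHashMap<>();

    Framework(Queue<Event> channel) {
        this.channel = channel;
    }

    // Load a CP, or replace the one already registered under this name.
    void load(String cpName, Function<String, String> handler) {
        handlers.put(cpName, handler);
    }

    void unload(String cpName) {
        handlers.remove(cpName);
    }

    // Handle one pending event; returns null if the queue is empty or no CP is loaded for it.
    String processNext() {
        Event event = channel.poll();
        if (event == null) {
            return null;
        }
        Function<String, String> handler = handlers.get(event.cpName);
        return handler == null ? null : handler.apply(event.payload);
    }
}
```

Part 3, the CP itself, is just the function passed to load() here. The point is that the agent never changes, so the RS never has to reload anything.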

MapR kinda skirts this problem in their MapRDB product, but I haven't had enough time to look
at it yet. 



On Oct 29, 2014, at 4:25 PM, Pradeep Gollakota <pradeepg26@gmail.com> wrote:

> At Lithium, we power Klout using HBase. We load Klout scores for about 500
> million users into HBase every night. When a load is happening, we noticed
> that the performance of klout.com was severely degraded. We also see
> severely degraded performance when performing operations like compactions.
> In order to mitigate this, we stood up 2 HBase clusters in an
> "Active/Standby" configuration (not the built-in replication, but something
> else entirely). We serve data from the "Active" cluster and load data into
> the "Standby" and then swap, load into the other cluster while serving from
> the cluster that just got the update.
> 
> We don't use coprocessors, so we didn't have the problem you're describing.
> However, in our configuration, what we would do is upgrade the coprocessor
> in the "Standby" and then swap the clusters. But since you would have to
> stand up a second HBase cluster, this may be a non-starter for you. Just
> another option thrown into the mix. :)
> 
> On Wed Oct 29 2014 at 12:07:02 PM Michael Segel <msegel@segel.com> wrote:
> 
>> Well you could redesign your cp.
>> 
>> There is a way to work around the issue by creating a cp that's really a
>> framework and then manage the cps in a different jvm(s) using messaging
>> between the two.
>> So if you want to reload or restart your cp, you can do it outside of the
>> RS.
>> 
>> It's a bit more work...
>> 
>> 
>> On Oct 29, 2014, at 9:21 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> 
>>> Rolling restart of servers may have a bigger impact on operations - the
>>> server hosting hbase:meta would be involved, which has more impact than
>>> disabling / enabling a user table.
>>> 
>>> You should give ample timeout to your client. The following is an
>>> incomplete list of configs (you can find their explanation on
>>> http://hbase.apache.org/book.html):
>>> 
>>> hbase.client.scanner.timeout.period
>>> hbase.rpc.timeout
>>> 
>>> Cheers
>>> 
>>> On Tue, Oct 28, 2014 at 11:18 PM, Hayden Marchant <haydenm@amobee.com>
>>> wrote:
>>> 
>>>> Thanks all for confirming what I thought was happening.
>>>> 
>>>> I am considering implementing a pattern similar to Iain's in that I
>>>> version the path of the cp, and disable/enable the table while upgrading
>>>> the cp metadata.
>>>> 
>>>> However, what are the operational considerations of disabling a table for
>>>> a number of seconds, versus rolling restart of region servers? Assuming
>>>> that however hard I try, there still might be a process or 2 that are
>>>> accessing that table at that time. What sort of error handling will I need
>>>> to be more aware of now (I assume that MapReduce would recover from either
>>>> of these two strategies?)
>>>> 
>>>> Thanks,
>>>> Hayden
>>>> 
>>>> ________________________________________
>>>> From: iain wright <iainwrig@gmail.com>
>>>> Sent: Wednesday, October 29, 2014 1:51 AM
>>>> To: user@hbase.apache.org
>>>> Subject: Re: Upgrading a coprocessor
>>>> 
>>>> Hi Hayden,
>>>> 
>>>> We ran into the same thing & ended up going with a rudimentary cp deploy
>>>> script that appends the epoch to the cp name, places it on HDFS, and
>>>> disables, modifies, and re-enables the HBase table
>>>> 
>>>> Here's the issue for this: https://issues.apache.org/jira/browse/HBASE-9046
>>>> 
>>>> --
>>>> Iain Wright
>>>> 
>>>> On Tue, Oct 28, 2014 at 10:51 AM, Bharath Vissapragada <
>>>> bharathv@cloudera.com> wrote:
>>>> 
>>>>> Hi Hayden,
>>>>> 
>>>>> Currently there is no workaround. We can't unload already loaded classes
>>>>> unless we make changes to HBase's classloader design, and I believe it's
>>>>> not that trivial.
>>>>> 
>>>>> - Bharath
>>>>> 
>>>>> On Tue, Oct 28, 2014 at 2:52 AM, Hayden Marchant <haydenm@amobee.com>
>>>>> wrote:
>>>>> 
>>>>>> I have been using a RegionObserver coprocessor on my HBase 0.94.6 cluster
>>>>>> for quite a while and it works great. I am currently upgrading the
>>>>>> functionality. When doing some testing in our integration environment I
>>>>>> met with the issue that even when I uploaded a new version of my
>>>>>> coprocessor jar to HDFS, HBase did not recognize it, and it kept using
>>>>>> the old version.
>>>>>> 
>>>>>> I even disabled/reenabled the table - no help. Even with a new table, it
>>>>>> still loads the old class. Only when I changed the location of the jar in
>>>>>> HDFS did it load the new version.
>>>>>> 
>>>>>> I looked at the source code of CoprocessorHost and I see that it is
>>>>>> forever holding a classloaderCache with no mechanism for clearing it out.
>>>>>> 
>>>>>> I assume that if I restart the region server it will take the new version
>>>>>> of my coprocessor.
>>>>>> 
>>>>>> Is there any workaround for upgrading a coprocessor without either
>>>>>> changing the path, or restarting the HBase region server?
>>>>>> 
>>>>>> Thanks,
>>>>>> Hayden
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Bharath Vissapragada
>>>>> <http://www.cloudera.com>
>>>>> 
>>>> 
>> 
>> 

