hbase-user mailing list archives

From Tom Brown <tombrow...@gmail.com>
Subject Re: Tracking down coprocessor pauses
Date Wed, 12 Sep 2012 17:40:40 GMT
I have captured some logs from what is happening during one of these pauses.


Can someone help me figure out what's actually going on from these logs?

--- My interpretation of the logs ---

As you can see at the start of the logs, my coprocessor for updating
the data is executing rapidly until 10:17:06.

At that time the coprocessor for querying is invoked. This query
should take only moments to return, but doesn't return until 10:44:52.

At 10:18:53 there appear to be some compaction-related messages
(though they don't appear to be the cause, since they start over a
minute after the server stops functioning).

Compaction appears to run until 10:42:25. The next two minutes
contain just LRU eviction messages.

At 10:44:52, the query from earlier appears to complete, after having
summarized only 863 rows. A few other queued requests are attempted,
but fail with exceptions (ClosedChannelException).

Eventually the exceptions are being thrown from "openScanner", which
really doesn't sound good to me.
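
As a cross-check on the timeline above, the intervals can be computed
directly from the timestamps quoted in this message. This is only a
minimal sketch: the class and method names (PauseTimeline, between,
format) are made up for illustration and are not part of HBase.

```java
import java.time.Duration;
import java.time.LocalTime;

// Hypothetical helper, not an HBase class: computes the gaps between
// the log timestamps mentioned in the interpretation above.
public class PauseTimeline {

    // Elapsed time between two same-day HH:mm:ss timestamps.
    static Duration between(String start, String end) {
        return Duration.between(LocalTime.parse(start), LocalTime.parse(end));
    }

    // Render a duration as "<minutes>m <seconds>s".
    static String format(Duration d) {
        return d.toMinutes() + "m " + (d.getSeconds() % 60) + "s";
    }

    public static void main(String[] args) {
        // Query invoked at 10:17:06, returns at 10:44:52.
        System.out.println("query stalled:      " + format(between("10:17:06", "10:44:52")));
        // First compaction message at 10:18:53, i.e. after the stall began.
        System.out.println("stall -> compaction: " + format(between("10:17:06", "10:18:53")));
        // Compaction messages run from 10:18:53 to 10:42:25.
        System.out.println("compaction:         " + format(between("10:18:53", "10:42:25")));
        // Only LRU eviction messages from 10:42:25 until the query returns.
        System.out.println("LRU-only tail:      " + format(between("10:42:25", "10:44:52")));
    }
}
```

With the timestamps above, the query is stalled for 27m 46s, the first
compaction message arrives 1m 47s into the stall, and the compaction
messages span 23m 32s of it.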


On Mon, Sep 10, 2012 at 11:32 AM, Tom Brown <tombrown52@gmail.com> wrote:
> Hi,
> We have our system set up such that all interaction is done through
> co-processors. We update the database via a co-processor (it has the
> appropriate logic for dealing with concurrent access to rows), and we
> also query/aggregate via co-processor (since we don't want to send all
> the data over the network).
> This generally works very well. However, sometimes one of the region
> servers will "pause". This doesn't appear to be a GC pause since it
> still serves up the UI, and adds occasional messages to the log
> regarding the LRU. The only thing I've found is that when I check the
> server that's causing the problem (easy to tell, since all the
> "working" servers have a low load, and the problem server has a higher
> load), I can see that there are a number of execCoprocessor requests
> that have been executing for much longer than they should.
> I want to know more details about the specifics of those requests. Is
> there an API I can use that will allow my coprocessor requests to be
> tracked more functionally? Is there a way to hook into the UI so I can
> provide my own list of running processes? Or would I have to write
> that all myself?
> I am using HBase 0.92.1, but will be upgrading to 0.94.1 soon.
> Thanks in advance!
> --Tom
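
For the "write that all myself" option raised in the quoted message, a
hand-rolled in-flight request registry might look roughly like the
sketch below. Everything in it (InFlightTracker, begin, end,
runningLongerThan) is a hypothetical name invented for illustration; it
uses only java.util.concurrent and makes no claim about what HBase
itself provides. The caller passes in the current time so the logic is
easy to test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical registry of requests that are currently executing.
// Not an HBase API; a coprocessor would have to call it explicitly.
public class InFlightTracker {

    // One record per running request.
    private static final class Entry {
        final String description;
        final long startMillis;

        Entry(String description, long startMillis) {
            this.description = description;
            this.startMillis = startMillis;
        }
    }

    private final AtomicLong nextId = new AtomicLong();
    private final Map<Long, Entry> running = new ConcurrentHashMap<>();

    // Register a request when it starts; returns a handle for end().
    public long begin(String description, long nowMillis) {
        long id = nextId.incrementAndGet();
        running.put(id, new Entry(description, nowMillis));
        return id;
    }

    // Remove a request when it finishes (call from a finally block).
    public void end(long id) {
        running.remove(id);
    }

    // Describe every request that has been running longer than the threshold.
    public List<String> runningLongerThan(long thresholdMillis, long nowMillis) {
        List<String> slow = new ArrayList<>();
        for (Entry e : running.values()) {
            long elapsed = nowMillis - e.startMillis;
            if (elapsed > thresholdMillis) {
                slow.add(e.description + " (running " + elapsed + " ms)");
            }
        }
        return slow;
    }
}
```

The idea would be to call begin(...) with System.currentTimeMillis() at
the top of each coprocessor method, call end(...) in a finally block,
and periodically log the result of runningLongerThan(...) to see which
requests are stuck and for how long.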
