hbase-issues mailing list archives

From "Jiaxin Li (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-3813) Change RPC callQueue size from "handlerCount * MAX_QUEUE_SIZE_PER_HANDLER;"
Date Tue, 16 Feb 2016 16:55:18 GMT

     [ https://issues.apache.org/jira/browse/HBASE-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiaxin Li updated HBASE-3813:
-----------------------------
    Description: 
Yesterday debugging w/ Jack we noticed that with few handlers on a big box, he was seeing
stats like this:

{code}
2011-04-21 11:54:49,451 DEBUG org.apache.hadoop.ipc.HBaseServer: Server connection from X.X.X.X:60931; # active connections: 11; # queued calls: 2500
{code}

We had 2500 items in the rpc queue waiting to be processed.

Turns out he had too few handlers for the number of clients (he also seems to have found a
hw issue: his RAM bus was running at 1/4 the rate it should have been).

Chatting w/ J-D this morning, he asked if the queues hold 'data'.  The queues hold 'Calls'.
 Calls are the client requests.  They contain data.

Jack had 2500 items queued.  If each item to insert was 1MB, that's 2.5k * 1MB (roughly 2.5GB)
of memory that is outside of our general accounting.
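
To put a number on that, here is a back-of-the-envelope sketch of the worst-case memory held
by queued calls.  The class and method names are made up for illustration (this is not HBase
code), and the 1MB-per-call payload is the hypothetical from the paragraph above.

```java
/** Illustrative only: not part of HBase. */
final class QueuedCallMemory {
    static final long MB = 1024L * 1024L;

    /** Worst-case bytes held by calls that are waiting in the queue. */
    static long worstCaseBytes(int queuedCalls, long bytesPerCall) {
        // multiplyExact throws on overflow instead of silently wrapping.
        return Math.multiplyExact((long) queuedCalls, bytesPerCall);
    }

    public static void main(String[] args) {
        // 2500 queued calls (from the log line) at an assumed 1MB each.
        long bytes = worstCaseBytes(2500, MB);
        System.out.println(bytes / MB + " MB held in the call queue");
    }
}
```

None of this memory shows up in the usual heap accounting, so it comes on top of whatever has
already been budgeted.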

Currently the queue size is handlers * MAX_QUEUE_SIZE_PER_HANDLER where MAX_QUEUE_SIZE_PER_HANDLER
is hardcoded to be 100.

If the queue is full we block (LinkedBlockingQueue).
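
A minimal sketch of that sizing rule and the full-queue behavior, assuming a plain
LinkedBlockingQueue of byte[] as a stand-in for the queue of Calls.  The class, the method
names and the handler count of 3 are illustrative, not taken from HBaseServer.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Illustrative only: models the sizing rule, not the HBaseServer code. */
final class CallQueueSketch {
    /** Capacity rule from the description: handlers * per-handler maximum. */
    static int capacity(int handlerCount, int maxQueueSizePerHandler) {
        return Math.multiplyExact(handlerCount, maxQueueSizePerHandler);
    }

    /** Tries to enqueue the given number of calls; returns how many were accepted. */
    static int fill(BlockingQueue<byte[]> queue, int attempts) {
        int accepted = 0;
        for (int i = 0; i < attempts; i++) {
            // offer() returns false when the queue is full; put() would block instead.
            if (queue.offer(new byte[0])) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        int cap = capacity(3, 100); // 3 handlers is an arbitrary example value
        BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>(cap);
        System.out.println("accepted " + fill(queue, cap + 5) + " of " + (cap + 5));
        System.out.println("remaining capacity: " + queue.remainingCapacity());
    }
}
```

The sketch uses offer() so that it terminates; a producer calling put() on the full queue
would wait until a handler takes a call off it.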

Going to change the per-handler queue size from 100 to 10 by default -- but will also make it
configurable and will doc. this as a possible cause of OOME.  Will try it on production here
before committing the patch.
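
One way the configurable per-handler limit could look, as a sketch only.  java.util.Properties
stands in for the real configuration object, and the property key below is hypothetical; the
description does not name the actual key.

```java
import java.util.Properties;

/** Illustrative only: not the attached patch. */
final class ConfigurableQueueSize {
    // Hypothetical key, for illustration; the real name is not given in the description.
    static final String KEY = "example.rpc.max.queue.size.per.handler";
    // Proposed new default from the description (previously hardcoded to 100).
    static final int DEFAULT_PER_HANDLER = 10;

    /** Total call queue capacity: handlers * configured per-handler maximum. */
    static int queueCapacity(Properties conf, int handlerCount) {
        int perHandler = Integer.parseInt(
            conf.getProperty(KEY, String.valueOf(DEFAULT_PER_HANDLER)));
        if (perHandler <= 0) {
            throw new IllegalArgumentException(KEY + " must be positive: " + perHandler);
        }
        return Math.multiplyExact(handlerCount, perHandler);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println("default capacity: " + queueCapacity(conf, 25));
        conf.setProperty(KEY, "100"); // restore the old per-handler value
        System.out.println("overridden capacity: " + queueCapacity(conf, 25));
    }
}
```

With the lower default the same number of handlers admits a tenth as many queued calls, which
bounds the unaccounted memory by the same factor.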



  was:
Yesterday debugging w/ Jack we noticed that with few handlers on a big box, he was seeing
stats like this:

{code}
2011-04-21 11:54:49,451 DEBUG org.apache.hadoop.ipc.HBaseServer: Server connection from X.X.X.X:60931;
# active connections: 11; # queued calls: 2500
{code}

We had 2500 items in the rpc queue waiting to be processed.

Turns out he had too few handlers for number of clients (but also, it seems like he figured
hw issues in that his RAM bus was running at 1/4 the rate that it should have been running
at).

Chatting w/ J-D this morning, he asked if the queues hold 'data'.  The queues hold 'Calls'.
 Calls are the client request.  They contain data.

Jack had 2500 items queued.  If each item to insert was 1MB, thats 25k * 1MB of memory that
is outside of our generally accounting.

Currently the queue size is handlers * MAX_QUEUE_SIZE_PER_HANDLER where MAX_QUEUE_SIZE_PER_HANDLER
is hardcoded to be 100.

If the queue is full we block (LinkedBlockingQueue).

Going to change the queue size from 100 to 10 by default -- but also will make it configurable
and will doc. this as possible cause of OOME.  Will try it on production here before committing
patch.




> Change RPC callQueue size from "handlerCount * MAX_QUEUE_SIZE_PER_HANDLER;"
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-3813
>                 URL: https://issues.apache.org/jira/browse/HBASE-3813
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: 3813.txt
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
