hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1849) HTable doesn't work well at the core of a multi-threaded server; e.g. webserver
Date Sat, 14 Aug 2010 03:24:17 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898508#action_12898508 ]

stack commented on HBASE-1849:

@Benoît: Bring it on!

> HTable doesn't work well at the core of a multi-threaded server; e.g. webserver
> -------------------------------------------------------------------------------
>                 Key: HBASE-1849
>                 URL: https://issues.apache.org/jira/browse/HBASE-1849
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Benoit Sigoure
> HTable must do the following:
> + Sit in a shell or simple client -- e.g. a Map or Reduce task -- and feed and read from
> HBase single-threadedly.  It does this job OK.
> + Sit at the core of a multithreaded server (100s of threads) -- a webserver or thrift gateway
> -- and keep the throughput high.  It's currently not good at this job.
> The following stand in the way of achieving the second item above:
> + HTable must seek out and cache region locations.  It keeps the cache down in HConnectionManager.
> One cache is shared by all HTable instances made with the same HBaseConfiguration
> instance.  Region lookups happen inside a synchronized block; if the wanted region is in the
> cache, the lock is held a short time.  Otherwise, the caller must wait until the trip to the
> server completes (which may require retries).  Meanwhile, all other work is blocked, even if
> we're using HTablePool.
> + Regardless of the identity of the HBaseConfiguration, Hadoop RPC has ONE Connection
> open to a server at a time; requests and responses are multiplexed over this single connection.
> Broken stuff:
> + Puts are synchronized to protect the write buffer, so only one thread at a time appends,
> but flushCommits is open for any thread to call.  Once the write buffer is full, all Puts
> block until it's freed again.  This looks like a hang if there are hundreds of threads, each
> write is to a random region in a big table, and each write has to have its region looked up.
> (There may be some other brokenness in here, because this bottleneck seems to last longer
> than it should, even with hundreds of threads.)
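One way to picture a less contended write buffer is sketched below. This is not HTable's code: `SwappingWriteBuffer`, its `String` edits, and the `flusher` callback are hypothetical stand-ins for Puts and the RPC that ships them. The assumption is that a writer holds the lock only long enough to append and, when the buffer is full, swap in an empty one, then does the slow flush outside the lock so other writers keep appending.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration only; not part of the HBase client API.
class SwappingWriteBuffer {
    private final int capacity;
    private final Consumer<List<String>> flusher; // stand-in for the RPC that ships edits
    private final Object lock = new Object();
    private List<String> buffer = new ArrayList<>();

    SwappingWriteBuffer(int capacity, Consumer<List<String>> flusher) {
        this.capacity = capacity;
        this.flusher = flusher;
    }

    // Append one edit. The lock covers only the append and the buffer swap.
    void put(String edit) {
        List<String> full = null;
        synchronized (lock) {
            buffer.add(edit);
            if (buffer.size() >= capacity) {
                full = buffer;              // take ownership of the full buffer
                buffer = new ArrayList<>(); // other writers continue on a fresh one
            }
        }
        if (full != null) {
            flusher.accept(full);           // slow work happens outside the lock
        }
    }

    // Ship whatever is buffered; does nothing when the buffer is empty.
    void flush() {
        List<String> pending;
        synchronized (lock) {
            if (buffer.isEmpty()) {
                return;
            }
            pending = buffer;
            buffer = new ArrayList<>();
        }
        flusher.accept(pending);
    }
}
```

The trade-off in this sketch is that several flushes can be in flight at once, so it gives no ordering guarantee between batches; a real client would have to decide whether that is acceptable.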
> Ideas:
> + A query of the cache should not block all access to the cache.  We only block access if
> the wanted region is being looked up, so other reads and writes to regions whose location
> we know can go ahead.
> + nio'd client and server
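The first idea above, blocking only callers that want a region whose lookup is in flight, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not HConnectionManager's implementation: `RegionLocationCache`, the string keys, and the `lookup` function (standing in for the trip to the server) are all hypothetical names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical illustration only; not part of the HBase client API.
class RegionLocationCache {
    private final ConcurrentHashMap<String, CompletableFuture<String>> cache =
            new ConcurrentHashMap<>();
    private final Function<String, String> lookup; // stand-in for the trip to the server

    RegionLocationCache(Function<String, String> lookup) {
        this.lookup = lookup;
    }

    // Return the cached location, or look it up. Only callers asking for the
    // same key wait on an in-flight lookup; other keys are unaffected.
    String locate(String regionKey) {
        CompletableFuture<String> existing = cache.get(regionKey);
        if (existing == null) {
            CompletableFuture<String> mine = new CompletableFuture<>();
            existing = cache.putIfAbsent(regionKey, mine);
            if (existing == null) {
                // This thread won the race, so it performs the lookup.
                try {
                    mine.complete(lookup.apply(regionKey));
                } catch (RuntimeException e) {
                    cache.remove(regionKey, mine); // let a later caller retry
                    mine.completeExceptionally(e);
                    throw e;
                }
                return mine.join();
            }
        }
        return existing.join(); // wait only for this key's lookup
    }
}
```

The sketch uses `putIfAbsent` with a future rather than running the lookup inside `computeIfAbsent`, because a slow mapping function there can hold up updates to other keys that share a bin.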

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
