hbase-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-1416) Pool of commit loggers in each HRegionServer
Date Mon, 31 Aug 2009 08:01:41 GMT

     [ https://issues.apache.org/jira/browse/HBASE-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Andrew Purtell updated HBASE-1416:

    Fix Version/s: 0.21.0

Up on hbase-dev@ Ryan writes:
we need to make hlog flush faster, it currently does only 700 ops/sec
when we flush every entry.

it'd be nice if we could do something clever, such as:

- use multiple logs
- detect multiple waiting clients and better batch their commits
- group commits for bulk import

This issue addresses the first point. 

While we are at it, consider sizing the pool dynamically according to a concurrency measure.
Spin up new writers on demand, up to some configurable upper bound. A simple strategy to try
first might be 2 * ceil(log(load)), smoothed. Terminate excess writers at roll time to hold
down unnecessary HDFS resource use.
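
To make the sizing rule above concrete, here is a minimal sketch. It is not HBase code: the class and method names are hypothetical, "load" is assumed to be a count of concurrently waiting appenders, the logarithm is assumed to be base 2, and "smoothed" is read as an exponential moving average over the load samples. Any of these could reasonably be chosen differently.

```java
/** Hypothetical helper for sizing a pool of log writers; not part of HBase. */
final class LogWriterPoolSizer {
    private final int maxWriters;      // configurable upper bound on pool size
    private final double alpha;        // smoothing factor in (0, 1]; 1 disables smoothing
    private double smoothedLoad = 0.0; // exponential moving average of observed load

    LogWriterPoolSizer(int maxWriters, double alpha) {
        if (maxWriters < 1) {
            throw new IllegalArgumentException("maxWriters must be >= 1");
        }
        if (!(alpha > 0.0 && alpha <= 1.0)) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        this.maxWriters = maxWriters;
        this.alpha = alpha;
    }

    /** Feeds one load sample and returns the target number of writers. */
    int update(int load) {
        smoothedLoad = alpha * load + (1.0 - alpha) * smoothedLoad;
        int n = (int) Math.ceil(smoothedLoad);
        // 2 * ceil(log2(load)), clamped to [1, maxWriters] so one writer always exists.
        return Math.max(1, Math.min(maxWriters, 2 * ceilLog2(n)));
    }

    /** Integer ceil(log2(n)) for n >= 1; returns 0 for n <= 1. Avoids floating-point rounding. */
    static int ceilLog2(int n) {
        return n <= 1 ? 0 : 32 - Integer.numberOfLeadingZeros(n - 1);
    }

    /** Number of writers that could be closed at roll time, given the current target. */
    static int excessAtRoll(int openWriters, int target) {
        return Math.max(0, openWriters - target);
    }
}
```

With smoothing disabled (alpha = 1.0) and an upper bound of 8, a load of 8 maps to 6 writers and a load of 1000 is capped at 8. Growing would happen on demand when update() returns more than the current pool size, while shrinking would be deferred to roll time via excessAtRoll().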

In HLog.doWrite we write each HLogKey and KeyValue to the log, which is a SequenceFile. Use
HFile instead? Can HFile do I/O batching? Otherwise, to group commits I think we'd need to
introduce a new Writable that bundles edits together.
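
As a rough illustration of the "bundle edits together" idea, the sketch below serializes several key/value edits as one record. It is only an assumption about what such a type might look like: EditBatch and its helpers are hypothetical names, keys and values are plain byte arrays rather than HLogKey and KeyValue, and the class merely mirrors the write(DataOutput)/readFields(DataInput) method shape of Hadoop's Writable interface without depending on Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical bundle of edits written as a single log record; not an HBase class. */
final class EditBatch {
    private final List<byte[]> keys = new ArrayList<>();
    private final List<byte[]> values = new ArrayList<>();

    void add(byte[] key, byte[] value) {
        keys.add(key);
        values.add(value);
    }

    int size() { return keys.size(); }
    byte[] key(int i) { return keys.get(i); }
    byte[] value(int i) { return values.get(i); }

    /** Writes the edit count, then each length-prefixed key and value. */
    void write(DataOutput out) throws IOException {
        out.writeInt(keys.size());
        for (int i = 0; i < keys.size(); i++) {
            writeBytes(out, keys.get(i));
            writeBytes(out, values.get(i));
        }
    }

    /** Replaces the current contents with a batch read from the input. */
    void readFields(DataInput in) throws IOException {
        keys.clear();
        values.clear();
        int n = in.readInt();
        for (int i = 0; i < n; i++) {
            keys.add(readBytes(in));
            values.add(readBytes(in));
        }
    }

    /** Convenience wrapper: serializes the whole batch to one byte array. */
    byte[] toBytes() {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Convenience wrapper: rebuilds a batch from bytes produced by toBytes(). */
    static EditBatch fromBytes(byte[] data) {
        EditBatch batch = new EditBatch();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            batch.readFields(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return batch;
    }

    private static void writeBytes(DataOutput out, byte[] b) throws IOException {
        out.writeInt(b.length);
        out.write(b);
    }

    private static byte[] readBytes(DataInput in) throws IOException {
        byte[] b = new byte[in.readInt()];
        in.readFully(b);
        return b;
    }
}
```

The point of the bundle is that N edits become one append and one flush instead of N. How the edits would be collected from waiting clients is a separate question that this sketch does not address.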

Moving into 0.21.

> Pool of commit loggers in each HRegionServer
> --------------------------------------------
>                 Key: HBASE-1416
>                 URL: https://issues.apache.org/jira/browse/HBASE-1416
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.21.0
> HBASE-1394 discusses pools of loggers as a means of dumping out the logs faster;
> the commit log is the long pole in a write transaction. This issue is about
> implementing the pool of writers.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
