hbase-issues mailing list archives

From "Sandy Pratt (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8691) High-Throughput Streaming Scan API
Date Thu, 13 Jun 2013 20:49:20 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13682705#comment-13682705 ]

Sandy Pratt commented on HBASE-8691:

While doing some more testing, I noticed that I was returning only a partial Result in the
stream case, but the full Result in the control cases.  That means the stream case didn't have
to transfer the rowkey (x3) and a couple of timestamps.  When I correct the test to return
the full Result, using DataOutputBuffer/DataInputBuffer to serialize and deserialize, the
stream case is about 2x faster than the RPC cases (rather than 4x faster).  I think a 2x
speedup is still worth looking at.
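
To illustrate why the partial Result was cheaper to transfer, here is a minimal sketch that serializes the same three cells twice: once with the row key and timestamp on every cell, and once with only qualifier and value. Everything in it is an assumption made for illustration: the Cell class and the length-prefixed layout are hypothetical and are not HBase's Result or KeyValue wire format, java.io streams stand in for Hadoop's DataOutputBuffer/DataInputBuffer so that it runs without Hadoop on the classpath, and reading "rowkey (x3)" as three cells per row is a guess. It is not the code attached to the issue.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class ResultSizeSketch {

    /** Hypothetical stand-in for one cell of a row; this is not HBase's KeyValue. */
    static final class Cell {
        final byte[] row;
        final byte[] qualifier;
        final long timestamp;
        final byte[] value;

        Cell(byte[] row, byte[] qualifier, long timestamp, byte[] value) {
            this.row = row;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Length-prefixed byte array: 4-byte length followed by the bytes.
    private static void writeBytes(DataOutputStream out, byte[] b) throws IOException {
        out.writeInt(b.length);
        out.write(b);
    }

    private static byte[] readBytes(DataInputStream in) throws IOException {
        byte[] b = new byte[in.readInt()];
        in.readFully(b);
        return b;
    }

    /** "Full" form: every cell repeats the row key and carries its timestamp. */
    static byte[] serializeFull(Cell[] cells) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(cells.length);
        for (Cell c : cells) {
            writeBytes(out, c.row);
            writeBytes(out, c.qualifier);
            out.writeLong(c.timestamp);
            writeBytes(out, c.value);
        }
        out.flush();
        return buf.toByteArray();
    }

    /** "Partial" form: qualifier and value only, no row key or timestamp. */
    static byte[] serializePartial(Cell[] cells) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(cells.length);
        for (Cell c : cells) {
            writeBytes(out, c.qualifier);
            writeBytes(out, c.value);
        }
        out.flush();
        return buf.toByteArray();
    }

    /** Reads back what serializeFull wrote. */
    static Cell[] deserializeFull(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        Cell[] cells = new Cell[in.readInt()];
        for (int i = 0; i < cells.length; i++) {
            byte[] row = readBytes(in);
            byte[] qualifier = readBytes(in);
            long timestamp = in.readLong();
            byte[] value = readBytes(in);
            cells[i] = new Cell(row, qualifier, timestamp, value);
        }
        return cells;
    }

    /** Three made-up cells that share one 8-byte row key. */
    static Cell[] sampleCells() {
        byte[] row = "row-0001".getBytes(StandardCharsets.UTF_8);
        Cell[] cells = new Cell[3];
        for (int i = 0; i < cells.length; i++) {
            cells[i] = new Cell(row,
                    ("q" + i).getBytes(StandardCharsets.UTF_8),
                    1000L + i,
                    ("v" + i).getBytes(StandardCharsets.UTF_8));
        }
        return cells;
    }

    public static void main(String[] args) throws IOException {
        Cell[] cells = sampleCells();
        System.out.println("full bytes:    " + serializeFull(cells).length);
        System.out.println("partial bytes: " + serializePartial(cells).length);
    }
}
```

In this toy layout, each cell in the full form costs an extra length-prefixed row key plus an 8-byte timestamp, so the gap grows with both the key length and the number of cells per row. The real overhead in HBase depends on its own serialization and will differ.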

Unfortunately, I can't figure out how to make Hive work with an event-driven interface.  Since
my customers primarily use Hive, I might have to put this on the back burner for now.  I hope
somebody finds it useful.
> High-Throughput Streaming Scan API
> ----------------------------------
>                 Key: HBASE-8691
>                 URL: https://issues.apache.org/jira/browse/HBASE-8691
>             Project: HBase
>          Issue Type: Improvement
>          Components: Scanners
>    Affects Versions: 0.95.0
>            Reporter: Sandy Pratt
>              Labels: perfomance, scan
>         Attachments: HRegionServlet.java, README.txt, RecordReceiver.java, ScannerTest.java, StreamHRegionServer.java, StreamReceiverDirect.java, StreamServletDirect.java
> I've done some work testing various ways to refactor and optimize Scans in HBase,
> and have found that performance can be dramatically increased by the addition of a streaming
> scan API.  The attached code constitutes a proof of concept that shows performance increases
> of almost 4x in some workloads.
> I'd appreciate testing, replication, and comments.  If the approach seems viable, I think
> such an API should be built into some future version of HBase.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
