hbase-issues mailing list archives

From "Manukranth Kolloju (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8185) Feature to enable Client Side Scanning(Client side merging) in HBase.
Date Sat, 23 Mar 2013 02:01:16 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13611526#comment-13611526 ]

Manukranth Kolloju commented on HBASE-8185:
-------------------------------------------

Enis Soztutar, our idea is close to what you've mentioned: create a snapshot of a store
using HDFS hardlinks of the store files, use those to create this ReadOnlyStore, and
leverage the HBase scanner hierarchy to perform the merge. This will allow us to scan the
data present in HBase from a MapReduce job in cases where the cumulative CPU available on
the MapReduce cluster is higher than that of the HBase cluster's region servers put together.
In cases where the HBase load is write-intensive, this will also help isolate the write load
from the read load.
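
To illustrate the kind of merge meant above, here is a minimal sketch of a heap-based k-way merge over several sorted inputs, the general technique a client would use to combine the sorted contents of multiple store files into one sorted stream. This is not HBase code: the class KWayMergeSketch, the Cursor helper, and the use of plain String keys are all hypothetical stand-ins chosen for illustration, and it ignores versions, delete markers, and column families.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; none of these names come from HBase.
public class KWayMergeSketch {

    // One cursor per sorted input (a stand-in for a per-store-file scanner).
    private static final class Cursor implements Comparable<Cursor> {
        private final Iterator<String> it;
        private String current;

        Cursor(Iterator<String> it) {
            this.it = it;
            this.current = it.next(); // caller guarantees the input is non-empty
        }

        // Moves to the next key; returns false when this input is exhausted.
        boolean advance() {
            if (!it.hasNext()) {
                return false;
            }
            current = it.next();
            return true;
        }

        @Override
        public int compareTo(Cursor other) {
            return current.compareTo(other.current);
        }
    }

    // Merges already-sorted runs into a single sorted list.
    static List<String> merge(List<List<String>> sortedRuns) {
        PriorityQueue<Cursor> heap = new PriorityQueue<>();
        for (List<String> run : sortedRuns) {
            if (!run.isEmpty()) {
                heap.add(new Cursor(run.iterator()));
            }
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor smallest = heap.poll();   // cursor holding the smallest key
            out.add(smallest.current);
            if (smallest.advance()) {
                heap.add(smallest);          // re-insert under its new key
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> runs = Arrays.asList(
                Arrays.asList("row1", "row4"),
                Arrays.asList("row2", "row3"));
        System.out.println(merge(runs));
    }
}
```

In the proposal, the inputs would be scanners over the hardlinked store files rather than in-memory lists, so the CPU cost of this merge is paid by the MapReduce task instead of the region server.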
                
> Feature to enable Client Side Scanning(Client side merging) in HBase.
> ---------------------------------------------------------------------
>
>                 Key: HBASE-8185
>                 URL: https://issues.apache.org/jira/browse/HBASE-8185
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver
>    Affects Versions: 0.89-fb
>            Reporter: Manukranth Kolloju
>             Fix For: 0.89-fb
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> The motivation for this is to enable the client to open the region scanner (and
> in turn open StoreScanners) and perform the merge on the client side. This will lower the
> CPU consumed by the RegionServer, since the data is pulled directly from the datanode.
> In cases where the user is interested in performing a large scan on HBase data checkpointed
> at a point in time, we think that ClientSideScan (ClientSideMerge) would give very high
> throughput compared to using the ClientScanner in HTable.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
