hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13159) Consider RangeReferenceFiles with transformations
Date Fri, 06 Mar 2015 22:47:40 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351071#comment-14351071 ]

Enis Soztutar commented on HBASE-13159:
---------------------------------------

Big +1. I was thinking along the same lines (excluding transformations). HFileLink implements
soft links, and Reference files implement soft links with a top/bottom limit. We can actually
benefit from another related concept, where an HFile contains data for multiple regions
and can be referenced from multiple regions (with region name + boundary). This is pretty important
for distributed log splitting, where we would not need to create so many small files per WAL
file (we end up creating regions x WAL files of them). I believe that in the original paper,
BigTable achieves something like this with soft links implemented via META (the region's files
are what is listed in META, not what is in the file system). A range on the reference
file will allow the region to be split again after a split, but before compaction. It
also allows the region to be split into more than two pieces.
For the transformation, I think even local indexes can make use of it. If we can persist
the index data without the region start key prefix, we can apply an on-demand transformation
that adds the prefix to the cells, so that after a local index region split, the data does
not have to be rewritten.
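
A minimal sketch of the on-demand prefix idea above, in plain Java with no HBase dependencies. The class name PrefixingIndexView and its methods are hypothetical and are not part of HBase or of any patch on this issue. It assumes index keys are persisted without the region start key and that a read-side view prepends the prefix. How index entries would be assigned to a particular daughter region is left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical read-side view over index keys persisted WITHOUT the region start key. */
final class PrefixingIndexView {
    private final byte[] regionStartKey;   // prefix applied at read time
    private final List<byte[]> storedKeys; // persisted, prefix-free index keys

    PrefixingIndexView(String regionStartKey, List<byte[]> storedKeys) {
        this.regionStartKey = regionStartKey.getBytes(StandardCharsets.UTF_8);
        this.storedKeys = storedKeys;
    }

    /** Prepends the prefix on demand; the stored keys are never modified. */
    List<String> readKeys() {
        List<String> out = new ArrayList<>();
        for (byte[] stored : storedKeys) {
            byte[] full = new byte[regionStartKey.length + stored.length];
            System.arraycopy(regionStartKey, 0, full, 0, regionStartKey.length);
            System.arraycopy(stored, 0, full, regionStartKey.length, stored.length);
            out.add(new String(full, StandardCharsets.UTF_8));
        }
        return out;
    }
}
```

In this sketch, a daughter region created by a split would construct a second view over the same stored keys with its own start key, so the persisted data stays untouched.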

> Consider RangeReferenceFiles with transformations
> -------------------------------------------------
>
>                 Key: HBASE-13159
>                 URL: https://issues.apache.org/jira/browse/HBASE-13159
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Lars Hofhansl
>
> Currently we have References used by HalfStoreReaders and HFileLinks.
> For various use cases we have here, we need RangeReferences with a simple transformation
> of the keys.
> That would allow us to map HFiles between regions or even tables without copying any
> data.
> We can probably combine HalfStores, HFileLinks, and RangeReferences into a single concept:
> * RangeReference = arbitrary start and stop row, arbitrary key transformation
> * HFileLink = start and stop keys set to the linked file's start/stop key, transformation
>   = identity
> * (HalfStore) References = start/stop key set according to top or bottom reference, transformation
>   = identity
> Note this is a *brainstorming* issue. :)
> (Could start with just references with arbitrary start/stop keys, and do transformations
> later)
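
The unified concept in the issue description can be sketched roughly as follows. This is an illustration only: the RangeReference class, its fields, and the factory methods are hypothetical names, not HBase APIs. Rows are modeled as Java strings rather than byte arrays, and the HFileLink case is modeled with unbounded start/stop, which for reading is equivalent to using the linked file's own start/stop keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical unified reference: a row range over a file plus a key transformation. */
final class RangeReference {
    final String file;                     // name of the referenced file
    final String startRow;                 // inclusive; null means "from the first row"
    final String stopRow;                  // exclusive; null means "to the last row"
    final UnaryOperator<String> transform; // applied to each row on read

    RangeReference(String file, String startRow, String stopRow,
                   UnaryOperator<String> transform) {
        this.file = file;
        this.startRow = startRow;
        this.stopRow = stopRow;
        this.transform = transform;
    }

    /** HFileLink-like case: whole file, identity transformation. */
    static RangeReference link(String file) {
        return new RangeReference(file, null, null, UnaryOperator.identity());
    }

    /** Bottom half-reference: rows below the split row, identity transformation. */
    static RangeReference bottom(String file, String splitRow) {
        return new RangeReference(file, null, splitRow, UnaryOperator.identity());
    }

    /** Top half-reference: rows at or above the split row, identity transformation. */
    static RangeReference top(String file, String splitRow) {
        return new RangeReference(file, splitRow, null, UnaryOperator.identity());
    }

    /** Returns the transformed rows of the file that fall inside [startRow, stopRow). */
    List<String> read(List<String> rowsInFile) {
        List<String> out = new ArrayList<>();
        for (String row : rowsInFile) {
            boolean afterStart = startRow == null || row.compareTo(startRow) >= 0;
            boolean beforeStop = stopRow == null || row.compareTo(stopRow) < 0;
            if (afterStart && beforeStop) {
                out.add(transform.apply(row));
            }
        }
        return out;
    }
}
```

Under these assumptions, a split into more than two pieces is just several references over the same file with adjacent ranges, and mapping a file into another table is a reference whose transform rewrites the keys.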



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
