hbase-issues mailing list archives

From "Michael Segel (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12853) distributed write pattern to replace ad hoc 'salting'
Date Wed, 29 Jul 2015 16:22:10 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646347#comment-14646347 ]

Michael Segel  commented on HBASE-12853:


As I have said before... Apache doesn't indemnify committers (actually it's the reverse) and
there is no upside for me to offset the risk. 

In a nutshell, it would be pointless to have a discussion on why I used the term trivial
and why I rated this as a low priority. 

BTW, there are 11 watchers... why don't you ask those watchers who are also committers and
leaders of the HBase project, why they didn't raise the priority? 

I don't wish to seem rude, but if you're going to lecture someone, you had better realize
that some will ignore you, others will mock you... 

To your point, this was the first JIRA that I raised.  I assumed that those who volunteer
their time would also take the time to assess the value of the suggestion.  Clearly not. 
That was my mistake. 

To be honest, I lack the patience to suffer fools...  

> distributed write pattern to replace ad hoc 'salting'
> -----------------------------------------------------
>                 Key: HBASE-12853
>                 URL: https://issues.apache.org/jira/browse/HBASE-12853
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Michael Segel 
>             Fix For: 2.0.0
> In reviewing HBASE-11682 (Description of Hot Spotting), one of the issues is that while
> 'salting' alleviated regional hot spotting, it increased the complexity required to utilize
> the data.
> Through the use of coprocessors, it should be possible to offer a method which distributes
> the data on write across the cluster and then manages reading the data, returning a
> sort-ordered result set and abstracting the underlying process.
> On table creation, a flag is set to indicate that this is a parallel table.
> On insert into the table, if the flag is set to true, then a prefix is added to the key,
> e.g. <region server#>- or <region server #|| where the region server # is an integer
> between 1 and the number of region servers defined.
> On read (scan), for each region server defined, a separate scan is created adding the
> prefix. Since each scan will be in sort order, it's possible to strip the prefix and return
> the lowest value key from each of the subsets.
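
The write and read steps quoted above can be sketched in memory as follows. This is only an illustration under stated assumptions, not the coprocessor implementation the issue asks for: the table is a plain Python dict, NUM_BUCKETS stands in for "the number of region servers defined", and the choice of a CRC32 hash to pick the prefix is an assumption (the description does not say how the prefix is chosen). All function names are hypothetical.

```python
import heapq
import zlib

# Assumption: stands in for "the number of region servers defined".
NUM_BUCKETS = 4


def bucket_for(key: str) -> int:
    """Pick a prefix in 1..NUM_BUCKETS.

    Assumption: a stable hash of the key; the issue text does not
    specify how the prefix is selected.
    """
    return zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS + 1


def prefixed(key: str) -> str:
    """Build the stored row key, e.g. '003-row0042'."""
    return f"{bucket_for(key):03d}-{key}"


def write(table: dict, key: str, value: str) -> None:
    """Insert into the 'parallel table': the prefix is added on write."""
    table[prefixed(key)] = value


def scan_bucket(table: dict, bucket: int):
    """One scan per prefix; yields (key without prefix, value) in sort order."""
    prefix = f"{bucket:03d}-"
    for row_key in sorted(k for k in table if k.startswith(prefix)):
        yield row_key[len(prefix):], table[row_key]


def parallel_scan(table: dict):
    """Merge the per-prefix scans, always returning the lowest key next."""
    scans = [scan_bucket(table, b) for b in range(1, NUM_BUCKETS + 1)]
    return heapq.merge(*scans, key=lambda kv: kv[0])


if __name__ == "__main__":
    table = {}
    for i in (5, 1, 4, 2, 3):
        write(table, f"row{i:04d}", f"value{i}")
    for key, value in parallel_scan(table):
        print(key, value)
```

Because each per-prefix scan is already sorted, the merge only ever compares the current head of each scan, which is the "return the lowest value key from each of the subsets" step. The caller sees the original keys in order and never handles the prefix.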

This message was sent by Atlassian JIRA
