hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-1901) "General" partitioner for "hbase-48" bulk (behind the api, write hfiles direct) uploader
Date Mon, 12 Oct 2009 22:59:31 GMT

     [ https://issues.apache.org/jira/browse/HBASE-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

stack updated HBASE-1901:

    Attachment: 1901.patch

Here is a first cut at a simple order-preserving partitioner.  I changed the hfile test output
format to use it, though it's not really suitable.  I updated the package doc to discuss this
new class.

> "General" partitioner for "hbase-48" bulk (behind the api, write hfiles direct) uploader
> ----------------------------------------------------------------------------------------
>                 Key: HBASE-1901
>                 URL: https://issues.apache.org/jira/browse/HBASE-1901
>             Project: Hadoop HBase
>          Issue Type: Wish
>            Reporter: stack
>         Attachments: 1901.patch
> For users to bulk upload by writing hfiles directly to the filesystem, they currently
> need to write a partitioner that is intimate with how their key schema works.  This issue
> is about providing a general partitioner, one that could never be as fair as a custom-written
> partitioner but that might just work for many cases.  The idea is that a user would supply
> the first and last keys in the dataset they want to upload.  We'd then use BigDecimal
> arithmetic on the range between the start and end rowids, dividing it by the number of
> reducers to come up with a key range per reducer.
> (I thought jgray had done some BigDecimal work dividing keys already but I can't find
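
A rough sketch of the range-splitting idea described in the issue, not the code in 1901.patch. It assumes row keys are byte arrays compared as unsigned big-endian numbers, right-pads shorter keys with zero bytes, and uses BigInteger rather than the BigDecimal mentioned above, since integer arithmetic is enough here. The class and method names (RangePartitionSketch, partitionFor) are made up for illustration.

```java
import java.math.BigInteger;
import java.util.Arrays;

// Hypothetical sketch: map a row key to a reducer index by dividing the
// numeric range between a user-supplied first and last key evenly.
class RangePartitionSketch {

    // Right-pad with zero bytes to a common width, then read the bytes
    // as an unsigned big-endian number (signum 1 means non-negative).
    static BigInteger toNumber(byte[] key, int width) {
        return new BigInteger(1, Arrays.copyOf(key, width));
    }

    static int partitionFor(byte[] key, byte[] first, byte[] last, int numReducers) {
        int width = Math.max(key.length, Math.max(first.length, last.length));
        BigInteger start = toNumber(first, width);
        BigInteger end = toNumber(last, width);
        BigInteger k = toNumber(key, width);

        // Keys at or outside the supplied bounds go to the edge reducers.
        if (k.compareTo(start) <= 0) {
            return 0;
        }
        if (k.compareTo(end) >= 0) {
            return numReducers - 1;
        }

        // start < k < end here, so span is positive and the result is
        // always in [0, numReducers - 1].
        BigInteger span = end.subtract(start);
        return k.subtract(start)
                .multiply(BigInteger.valueOf(numReducers))
                .divide(span)
                .intValue();
    }
}
```

With first key {0}, last key {100} and 4 reducers, key {50} would land on reducer 2.  As the issue says, a split like this is only fair when keys are spread roughly evenly across the range; skewed keys would still want a custom partitioner.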

