hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-48) [hbase] Bulk load and dump tools
Date Wed, 06 Feb 2008 23:21:07 GMT

    [ https://issues.apache.org/jira/browse/HBASE-48?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566394#action_12566394

stack commented on HBASE-48:

Yes.  Going behind the API would be the faster way to load hbase.   It'd be dangerous to do
into a live hbase.   Should we write something like TableOutputFormatter except it writes
region files directly into hdfs?   It'd make a region per reducer instance?  It'd know how
to write keys, etc. properly and what location in hdfs to place files.

> [hbase] Bulk load and dump tools
> --------------------------------
>                 Key: HBASE-48
>                 URL: https://issues.apache.org/jira/browse/HBASE-48
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Priority: Minor
> Hbase needs tools to facilitate bulk upload and possibly dumping.  Going via the current
APIs, particularly if the dataset is large and cell content is small, uploads can take a long
time even when using many concurrent clients.
> PNUTS folks talked of need for a different API to manage bulk upload/dump.
> Another notion would be to somehow have the bulk loader tools somehow write regions directly
in hdfs.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message