hbase-user mailing list archives

From Ryan Rawson <ryano...@gmail.com>
Subject Re: Fast importing into HBase (bypassing RegionServer)
Date Mon, 27 Jul 2009 20:56:40 GMT
The last time I seriously looked at this, it was to address serious
performance problems in HBase.  I eventually fixed those problems
and so dropped the idea altogether.

-ryan

On Mon, Jul 27, 2009 at 1:52 PM, stack <stack@duboce.net> wrote:
> Latest thinking is to write an MR job whose reducer writes hfiles that are
> just under a region size (<256M).  When the reducer has written about 240MB,
> it opens a new file.  (May need to write a custom ReduceRunner to keep account
> of what's been written and to rotate the file).
>
> After the MR job has finished, a script would come along and move the hfiles
> into the appropriate directory structure.  Each hfile would be the sole
> content of its region.  The script would read the first and last keys from
> each hfile's metadata and then, using this metainfo along with a table format
> specified externally, insert an entry into .META. per region (see the scripts
> in bin -- copy and rename table -- for examples of how to manipulate .META.).
>
> Someone needs to just do it.  We've been talking about it forever.
>
> St.Ack
> P.S. Here is older thinking on the topic
> https://issues.apache.org/jira/browse/HBASE-48
>
> On Mon, Jul 27, 2009 at 1:31 PM, tim robertson <timrobertson100@gmail.com> wrote:
>
>> Hi all,
>>
>> Ryan wrote on a different thread:
>>
>> "It should be possible to randomly insert data from a pre-existing
>> data set.  There is some work to directly import straight into hfiles
>> and skipping the regionserver, but that would only really work on 1
>> time imports to new tables."
>>
>> Could someone please elaborate on this a little and outline the steps
>> needed?  Do you write an hfile in a custom mapreduce output format and
>> then somehow write the table metadata file afterwards?
>>
>> Cheers,
>>
>> Tim
>>
>
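
The two steps stack outlines (rotate the output file at about 240MB, then
derive one region per file from its first and last keys) can be sketched as
below.  This is only an illustration in plain Python and uses no HBase or
Hadoop API: FileSummary, rotate_sorted and region_ranges are hypothetical
names, the input is assumed to be already sorted by row key (as reducer input
would be), and the choice to end each region at the next file's first key,
with empty keys marking the open ends of the table, is an assumption about
how the script would set boundaries.  No real hfile is written and .META. is
not touched.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

# Rotate at about 240MB so each file stays under the 256MB region size.
TARGET_BYTES = 240 * 1024 * 1024


@dataclass
class FileSummary:
    """Hypothetical stand-in for one output file: just the metadata needed later."""
    first_key: bytes
    last_key: bytes
    size: int


def rotate_sorted(pairs: Iterable[Tuple[bytes, bytes]],
                  target_bytes: int = TARGET_BYTES) -> List[FileSummary]:
    """Split an already-sorted (row key, value) stream into size-bounded files."""
    files: List[FileSummary] = []
    current: Optional[FileSummary] = None
    for key, value in pairs:
        cell = len(key) + len(value)
        # Rotate only on a row boundary so that one row never spans two files.
        if (current is not None
                and current.size + cell > target_bytes
                and key != current.last_key):
            files.append(current)
            current = None
        if current is None:
            current = FileSummary(first_key=key, last_key=key, size=0)
        current.last_key = key
        current.size += cell
    if current is not None:
        files.append(current)
    return files


def region_ranges(files: List[FileSummary]) -> List[Tuple[bytes, bytes]]:
    """Derive one [start, end) row range per file; b"" marks an open end."""
    ranges = []
    for i, f in enumerate(files):
        start = b"" if i == 0 else f.first_key
        end = b"" if i == len(files) - 1 else files[i + 1].first_key
        ranges.append((start, end))
    return ranges
```

Each (start, end) pair would correspond to one row inserted into .META.;
a real script would also have to supply the table schema and the region
directory name, which this sketch leaves out.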
