hbase-user mailing list archives

From Nick Dimiduk <ndimi...@gmail.com>
Subject Re: Bulk-load data into HBase using Storm
Date Wed, 04 Feb 2015 21:52:08 GMT
To use existing bulk load tools, you'll need to write a valid HFile to
HDFS (have a look at HFileWriterV{2,3}) and load it into the region
server(s) using the utilities provided in LoadIncrementalHFiles.
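Concretely, the write-an-HFile-then-load path described above might look roughly like the sketch below, written against the HBase API of that era (~0.98/1.0). The staging directory, table name, and column family here are made-up placeholders, not anything prescribed by HBase; note that cells must be appended to the HFile writer in sorted order, and that LoadIncrementalHFiles expects the staging directory to contain one subdirectory per column family.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.hfile.CacheConfig;
import org.apache.hadoop.hbase.io.hfile.HFile;
import org.apache.hadoop.hbase.io.hfile.HFileContext;
import org.apache.hadoop.hbase.io.hfile.HFileContextBuilder;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.hbase.util.Bytes;

public class BulkLoadSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    FileSystem fs = FileSystem.get(conf);

    // Staging layout: <dir>/<family>/<hfile> -- "/tmp/bulk" and "cf" are
    // illustrative assumptions, not fixed names.
    Path stagingDir = new Path("/tmp/bulk");
    Path hfilePath = new Path(stagingDir, "cf/hfile-0");

    HFileContext ctx = new HFileContextBuilder()
        .withBlockSize(64 * 1024)
        .build();
    HFile.Writer writer = HFile.getWriterFactory(conf, new CacheConfig(conf))
        .withPath(fs, hfilePath)
        .withFileContext(ctx)
        .create();
    try {
      // Cells must be appended in sorted key order
      // (row, family, qualifier, timestamp).
      writer.append(new KeyValue(Bytes.toBytes("row1"), Bytes.toBytes("cf"),
          Bytes.toBytes("q"), Bytes.toBytes("value1")));
    } finally {
      writer.close();
    }

    // Hand the staged files over to the region server(s).
    HTable table = new HTable(conf, "mytable");
    try {
      new LoadIncrementalHFiles(conf).doBulkLoad(stagingDir, table);
    } finally {
      table.close();
    }
  }
}
```

In a Storm topology, the writer half would typically live in a bolt that rolls HFiles per batch, with the doBulkLoad call triggered once a batch of files is complete.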

There's no way to do this "in memory" at the moment. The closest alternative would be to batch your data up into a single large RPC, but that still goes through the online write path: WAL, memstore flush, etc.
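For reference, the batched-RPC route mentioned above amounts to grouping Puts into one client call. Again a sketch against the same era's API; the table, family, and row names are illustrative, and each Put still lands in the WAL and memstore like any online write.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BatchedPutSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable"); // table name is an assumption
    try {
      List<Put> batch = new ArrayList<Put>();
      for (int i = 0; i < 1000; i++) {
        Put p = new Put(Bytes.toBytes("row-" + i));
        p.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v" + i));
        batch.add(p);
      }
      // Sent as grouped multi RPCs, one per target region server.
      table.put(batch);
    } finally {
      table.close();
    }
  }
}
```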

On Wed, Feb 4, 2015 at 10:49 AM, Jaime Solano <jdjsolano@gmail.com> wrote:

> For a proof of concept we'll be working on, we want to bulk-load data into
> HBase, following a similar approach to the one explained here
> <http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/>,
> but with the difference that for the HFile creation (step 2 in the
> mentioned article), we want to use Storm instead of MapReduce. That is, we
> want to bulk load data not sitting in HDFS, but probably in memory.
>    1. What are your thoughts about this? Is it feasible?
>    2. What challenges do you foresee?
>    3. What other approaches would you suggest?
> Thanks in advance,
> -Jaime
