hbase-user mailing list archives

From Leo Alekseyev <dnqu...@gmail.com>
Subject Re: Bulk import tools for HBase
Date Wed, 13 Oct 2010 01:09:39 GMT
The command-line tools don't seem to be included in the 0.89.20100830 branch.
In addition, it doesn't look like ImportTsv.java gets compiled into
the HBase jar file.
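
One way to confirm whether the class actually made it into the jar is to
list the archive's entries. A minimal sketch in Python, assuming the class
lives in the org.apache.hadoop.hbase.mapreduce package; the helper name
jar_contains_class is hypothetical and only for illustration:

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar holds a compiled entry for class_name.

    class_name is a fully qualified Java class name. The package used in
    the usage note below is an assumption about where ImportTsv lives.
    """
    # A jar is a zip archive; compiled classes are stored as
    # path/to/ClassName.class, with dots replaced by slashes.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

Calling jar_contains_class("/path/to/hbase-VERSION.jar",
"org.apache.hadoop.hbase.mapreduce.ImportTsv") should return False for a
jar that was built without the class.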

Are there any tutorials for working with the HBase source other than
http://wiki.apache.org/hadoop/Hbase/MavenPrimer?

Also, a somewhat naive question: do the bulk load tools assume that
the source data already resides in HDFS?  If so, what efficient ways
are there of loading bulk data into HDFS?

On Mon, Oct 11, 2010 at 2:33 PM, Sean Bigdatafun
<sean.bigdatafun@gmail.com> wrote:
> Another potential "problem" of the incremental bulk loader is that the number
> of reducers (for the bulk loading process) needs to be equal to the number of
> existing regions -- this seems to be infeasible for a very large table, say
> with 2000 regions.
>
> Any comment on this? Thanks.
>
> Sean
>
> On Fri, Oct 8, 2010 at 9:03 PM, Todd Lipcon <todd@cloudera.com> wrote:
>
>> What version are you building from? These tools are new as of this past
>> June.
>>
>> -Todd
>>
>> On Fri, Oct 8, 2010 at 4:52 PM, Leo Alekseyev <dnquark@gmail.com> wrote:
>>
>> > We want to investigate HBase bulk imports, as described on
>> > http://hbase.apache.org/docs/r0.89.20100726/bulk-loads.html and/or
>> > JIRA HBASE-48.  I can't seem to run either the importtsv tool or the
>> > completebulkload tool using the hadoop jar /path/to/hbase-VERSION.jar
>> > command.  In fact, the ImportTsv class is not part of that jar file.
>> > Am I looking in the wrong place for this class, or do I need to
>> > somehow customize the build process to include it?  Our HBase was
>> > built from source using the default procedure.
>> >
>> > Thanks for any insight,
>> > --Leo
>> >
>>
>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>
