hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Bulk import tools for HBase
Date Wed, 13 Oct 2010 03:30:31 GMT
I compiled under HBase trunk, and the ImportTsv classes are in the resulting jar:

jar tvf target/hbase-0.89.0-SNAPSHOT.jar | grep -i importt
  6072 Tue Oct 12 13:31:46 PDT 2010 org/apache/hadoop/hbase/mapreduce/ImportTsv$TsvImporter.class
   688 Tue Oct 12 13:31:46 PDT 2010 org/apache/hadoop/hbase/mapreduce/ImportTsv$TsvParser$BadTsvLineException.class
  1840 Tue Oct 12 13:31:46 PDT 2010 org/apache/hadoop/hbase/mapreduce/ImportTsv$TsvParser$ParsedLine.class
  3558 Tue Oct 12 13:31:46 PDT 2010 org/apache/hadoop/hbase/mapreduce/ImportTsv$TsvParser.class
  5787 Tue Oct 12 13:31:46 PDT 2010 org/apache/hadoop/hbase/mapreduce/ImportTsv.class
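
The same check can be wrapped in a small shell function for anyone who wants to repeat it against another build. This is only a sketch: the function name has_importtsv is made up for illustration, and it assumes the jar listing is piped in on stdin, one entry per line, as `jar tf` prints it.

```shell
# Hypothetical helper (not part of HBase): succeeds if the jar listing on
# stdin contains the top-level ImportTsv class.
has_importtsv() {
  # Match the exact class entry at the end of a line, so that inner
  # classes such as ImportTsv$TsvParser.class alone do not count.
  grep -q 'org/apache/hadoop/hbase/mapreduce/ImportTsv\.class$'
}
```

Usage, with the jar path as a placeholder: `jar tf /path/to/hbase-VERSION.jar | has_importtsv && echo present`.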

On Tue, Oct 12, 2010 at 6:09 PM, Leo Alekseyev <dnquark@gmail.com> wrote:

> Command line tools don't seem to be included in the 0.89.20100830 branch.
> In addition, it doesn't look like ImportTsv.java gets compiled into
> the HBase jar file.
>
> Are there any tutorials for working with hbase source other than
> http://wiki.apache.org/hadoop/Hbase/MavenPrimer?
>
> Also, a somewhat naive question: do the bulk load tools assume that
> the source data already resides in HDFS?  If so, what efficient ways
> are there of loading bulk data into HDFS?
>
> On Mon, Oct 11, 2010 at 2:33 PM, Sean Bigdatafun <sean.bigdatafun@gmail.com> wrote:
> > Another potential "problem" of the incremental bulk loader is that the
> > number of reducers (for the bulk loading process) needs to be equal to the
> > number of existing regions -- this seems to be infeasible for a very large
> > table, say with 2000 regions.
> >
> > Any comment on this? Thanks.
> >
> > Sean
> >
> > On Fri, Oct 8, 2010 at 9:03 PM, Todd Lipcon <todd@cloudera.com> wrote:
> >
> >> What version are you building from? These tools are new as of this past
> >> June.
> >>
> >> -Todd
> >>
> >> On Fri, Oct 8, 2010 at 4:52 PM, Leo Alekseyev <dnquark@gmail.com> wrote:
> >>
> >> > We want to investigate HBase bulk imports, as described on
> >> > http://hbase.apache.org/docs/r0.89.20100726/bulk-loads.html and/or
> >> > JIRA HBASE-48.  I can't seem to run either the importtsv tool or the
> >> > completebulkload tool using the hadoop jar /path/to/hbase-VERSION.jar
> >> > command.  In fact, the ImportTsv class is not part of that jar file.
> >> > Am I looking in the wrong place for this class, or do I need to
> >> > somehow customize the build process to include it?  Our HBase was
> >> > built from source using the default procedure.
> >> >
> >> > Thanks for any insight,
> >> > --Leo
> >> >
> >>
> >>
> >>
> >> --
> >> Todd Lipcon
> >> Software Engineer, Cloudera
> >>
> >
>
