hbase-user mailing list archives

From Chetan Khatri <chetan.opensou...@gmail.com>
Subject Re: Writing/Importing large number of records into HBase
Date Sat, 28 Jan 2017 04:15:26 GMT
Oh. Sorry.
https://github.com/apache/hbase/blob/master/hbase-spark/src/main/java/org/apache/hadoop/hbase/spark/example/hbasecontext/JavaHBaseBulkPutExample.java

On Sat, Jan 28, 2017 at 9:27 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> Chetan:
> The link you posted was from a personal repo.
>
> There hasn't been a commit there for at least a year.
>
> Meanwhile, the hbase-spark module in hbase repo is being actively
> maintained.
>
> FYI
>
> > On Jan 27, 2017, at 7:47 PM, Chetan Khatri <chetan.opensource@gmail.com> wrote:
> >
> > Adding to @Ted: check the Bulk Put example -
> > https://github.com/tmalaska/SparkOnHBase/blob/master/src/main/scala/org/apache/hadoop/hbase/spark/example/hbasecontext/HBaseBulkPutExampleFromFile.scala
> >
> >> On Sat, Jan 28, 2017 at 9:11 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> >>
> >> Have you looked at the hbase-spark module (currently in the master branch)?
> >>
> >> See hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/example/datasources/AvroSource.scala
> >> and hbase-spark/src/test/scala/org/apache/hadoop/hbase/spark/DefaultSourceSuite.scala
> >> for examples.
> >>
> >> There may be other options.
> >>
> >> FYI
> >>
> >> On Fri, Jan 27, 2017 at 7:28 PM, jeff saremi <jeffsaremi@hotmail.com> wrote:
> >>
> >>> Hi
> >>> I'm seeking some pointers/guidance on what we could do to insert billions
> >>> of records that we already have in Avro files in Hadoop into HBase.
> >>>
> >>> I read some articles online and one of them recommended using the HFile
> >>> format. I took a cursory look at the documentation for that. Given the
> >>> complexity of that I think that may be the last resort we want to pursue.
> >>> Unless some library is out there that easily helps us write our files into
> >>> that format. I didn't see any.
> >>> Assuming that the HBase native client may be our best bet, is there any
> >>> advice around pre-partitioning our records or such techniques that we could
> >>> use?
> >>> thanks
> >>>
> >>> Jeff
> >>
>

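On the pre-partitioning question in Jeff's original message, one common approach is to create the table with split points up front so that a large import is spread over many regions from the start. The following is only a minimal sketch, not something proposed in this thread: the class name SplitKeys and the method uniformSplits are hypothetical, and it assumes the row keys begin with a roughly uniformly distributed byte (for example a hash prefix), which may not match the actual key design.

```java
public final class SplitKeys {
    private SplitKeys() {}

    /**
     * Returns numRegions - 1 single-byte split points that divide the
     * first-byte range 0x00..0xFF into numRegions roughly equal slices.
     * Hypothetical helper; assumes row keys have a uniformly distributed
     * leading byte.
     */
    public static byte[][] uniformSplits(int numRegions) {
        if (numRegions < 2 || numRegions > 256) {
            throw new IllegalArgumentException("numRegions must be between 2 and 256");
        }
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            // Boundary of the i-th slice; the cast wraps values >= 128 to
            // negative Java bytes, which HBase still orders as unsigned.
            splits[i - 1] = new byte[] { (byte) (i * 256 / numRegions) };
        }
        return splits;
    }
}
```

The resulting array could then be handed to the createTable overload of the HBase Admin interface that accepts a byte[][] of split keys. How many regions to pre-create depends on the cluster and data, so the count is left as a parameter here.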