hbase-user mailing list archives

From jeff saremi <jeffsar...@hotmail.com>
Subject Re: Writing/Importing large number of records into HBase
Date Sat, 28 Jan 2017 18:57:33 GMT
No, I had not. I will take a look. Thanks, Ted.


________________________________
From: Ted Yu <yuzhihong@gmail.com>
Sent: Friday, January 27, 2017 7:41 PM
To: user@hbase.apache.org
Subject: Re: Writing/Importing large number of records into HBase

Have you looked at the hbase-spark module (currently in the master branch)?

See hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/example/datasources/AvroSource.scala
and hbase-spark/src/test/scala/org/apache/hadoop/hbase/spark/DefaultSourceSuite.scala
for examples.

There may be other options.

FYI

On Fri, Jan 27, 2017 at 7:28 PM, jeff saremi <jeffsaremi@hotmail.com> wrote:

> Hi
> I'm seeking some pointers/guidance on what we could do to insert billions
> of records that we already have in Avro files in Hadoop into HBase.
>
> I read some articles online and one of them recommended using HFile
> format. I took a cursory look at the documentation for that. Given the
> complexity of that I think that may be the last resort we want to pursue.
> Unless some library is out there that easily helps us write our files into
> that format. I didn't see any.
> Assuming that the HBase native client may be our best bet, is there any
> advice around pre-partitioning our records or such techniques that we could
> use?
> thanks
>
> Jeff
>
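
For readers following the pre-partitioning question above, here is a minimal sketch of one common approach: compute evenly spaced split points over a two-character hex prefix and give each row key a matching hash-derived prefix ("salt"), so that writes spread across regions instead of piling onto one. The class name SplitKeys, both method names, the 256-bucket prefix space, and the "-" separator are illustrative assumptions and not something from this thread. With the HBase Java client, a byte[][] like the one produced here could be passed as the splitKeys argument of Admin.createTable(descriptor, splitKeys); that call is left out so the sketch runs without an HBase dependency.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: plain Java, no HBase classes required.
class SplitKeys {

    // Returns numRegions - 1 split points that evenly divide the
    // two-hex-digit prefix space "00".."ff" (an assumed key layout).
    static byte[][] hexPrefixSplits(int numRegions) {
        if (numRegions < 2 || numRegions > 256) {
            throw new IllegalArgumentException("numRegions must be in [2, 256]");
        }
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            int boundary = i * 256 / numRegions;
            splits[i - 1] = String.format("%02x", boundary)
                    .getBytes(StandardCharsets.US_ASCII);
        }
        return splits;
    }

    // Prefixes the original key with a two-hex-digit bucket derived from
    // its hash, so that keys land in all pre-split regions.
    static String saltedKey(String key) {
        int bucket = (key.hashCode() & 0x7fffffff) % 256;
        return String.format("%02x", bucket) + "-" + key;
    }

    public static void main(String[] args) {
        for (byte[] split : hexPrefixSplits(4)) {
            System.out.println(new String(split, StandardCharsets.US_ASCII));
        }
        System.out.println(saltedKey("record-0001"));
    }
}
```

The trade-off of salting is that range scans over the original key order then need one scan per prefix bucket, so it suits write-heavy imports better than scan-heavy workloads.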
