hbase-user mailing list archives

From John Leach <jle...@splicemachine.com>
Subject Re: Options to Import Data from MySql DB to HBase
Date Tue, 30 Aug 2016 13:13:45 GMT
I would suggest using an open source tool on top of HBase (Splice Machine, Trafodion, or Phoenix)
if you want to map data from an RDBMS.
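
Phoenix, for example, exposes HBase tables through a standard JDBC driver, so the MySQL
rows can be written with plain SQL. A rough sketch (the ZooKeeper quorum, table name, and
column layout below are placeholders, not taken from your setup):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class PhoenixLoadSketch {
  public static void main(String[] args) throws Exception {
    // Placeholder ZooKeeper quorum; point this at your cluster.
    try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181")) {
      // The CF1./CF2. prefixes place the columns in separate HBase column families.
      conn.createStatement().execute(
          "CREATE TABLE IF NOT EXISTS CUSTOMER ("
          + " ID BIGINT NOT NULL PRIMARY KEY,"
          + " CF1.NAME VARCHAR,"
          + " CF2.EMAIL VARCHAR)");
      try (PreparedStatement ps = conn.prepareStatement(
              "UPSERT INTO CUSTOMER (ID, NAME, EMAIL) VALUES (?, ?, ?)")) {
        ps.setLong(1, 1L);
        ps.setString(2, "example");
        ps.setString(3, "example@example.com");
        ps.executeUpdate();
      }
      conn.commit();  // Phoenix connections are not auto-commit by default
    }
  }
}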

John Leach

> On Aug 30, 2016, at 1:09 AM, Soumya Nayan Kar <soumyanayan.kar@gmail.com> wrote:
> I have a single table in MySql which contains around 24000000 records. I
> need a way to import this data into a table in HBase with multiple column
> families. I initially chose Sqoop as the tool to import the data but later
> found that I cannot use Sqoop to directly import the data as Sqoop does not
> support multiple column family import as yet. I have populated the data in
> HDFS using Sqoop from the MySql database. What are my choices to import
> this data from HDFS to an HBase table with 3 column families? It seems for
> bulk import, I have two choices:
>   - ImportTSV tool: this probably requires the source data to be in TSV
>   format. But the data that I have imported into HDFS from MySql using Sqoop
>   seems to be in CSV format. What is the standard solution for this
>   approach?
>   - Write a custom MapReduce program to translate the data in HDFS to
>   HFiles and load them into HBase.
> I just wanted to confirm whether these are the only two choices available to
> load the data. This seems a bit restrictive, given that such a requirement is
> a very basic one in any system. If a custom MapReduce job is the way to go, an
> example or working sample would be really helpful.
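
For what it's worth, ImportTsv does accept a custom field separator
(-Dimporttsv.separator=,), so the CSV files Sqoop wrote can be loaded directly as long as
no field contains an embedded comma or newline. If you do go the custom MapReduce route,
the usual pattern is a mapper that parses each line into a Put, with
HFileOutputFormat2.configureIncrementalLoad handling partitioning by region. A rough
sketch, assuming a three-field source row and column families cf1/cf2/cf3 (the table
name, column layout, and paths are placeholders, not from your schema):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CsvToHFiles {

  // Parses one CSV line (id,name,email) and emits a Put spread across three column families.
  public static class CsvMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
    @Override
    protected void map(LongWritable key, Text value, Context ctx)
        throws IOException, InterruptedException {
      String[] f = value.toString().split(",", -1);  // naive split; assumes no embedded commas
      byte[] rowKey = Bytes.toBytes(f[0]);
      Put put = new Put(rowKey);
      put.addColumn(Bytes.toBytes("cf1"), Bytes.toBytes("name"),  Bytes.toBytes(f[1]));
      put.addColumn(Bytes.toBytes("cf2"), Bytes.toBytes("email"), Bytes.toBytes(f[2]));
      put.addColumn(Bytes.toBytes("cf3"), Bytes.toBytes("raw"),   Bytes.toBytes(value.toString()));
      ctx.write(new ImmutableBytesWritable(rowKey), put);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "csv-to-hfiles");
    job.setJarByClass(CsvToHFiles.class);
    job.setMapperClass(CsvMapper.class);
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(Put.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));    // Sqoop output dir on HDFS
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // HFile staging dir

    TableName table = TableName.valueOf("mytable");
    try (Connection conn = ConnectionFactory.createConnection(conf)) {
      // Sets total-order partitioning so the job produces one HFile set per region.
      HFileOutputFormat2.configureIncrementalLoad(job,
          conn.getTable(table), conn.getRegionLocator(table));
    }
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Once the job finishes, the staging directory can be handed to the table with the
completebulkload tool (LoadIncrementalHFiles).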
