hbase-user mailing list archives

From Soumya Nayan Kar <soumyanayan....@gmail.com>
Subject Options to Import Data from MySql DB to HBase
Date Tue, 30 Aug 2016 06:09:34 GMT
I have a single table in MySQL that contains around 24,000,000 records. I
need a way to import this data into an HBase table with multiple column
families. I initially chose Sqoop for the import, but later found that I
cannot use it directly because Sqoop does not yet support importing into
multiple column families. I have already loaded the data into HDFS from the
MySQL database using Sqoop. What are my choices for importing this data
from HDFS into an HBase table with 3 column families? For bulk import, I
seem to have two choices:

   - ImportTsv tool: this probably requires the source data to be in TSV
   format, but the data that I imported into HDFS from MySQL using Sqoop
   seems to be in CSV format. What is the standard solution for this?
   - Write a custom MapReduce program to translate the data in HDFS into
   HFiles and load them into HBase.

I just wanted to confirm whether these are the only two choices available
for loading the data. This seems a bit restrictive, given that such a
requirement is a very basic one in any system. If custom MapReduce is the
way to go, an example or working sample would be really helpful.
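
To make the second option concrete, here is a minimal sketch of only the per-row translation step, written in plain Java with no HBase dependency. It assumes a hypothetical layout in which the first CSV field is the row key and the next three fields go to the made-up columns cf1:name, cf2:email and cf3:city; the class name CsvRowMapper and its Cell type are also invented for illustration. It further assumes the Sqoop output has no quoted fields or embedded commas. In a real job, I would expect the mapper to turn each cell into a Put via addColumn(family, qualifier, value), and the driver to set up HFile output with HFileOutputFormat2.configureIncrementalLoad, but that wiring is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: maps one CSV line (as written by Sqoop) to HBase-style cells.
// The column layout is an assumption for illustration, not the real schema.
class CsvRowMapper {

    /** One cell: row key, column family, qualifier and value. */
    static final class Cell {
        final String rowKey;
        final String family;
        final String qualifier;
        final String value;

        Cell(String rowKey, String family, String qualifier, String value) {
            this.rowKey = rowKey;
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }

        @Override
        public String toString() {
            return rowKey + " " + family + ":" + qualifier + "=" + value;
        }
    }

    // Assumed layout: field 0 is the row key; fields 1..3 map to these columns in order.
    static final String[] COLUMNS = { "cf1:name", "cf2:email", "cf3:city" };

    static List<Cell> toCells(String csvLine) {
        // Naive split: assumes no quoted fields or embedded commas.
        // The -1 limit keeps trailing empty fields instead of dropping them.
        String[] fields = csvLine.split(",", -1);
        if (fields.length != COLUMNS.length + 1) {
            throw new IllegalArgumentException(
                "expected " + (COLUMNS.length + 1) + " fields, got " + fields.length);
        }
        List<Cell> cells = new ArrayList<>();
        for (int i = 0; i < COLUMNS.length; i++) {
            // Split "family:qualifier" into its two parts.
            String[] fq = COLUMNS[i].split(":", 2);
            cells.add(new Cell(fields[0], fq[0], fq[1], fields[i + 1]));
        }
        return cells;
    }

    public static void main(String[] args) {
        // Print one line per cell for a sample record.
        for (Cell c : toCells("42,Alice,alice@example.com,Pune")) {
            System.out.println(c);
        }
    }
}
```

Each input record yields one cell per column family here, so a single row ends up spread across all 3 families. Lines with the wrong number of fields are rejected rather than silently truncated.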
