hive-user mailing list archives

From: Christophe Bisciglia
Subject: Announcing Sqoop: Database import for Hadoop
Date: Mon, 01 Jun 2009 17:10:05 GMT

Hadoop Fans,

I'm happy to announce a new tool from the Cloudera team.

We often found that our customers wanted to import data from RDBMSs so
they could conduct deeper analysis. To facilitate this, we built a
command-line tool that lets you extract data from any JDBC source
and build database-specific extensions to increase performance (we
ship with an improved MySQL extension that leverages mysqldump and
look forward to developing additional extensions with the community).

We affectionately refer to this tool as Sqoop: SQL to Hadoop. Sqoop is
available with the most recent update to Cloudera's Distribution for
Hadoop and has been contributed to Apache as well.

You can use Sqoop to dump tables or entire databases to Hadoop. By
default, it uses DBInputFormat, generates all of the necessary Java
classes to work with your records, and also allows you to import data
directly into Hive.
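
To make the idea of dumping a table or a whole database concrete, here is a
minimal conceptual sketch. It is not Sqoop's implementation and does not use
Hadoop or DBInputFormat: it assumes an in-memory SQLite database as a stand-in
for a JDBC source, and the function names (list_tables, dump_table,
dump_database) and the employees table are hypothetical, chosen only for
illustration. It writes each table out as delimited text, roughly the kind of
record-per-line output a table import could produce.

```python
import csv
import io
import sqlite3


def list_tables(conn):
    """Return the names of the user tables in a SQLite database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    )
    return [name for (name,) in rows]


def dump_table(conn, table, out):
    """Write every row of `table` to `out` as comma-delimited text."""
    # Table names cannot be bound as SQL parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    cursor = conn.execute(f"SELECT * FROM {quoted}")
    writer = csv.writer(out, lineterminator="\n")
    count = 0
    for row in cursor:
        writer.writerow(row)
        count += 1
    return count


def dump_database(conn):
    """Dump each table to its own text buffer, keyed by table name."""
    dumps = {}
    for table in list_tables(conn):
        buf = io.StringIO()
        dump_table(conn, table, buf)
        dumps[table] = buf.getvalue()
    return dumps


# Hypothetical sample data standing in for a real source database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)", [(1, "alice"), (2, "bob")]
)

dumps = dump_database(conn)
print(dumps["employees"])
```

A real import would write to files in HDFS rather than to in-memory buffers,
and would split the work across parallel tasks; the sketch only shows the
table-to-delimited-text step on a single machine.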

You can get more details and see a video of Aaron Kimball's
presentation at last month's Hadoop User Group meeting at Y!:

Also, our upcoming intermediate training session in Washington DC will
cover Sqoop usage in detail:

Christophe and the Cloudera Team

