Date: Mon, 1 Jun 2009 10:10:05 -0700
Message-ID: <69035570906011010h629bf1a2hbf9d0ac59ad2ceb8@mail.gmail.com>
Subject: Announcing Sqoop: Database import for Hadoop
From: Christophe Bisciglia
To: core-user@hadoop.apache.org, hive-user@hadoop.apache.org, pig-user@hadoop.apache.org, hbase-user, zookeeper-user@hadoop.apache.org

Hadoop Fans,

I'm happy to announce a new tool from the Cloudera team.
We often found our customers wanting to import data from RDBMSs so they could conduct deeper analysis. To facilitate this, we built a command line tool that allows you to extract data from any JDBC source and build database-specific extensions to increase performance (we ship with an improved MySQL extension that leverages mysqldump and look forward to developing additional extensions with the community). We affectionately refer to this tool as Sqoop: SQL to Hadoop.

Sqoop is available with the most recent update to Cloudera's Distribution for Hadoop (http://www.cloudera.com/hadoop) and has been contributed to Apache as well.

You can use Sqoop to dump tables or entire databases to Hadoop. By default, it uses DBInputFormat, generates all of the necessary Java classes to work with your records, and also allows you to import data directly into Hive.

You can get more details and see a video of Aaron Kimball's presentation at last month's Hadoop User Group meeting at Y!:
http://www.cloudera.com/blog/2009/06/01/introducing-sqoop/

Also, our upcoming intermediate training session in Washington DC will cover Sqoop usage in detail:
http://www.eventbrite.com/event/351945679

Cheers,
Christophe and the Cloudera Team

--
get hadoop: cloudera.com/hadoop
online training: cloudera.com/hadoop-training
blog: cloudera.com/blog
twitter: twitter.com/cloudera