From: "Aaron Kimball (JIRA)"
To: core-dev@hadoop.apache.org
Date: Tue, 26 May 2009 10:39:45 -0700 (PDT)
Subject: [jira] Updated: (HADOOP-5844) Use mysqldump when connecting to local mysql instance in Sqoop

     [ https://issues.apache.org/jira/browse/HADOOP-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Kimball updated HADOOP-5844:
----------------------------------

    Status: Patch Available  (was: Open)

Cycling the patch status now that 5815 is in, so that this actually gets tested.

> Use mysqldump when connecting to local mysql instance in Sqoop
> --------------------------------------------------------------
>
>                 Key: HADOOP-5844
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5844
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: Aaron Kimball
>            Assignee: Aaron Kimball
>         Attachments: mysqldump.patch
>
>
> Sqoop uses MapReduce + DBInputFormat to read the contents of a table into HDFS. On many databases, this implementation is O(N^2) in the number of rows. Also, using multiple mappers adds little throughput, because the database itself is inherently single-threaded. While DBInputFormat/JDBC provides a useful fallback mechanism for importing from databases, db-specific dump utilities will nearly always provide faster throughput, and should be selected when available. This patch allows users to use mysqldump to read from local mysql instances instead of the MapReduce-based input.
> If you provide sqoop with arguments of the form "--connect jdbc:mysql://localhost/somedatabase --local", it will use the mysqldump fast path to perform the import.
> This patch, naturally, requires that MySQL be installed on the machine in order to test it. Thus the test that this adds is called LocalMySQLTest (instead of the Hadoop-preferred file naming, TestLocalMySQL) so that Hudson doesn't automatically run it. You can run this test yourself by using "ant -Dtestcase=LocalMySQLTest test". See the notes in the javadoc for the LocalMySQLTest class on how to set up the MySQL test environment for this.
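
For illustration only, the selection rule described in the issue (a MySQL connect string plus --local picks the mysqldump fast path, anything else falls back to the MapReduce-based import) might be sketched as follows. This is not code from mysqldump.patch: the function names, the returned labels, and the check that the host is localhost or 127.0.0.1 are all assumptions made for the example.

```python
from urllib.parse import urlparse

# Assumption: hosts treated as "local". The patch may use a different test.
LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def parse_jdbc_url(connect):
    """Split a JDBC connect string into (subprotocol, host, database)."""
    prefix = "jdbc:"
    if not connect.startswith(prefix):
        raise ValueError("not a JDBC connect string: %r" % connect)
    # Dropping the "jdbc:" prefix leaves an ordinary URL such as
    # mysql://localhost/somedatabase, which urlparse can handle.
    parsed = urlparse(connect[len(prefix):])
    return parsed.scheme, parsed.hostname, parsed.path.lstrip("/")


def choose_import_path(connect, local):
    """Return a hypothetical label for the import path that would be used.

    "mysqldump" stands for the fast path and "mapreduce" for the
    DBInputFormat/JDBC fallback described in the issue.
    """
    subprotocol, host, _database = parse_jdbc_url(connect)
    if local and subprotocol == "mysql" and host in LOCAL_HOSTS:
        return "mysqldump"
    return "mapreduce"


if __name__ == "__main__":
    print(choose_import_path("jdbc:mysql://localhost/somedatabase", local=True))
    print(choose_import_path("jdbc:mysql://localhost/somedatabase", local=False))
```

Keeping the JDBC path as the default branch mirrors the issue's description of DBInputFormat/JDBC as the fallback: the fast path is taken only when every condition for it holds.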