Date: Sat, 3 Nov 2012 05:26:12 +0000 (UTC)
From: "Abhijeet Gaikwad (JIRA)"
To: dev@sqoop.apache.org
Reply-To: dev@sqoop.apache.org
Message-ID: <432107365.64286.1351920373026.JavaMail.jiratomcat@arcas>
In-Reply-To: <2038318197.85563.1347811687765.JavaMail.jiratomcat@arcas>
Subject: [jira] [Updated] (SQOOP-604) Easy throttling feature for MySQL exports

     [ https://issues.apache.org/jira/browse/SQOOP-604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhijeet Gaikwad updated SQOOP-604:
-----------------------------------

    Affects Version/s:     (was: 1.4.3)
                       1.4.2

> Easy throttling feature for MySQL exports
> -----------------------------------------
>
>                 Key: SQOOP-604
>                 URL: https://issues.apache.org/jira/browse/SQOOP-604
>             Project: Sqoop
>          Issue Type: Improvement
>          Components: connectors/mysql
>    Affects Versions: 1.4.2
>            Reporter: Zoltan Toth-Czifra
>            Priority: Minor
>             Fix For: 1.4.3
>
>         Attachments: SQOOP-604_v6.patch
>
>
> Sqoop always tries to achieve the best possible throughput with exports, which might not be desirable in all cases. Sometimes we need to export large amounts of data with Sqoop to a live relational database (MySQL in our case), that is, a database that is under high load serving random queries from the users of our product.
> While data consistency issues during the export can easily be solved with a staging table, one problem remains: the performance impact of the heavy export.
> First, the MySQL resources dedicated to the import process can affect the performance of the live product, both on the master and on the slaves. Second, even if the servers can handle the import with no significant performance impact (mysqlimport should be relatively "cheap"), importing big tables (GB+) can cause serious replication lag in the cluster, risking data consistency.
> My suggestion is quite simple. The MySQL export already has a "checkpoint" feature (the export process is restarted every X bytes written). It could be extended with a new config value that simply makes the thread sleep for a configurable number of milliseconds at each checkpoint. With a low enough byte count limit, this can be a simple yet powerful throttling mechanism.
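
The proposal in the issue description (pause for a fixed time each time the byte-count checkpoint is reached) can be sketched roughly as follows. This is only an illustration in Python, not Sqoop's Java code and not the attached patch: the class `ThrottledWriter` and the parameters `checkpoint_bytes` and `sleep_ms` are hypothetical names chosen for this example, and the real checkpoint also restarts the export process, which is omitted here.

```python
import time


class ThrottledWriter:
    """Hypothetical stand-in for an export writer with byte-count checkpoints.

    Not Sqoop's actual class; it only illustrates "sleep at each checkpoint".
    """

    def __init__(self, sink, checkpoint_bytes, sleep_ms, sleep=time.sleep):
        self.sink = sink                          # callable that receives each record
        self.checkpoint_bytes = checkpoint_bytes  # assumed name: bytes between checkpoints
        self.sleep_ms = sleep_ms                  # assumed name: pause length per checkpoint
        self._sleep = sleep                       # injectable so tests need not really sleep
        self._bytes_since_checkpoint = 0
        self.checkpoints = 0

    def write(self, record: bytes) -> None:
        """Forward one record, then checkpoint if the byte limit was reached."""
        self.sink(record)
        self._bytes_since_checkpoint += len(record)
        if 0 < self.checkpoint_bytes <= self._bytes_since_checkpoint:
            self._checkpoint()

    def _checkpoint(self) -> None:
        """Reset the byte counter and pause, giving the database time to catch up."""
        self.checkpoints += 1
        self._bytes_since_checkpoint = 0
        if self.sleep_ms > 0:
            self._sleep(self.sleep_ms / 1000.0)
```

Under these assumptions, average throughput is bounded by roughly `checkpoint_bytes` per (write time + `sleep_ms`), so a smaller byte limit or a longer sleep both slow the export down; a sleep of 0 leaves the existing behaviour unchanged.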