Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id A315D200B6F for ; Wed, 24 Aug 2016 23:09:05 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id A1A57160AB1; Wed, 24 Aug 2016 21:09:05 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 9897E160A91 for ; Wed, 24 Aug 2016 23:09:04 +0200 (CEST) Received: (qmail 65064 invoked by uid 500); 24 Aug 2016 21:09:03 -0000 Mailing-List: contact user-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hive.apache.org Delivered-To: mailing list user@hive.apache.org Received: (qmail 65054 invoked by uid 99); 24 Aug 2016 21:09:03 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 24 Aug 2016 21:09:03 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id C35A9C2F9D for ; Wed, 24 Aug 2016 21:09:02 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.198 X-Spam-Level: * X-Spam-Status: No, score=1.198 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd1-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx2-lw-us.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id NXgoJjchKaQL for ; Wed, 24 Aug 2016 21:09:00 +0000 (UTC) Received: from mail-qk0-f172.google.com (mail-qk0-f172.google.com [209.85.220.172]) by mx2-lw-us.apache.org (ASF Mail Server at mx2-lw-us.apache.org) with ESMTPS id 8D1705F3F3 for ; Wed, 24 Aug 2016 21:09:00 +0000 (UTC) Received: by mail-qk0-f172.google.com with SMTP id l2so28117414qkf.3 for ; Wed, 24 Aug 2016 14:09:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=qPqFyUfEJMbmCNmq6hzKQXCMpwkNGvdVoK79U2gk7tU=; b=H/ygm/P8fKLucnSir0n0f+lq4axpfZAXfyN2VLyue4WRGuR5d2XoKaSW8cRnPdIrsV 02YqMiggfXuPZz6WdAVU+/u6xDicQSo4tm4BJCbouM6A+mw9C1xHTl4ueZOmfcWzSQab IaSoezVJNSGlimFl50ig0S0oVKEhATmVdfRMGmWghGHtwvG/ifqnfK171TtAd364+kqc 2JRrViu5/wxMbfHe0kBAbVJbbFWReSQDawP4SmZMGq3LY1iJ3AykJFyo1YkRVnLkRXp8 K/F8VZa+hr7FiuNZa3E6IePYkZpeprCGYbIgSGtcemp1hPz0Kv2rW38nxeZckZEtaMxb Pu9Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=qPqFyUfEJMbmCNmq6hzKQXCMpwkNGvdVoK79U2gk7tU=; b=Dsln5jxB8FmKUkB2DRa6lrX3nFBu3su+KWAa733JT2LOB5sAA179xmsxnbFxvcA7YG rsKzAMKxTvaCEdxBlrYw1s06mZhdJh8VodvEIs5Nv/aXlw7hd7t51+BNVgOSnZvqSIxD 3eXovDFiakfMgXULYMiv5QTM6t3seV2WKkd7Ef4tgeWH+7MWDINzO2r7GMZqBb55VVx9 a/BTAVvnh4nECO+uvPNcMk6HyDgeE1bhbWuOBsG6BRg3DD/QoP2udLHPr5jxN7U1f+J5 U+RQuXgnYPWRi7V/QO3PIcdkS0ZotI7BDtYCEtGoYezyOgUJeBm8W3T4mXfSDuGNexd2 D6kw== X-Gm-Message-State: AE9vXwONKLgzl4M88Z72O+/jmv6RT5tcV3cQ5ybAaoSjNzItkZxbF9C7HqBMWlUQArJhqZFs27Cf4Cb/IY45pQ== X-Received: by 10.55.120.2 with SMTP id t2mr5867020qkc.62.1472072934303; Wed, 24 Aug 2016 14:08:54 -0700 (PDT) MIME-Version: 1.0 Received: by 10.55.67.136 with HTTP; Wed, 24 Aug 2016 14:08:53 -0700 (PDT) In-Reply-To: References: From: Mich Talebzadeh Date: Wed, 24 Aug 2016 22:08:53 +0100 Message-ID: Subject: Re: Loading Sybase to hive using sqoop To: user Content-Type: multipart/alternative; boundary=94eb2c05dd8c55302d053ad7b015 archived-at: Wed, 24 Aug 2016 21:09:05 -0000 --94eb2c05dd8c55302d053ad7b015 Content-Type: text/plain; charset=UTF-8 hm. Watching paint dry :) Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw * http://talebzadehmich.wordpress.com *Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction. On 24 August 2016 at 22:07, Rahul Channe wrote: > We are running hive on MR > > > On Wednesday, August 24, 2016, Mich Talebzadeh > wrote: > >> Sybase IQ uses jconn4.jar for ODBC connection. This is the way I use >> Spark to get IQ data into Hive table. You can specify partition in Sqoop as >> well. >> >> I started using Sqoop to populate Hive tables but decided to use Spark. >> >> Also are you running Hive on Map-reduce engine? >> >> private var dbURL = "jdbc:sybase:Tds:rhes564:21000/SYB_IQ" >> private var dbDatabase = null >> private var dbUserName ="loader" >> private var dbPassword = "xxxxxxxx" >> >> private var lowerBoundValue = "1" >> private var upperBoundValue = "100000000" >> private var numPartitionsValue = "100" // This is your partition >> number in Hive table >> // Get data from IQ table >> >> val d = HiveContext.read.format("jdbc").options( >> Map("url" -> dbURL, >> "dbtable" -> "dummy)", >> "partitionColumn" -> partitionColumnName, >> "lowerBound" -> lowerBoundValue, >> "upperBound" -> upperBoundValue, >> "numPartitions" -> numPartitionsValue, >> "user" -> dbUserName, >> "password" -> dbPassword)).load >> // Register it as a temp table >> d.registerTempTable("tmp") >> >> Insert into Hive table >> >> sqltext = """ >> INSERT INTO TABLE dummy >> SELECT >> ID >> , CLUSTERED >> , SCATTERED >> , RANDOMISED >> , RANDOM_STRING >> , SMALL_VC >> , PADDING >> FROM tmp >> """ >> HiveContext.sql(sqltext) >> >> >> HTH >> >> >> Dr Mich Talebzadeh >> >> >> >> LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw >> * >> >> >> >> http://talebzadehmich.wordpress.com >> >> >> *Disclaimer:* Use it at your own risk. Any and all responsibility for >> any loss, damage or destruction of data or any other property which may >> arise from relying on this email's technical content is explicitly >> disclaimed. The author will in no case be liable for any monetary damages >> arising from such loss, damage or destruction. >> >> >> >> On 23 August 2016 at 20:48, Rahul Channe wrote: >> >>> Hi All, >>> >>> We are trying to load data from Sybase Iq table to hive using sqoop. The >>> hive table is partitioned and expecting to hold 29M records per day. >>> >>> The sqoop job takes 7 hours to load 15 days of data, even while setting >>> the direct load option to 6. Hive is using MR framework. >>> >>> Is there is way to speed up the process. >>> >>> Note - the aim is to load 1 year of data >>> >> >> --94eb2c05dd8c55302d053ad7b015 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
hm. Watching paint dry :)
=
=

Dr Mich Talebzadeh

=C2=A0

LinkedIn =C2=A0https://www.linkedin.com/profile/view?id=3DAAEA= AAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

=C2=A0

http:= //talebzadehmich.wordpress.com


Disclaimer:=C2=A0Use = it=C2=A0at your own risk. Any and all responsibilit= y for any loss, damage or destruction of data or any other property which may arise from relying on this email= 9;s=C2=A0technical=C2=A0content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from = such loss, damage or destruction.

=C2=A0

<= font color=3D"#000000" face=3D"Times New Roman" size=3D"3">

On 24 August 2016 at 22:07, Rahul Channe <drahulc@googlemail.com> wrote:
We are running hive on MR


On Wednesday, August 24, 2016, Mich Talebzadeh <mich.talebzadeh@g= mail.com> wrote:
Sybase IQ= uses jconn4.jar for ODBC connection. This is the way I use Spark to get IQ= data into Hive table. You can specify partition in Sqoop as well.

I started using Sqoop to populate Hive tables but decided = to use Spark.

Also are you running Hive on Map-red= uce engine?

=C2=A0 private var dbURL =3D "jdbc:sybase:Tds:rhes564:2= 1000/SYB_IQ"
=C2=A0 private var dbDatabase =3D null
=C2=A0 = private var dbUserName =3D"loader"
=C2=A0 private var dbPasswo= rd =3D "xxxxxxxx"

=C2=A0 private var lowerBoundValue =3D "1"
=C2=A0= private var upperBoundValue =3D "100000000"
=C2=A0 private va= r numPartitionsValue =3D "100"=C2=A0 // This is your partition nu= mber in Hive table

=C2=A0 // Get data from IQ=C2=A0table

=C2=A0val d =3D HiveContext.read.format("jd= bc").options(
=C2=A0Map("url" -> dbURL,
=C2=A0= "dbtable" -> "dummy)",
=C2=A0"partitionColum= n" -> partitionColumnName,
=C2=A0"lowerBound" -> lo= werBoundValue,
=C2=A0"upperBound" -> upperBoundValue,
= =C2=A0"numPartitions" -> numPartitionsValue,
=C2=A0"us= er" -> dbUserName,
=C2=A0"password" -> dbPassword))= .load

// Regist= er it as a temp table =C2=A0
=C2=A0 d.registerTempTable("tmp")

I= nsert into Hive table

=C2=A0sqltext =3D """
=C2=A0 INS= ERT INTO TABLE dummy
=C2=A0 SELECT
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 ID
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 , CL= USTERED
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 , SCATTERED
=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 , RANDOMISED
=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 , RANDOM_STRING
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0 , SMALL_VC
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 , PADDIN= G
=C2=A0 FROM tmp
=C2=A0 """
=C2=A0=C2=A0 HiveConte= xt.sql(sqltext)


HTH


Dr Mich Talebzadeh

=C2=A0

LinkedIn =C2=A0https://www.linkedin.com/profile/view?id= =3DAAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

=C2=A0

http:= //talebzadehmich.wordpress.com


Disclaimer:=C2=A0= Use it=C2=A0at your own risk. Any and all responsib= ility for any loss, damage or destruction of data or any other property which may arise from relying on this email= 9;s=C2=A0technical=C2=A0content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from = such loss, damage or destruction.

=C2=A0

<= font color=3D"#000000" face=3D"Times New Roman" size=3D"3">

On 23 August 2016 at 20:48, Rahul Channe <drahulc@googlemail.com> wrote:
Hi All,

We are trying to load data from Syba= se Iq table to hive using sqoop. The hive table is partitioned and expectin= g to hold 29M records per day.

The sqoop job takes 7 ho= urs=C2=A0to load 15=C2=A0days=C2=A0of data, even while setting the direct l= oad option to 6. Hive is using MR framework.

Is there is way = to speed up the process.

Note - the aim is to load 1 yea= r of data


--94eb2c05dd8c55302d053ad7b015--