hive-user mailing list archives

From Jörn Franke <>
Subject Re: Loading Sybase to hive using sqoop
Date Wed, 24 Aug 2016 21:21:55 GMT

Is your Sybase server ready to deliver a large amount of data (network, memory, CPU, parallel access, other resources)? This is usually the bottleneck when loading from a relational database, rather than Sqoop/MR or Spark.
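Assuming the source can keep up, parallelism is then tuned on the Sqoop side. A hedged sketch of such an import (host, database, table, and column names below are placeholders, not from the thread):

```shell
# Sketch of a parallel Sqoop import from Sybase IQ over JDBC (jConnect driver).
# All connection details, table and column names are illustrative placeholders.
sqoop import \
  --connect "jdbc:sybase:Tds:sybase-host:2638/mydb" \
  --driver com.sybase.jdbc4.jdbc.SybDriver \
  --username loader --password-file /user/loader/.pw \
  --table daily_facts \
  --split-by record_id \
  --num-mappers 8 \
  --fetch-size 10000 \
  --target-dir /staging/daily_facts
# --split-by should be an evenly distributed numeric column, or the mappers
# will get skewed ranges; raise --num-mappers only as far as Sybase can
# sustain that many concurrent queries.
```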
Then, make sure you have a recent Hive version and store the data in ORC or Parquet, compressed with Snappy, not in a text-based format.
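For example, a partitioned ORC table with Snappy compression can be declared like this (table and column names are illustrative, matching nothing in the thread):

```shell
# Illustrative DDL: a daily-partitioned Hive table stored as ORC with Snappy.
hive -e '
CREATE TABLE daily_facts (
  record_id BIGINT,
  payload   STRING
)
PARTITIONED BY (load_date STRING)
STORED AS ORC
TBLPROPERTIES ("orc.compress" = "SNAPPY");
'
```

A common pattern is to land the Sqoop import in a text staging table and then INSERT ... SELECT into the ORC table, which also handles the daily partitioning.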
Another alternative is to use one of the export tools supplied with Sybase, export to a compressed file, put the file on HDFS, and load it into Hive. This only makes sense if the Sybase export tool outperforms a JDBC connection (which can happen, depending on the relational database).
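A rough sketch of that route (file, directory, and table names are placeholders; the actual export step depends on which Sybase-supplied unload tool is used):

```shell
# 1. Export from Sybase IQ to a local file using a Sybase-supplied tool
#    (placeholder step; on IQ this is typically done with its unload facility).
# 2. Compress, copy to HDFS, and load into a Hive staging table.
#    Hive reads gzipped text files transparently.
gzip daily_facts.csv
hdfs dfs -mkdir -p /staging/daily_facts
hdfs dfs -put daily_facts.csv.gz /staging/daily_facts/
hive -e "LOAD DATA INPATH '/staging/daily_facts/daily_facts.csv.gz'
         INTO TABLE daily_facts_staging;"
```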

> On 23 Aug 2016, at 21:48, Rahul Channe <> wrote:
> Hi All,
> We are trying to load data from Sybase Iq table to hive using sqoop. The hive table is
partitioned and expecting to hold 29M records per day.
> The sqoop job takes 7 hours to load 15 days of data, even with the direct load
option set to 6. Hive is using the MR framework.
> Is there a way to speed up the process?
> Note - the aim is to load 1 year of data
