hive-user mailing list archives

From Jörn Franke <jornfra...@gmail.com>
Subject Re: Why is a single INSERT very slow in Hive?
Date Mon, 11 Sep 2017 19:18:40 GMT
Why do you want to do single inserts?
Hive is designed more for bulk loads.
In any case, newer versions of Hive 2 using Tez + LLAP improve this significantly (also for bulk
analysis). Nevertheless, it is good practice not to use single inserts in an analysis system,
but to combine them and bulk-load.
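
To illustrate what combining inserts could look like, here is a minimal HiveQL sketch against the people table from the quoted session. The extra rows, the people_staging table and the /tmp/people.csv path are made up for illustration, and the engine setting assumes Tez is installed on the cluster. Multi-row VALUES needs Hive 0.14 or later.

```sql
-- Optional: switch the session to Tez (assumes Tez is available on the cluster).
SET hive.execution.engine=tez;

-- One statement, one job: several rows are written by a single INSERT
-- instead of launching a separate job per row. The rows are hypothetical.
INSERT INTO TABLE people VALUES
  (1, 'Tom A', 20),
  (2, 'Ann B', 31),
  (3, 'Joe C', 45);

-- For larger batches, stage a delimited file and load it in one step.
-- people_staging and /tmp/people.csv are placeholder names.
CREATE TABLE people_staging (id INT, name STRING, age INT)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
LOAD DATA LOCAL INPATH '/tmp/people.csv' INTO TABLE people_staging;
INSERT INTO TABLE people SELECT id, name, age FROM people_staging;
```

The staging table is used because people was created with Hive's default field delimiter, so a comma-separated file should not be loaded into it directly. Either way the fixed job start-up cost is paid once per batch rather than once per row.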

> On 11. Sep 2017, at 21:01, Jinhui Qin <qin.jinhui@gmail.com> wrote:
> 
>  
> 
> Hi, 
> I am new to Hive. I just created a simple table in Hive and inserted two records. The
> first insertion took 16.4 sec, while the second took 14.3 sec. Why is it so slow? Is this
> the normal performance you get in Hive using INSERT? Is there a way to improve the performance
> of a single "insert" in Hive? Any help would be really appreciated. Thanks!
> 
> Here is the record from a terminal in Hive shell:
> 
> =========================
> 
> hive> show tables;
> OK
> Time taken: 2.758 seconds
> hive> create table people(id int, name string, age int);
> OK
> Time taken: 0.283 seconds
> hive> insert into table people values(1,'Tom A', 20);
> Query ID = hive_20170911134052_04680c79-432a-43e0-827b-29a4212fbbc0
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_1505146047428_0098, Tracking URL = http://iop-hadoop-bi.novalocal:8088/proxy/application_1505146047428_0098/
> Kill Command = /usr/iop/4.1.0.0/hadoop/bin/hadoop job  -kill job_1505146047428_0098
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
> 2017-09-11 13:41:01,492 Stage-1 map = 0%,  reduce = 0%
> 2017-09-11 13:41:06,940 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.7 sec
> MapReduce Total cumulative CPU time: 2 seconds 700 msec
> Ended Job = job_1505146047428_0098
> Stage-4 is selected by condition resolver.
> Stage-3 is filtered out by condition resolver.
> Stage-5 is filtered out by condition resolver.
> Moving data to: hdfs://iop-hadoop-bi.novalocal:8020/apps/hive/warehouse/people/.hive-staging_hive_2017-09-11_13-40-52_106_4621567581104615441-1/-ext-10000
> Loading data to table default.people
> Table default.people stats: [numFiles=1, numRows=1, totalSize=11, rawDataSize=10]
> MapReduce Jobs Launched: 
> Stage-Stage-1: Map: 1   Cumulative CPU: 2.7 sec   HDFS Read: 3836 HDFS Write: 81 SUCCESS
> Total MapReduce CPU Time Spent: 2 seconds 700 msec
> OK
> Time taken: 16.417 seconds
> hive> insert into table people values(1,'Tom A', 20);
> Query ID = hive_20170911134128_c8f46977-7718-4496-9a98-cce0f89ced79
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_1505146047428_0099, Tracking URL = http://iop-hadoop-bi.novalocal:8088/proxy/application_1505146047428_0099/
> Kill Command = /usr/iop/4.1.0.0/hadoop/bin/hadoop job  -kill job_1505146047428_0099
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
> 2017-09-11 13:41:36,289 Stage-1 map = 0%,  reduce = 0%
> 2017-09-11 13:41:40,721 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.28 sec
> MapReduce Total cumulative CPU time: 2 seconds 280 msec
> Ended Job = job_1505146047428_0099
> Stage-4 is selected by condition resolver.
> Stage-3 is filtered out by condition resolver.
> Stage-5 is filtered out by condition resolver.
> Moving data to: hdfs://iop-hadoop-bi.novalocal:8020/apps/hive/warehouse/people/.hive-staging_hive_2017-09-11_13-41-28_757_4458472522071240567-1/-ext-10000
> Loading data to table default.people
> Table default.people stats: [numFiles=2, numRows=2, totalSize=22, rawDataSize=20]
> MapReduce Jobs Launched: 
> Stage-Stage-1: Map: 1   Cumulative CPU: 2.28 sec   HDFS Read: 3924 HDFS Write: 81 SUCCESS
> Total MapReduce CPU Time Spent: 2 seconds 280 msec
> OK
> Time taken: 14.288 seconds
> hive> exit;
> =================
> 
>  
> Jinhui
> 
> 
