hive-user mailing list archives

From "Brotanek, Jan" <Jan.Brota...@adastragrp.com>
Subject RE: tez session timesout?
Date Mon, 16 Jan 2017 11:25:34 GMT
It seems Tez is spawning many processes and using up all file descriptors, causing Unix to temporarily
run out of resources.

I suppose this may be the problem, but I don't know why it doesn't happen when the 2nd query is
invoked. It always fails on the 3rd query.

Are there any settings that can prevent this behaviour?

Any help is much appreciated!

-bash: fork: retry: Resource temporarily unavailable

-bash: ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 127808
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
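
For comparison with the shell output above, here is a minimal sketch that prints the same two limits as seen from inside a JVM. It assumes a Linux host where /proc/self/limits is readable; the class name LimitsCheck and the helper softLimit are hypothetical and not part of Hive or Tez. On Linux, threads count toward the "max user processes" limit, so a JVM with many threads can exhaust it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class LimitsCheck {
    /** Returns the soft-limit column for the named row, or null if the row is absent. */
    static String softLimit(List<String> lines, String name) {
        for (String line : lines) {
            if (line.startsWith(name)) {
                // The rest of the row holds: soft limit, hard limit, units.
                String[] fields = line.substring(name.length()).trim().split("\\s+");
                return fields[0];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        try {
            // Assumption: Linux only; this file does not exist on other systems.
            List<String> lines = Files.readAllLines(Paths.get("/proc/self/limits"));
            System.out.println("max user processes: " + softLimit(lines, "Max processes"));
            System.out.println("open files:         " + softLimit(lines, "Max open files"));
        } catch (IOException e) {
            System.err.println("Could not read /proc/self/limits: " + e);
        }
    }
}
```

Running it under the same user and shell that launches the hive CLI shows the limits the CLI process inherits.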


-----Original Message-----
From: Sergey Shelukhin [mailto:sergey@hortonworks.com] 
Sent: Thursday, January 12, 2017 18:12
To: user@hive.apache.org
Subject: Re: tez session timesout?

That should only happen when InetAddress.getLocalHost().getHostName()
throws UnknownHostException… do you have any other suspicious logs or activity around that
time?
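
The lookup mentioned above can be tried in isolation with a small standalone program. This is only a sketch: the class name LocalHostCheck and the helper localHostNameOrNull are hypothetical, and it does not reproduce Hive's own code path; it just makes the same InetAddress call and reports whether it throws.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class LocalHostCheck {
    /** Returns the local host name, or null if the lookup throws UnknownHostException. */
    static String localHostNameOrNull() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            System.err.println("Lookup failed: " + e);
            return null;
        }
    }

    public static void main(String[] args) {
        String name = localHostNameOrNull();
        if (name == null) {
            System.out.println("local host name could not be resolved");
        } else {
            System.out.println("local host name: " + name);
        }
    }
}
```

Running it on the machine that launches the hive CLI, ideally around the time the third statement fails, may show whether the lookup itself is failing at that moment.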

On 17/1/12, 07:54, "Brotanek, Jan" <Jan.Brotanek@adastragrp.com> wrote:

>Hello, I am running insert statements via the CLI under Hive on
>Tez on HDP 2.4.0:
>
>hive -hiveconf hive.cli.errors.ignore=true -v -f hive_pl_new7.sql
>
>hive_pl_new7.sql consists of a couple of insert into partition statements
>which take quite a long time, about 1200s each.
>
>    insert into table table partition (part_col = '2015-12')
>    select col1, col2
>    from table
>    where col2 >= '2015-12-01 00:00:00'
>    and col2 <= '2015-12-31 23:59:59';
>
>    insert into table table partition (part_col = '2016-01')
>    select col1, col2
>    from table
>    where col2 >= '2016-01-01 00:00:00'
>    and col2 <= '2016-01-31 23:59:59';
>
>    insert into table table partition (part_col = '2016-02')
>    select col1, col2
>    from table
>    where col2 >= '2016-02-01 00:00:00'
>    and col2 <= '2016-02-29 23:59:59';
>
>The first two statements run just fine. When the 3rd is launched, I get the
>following error. There are no syntax/semantic errors in the statements; I tested that.
>When using the MR execution engine, it runs just fine. This is a serious
>issue for running automated batch jobs. Can anyone explain?
>
>Versions:
>Hive 1.2.1000.2.4.0.0-169
>HDP: 2.4.0
>Hadoop 2.7.1
>Hcatalog: 1.2.1
>Hbase: 1.1.2
>
>    Exception in thread "main" java.lang.RuntimeException: Unable to determine our local host!
>       at org.apache.hadoop.hive.metastore.LockRequestBuilder.build(LockRequestBuilder.java:56)
>       at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:227)
>       at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:92)
>       at org.apache.hadoop.hive.ql.Driver.acquireLocksAndOpenTxn(Driver.java:1047)
>       at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1244)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1118)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1108)
>       at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:216)
>       at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:168)
>       at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:379)
>       at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:314)
>       at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:412)
>       at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:428)
>       at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:717)
>       at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
>       at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:624)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:497)
>       at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>       at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>
>
>-----Original Message-----
>From: Gopal Vijayaraghavan [mailto:gopal@hortonworks.com] On Behalf Of 
>Gopal Vijayaraghavan
>Sent: Thursday, January 12, 2017 0:20
>To: user@hive.apache.org
>Subject: Re: Vectorised Queries in Hive
>
>
>
>> I have also noticed that this execution mode is only applicable to
>>single-predicate searches. It does not work with multiple-predicate
>>searches. Can someone confirm this, please?
>
>Can you explain what you mean?
>
>Vectorization supports multiple & nested AND+OR predicates - with some 
>extra SIMD efficiencies in place for constants or repeated values.
>
>Cheers,
>Gopal
>
>
