hive-user mailing list archives

From Kristopher Kane <kkane.l...@gmail.com>
Subject Re: Hive Join Running Out of Memory
Date Mon, 21 Jul 2014 01:19:49 GMT
Clay,

Keep in mind that setting this to false in the global hive-site.xml means Hive
will never build the client-side hash table, so you will miss out on map-join
optimizations for other queries.  You should set this in your query directly
instead.  Another option is to increase the client-side heap to allow for
larger in-memory tables.
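For example, a per-session sketch of both toggles (the exact property names vary by Hive version, so verify them against yours):

```sql
-- Disable only the unconditional merge of map joins for this session,
-- leaving other map-join optimizations available to other queries:
SET hive.auto.convert.join.noconditionaltask=false;

-- Or turn off automatic map-join conversion entirely for this session:
SET hive.auto.convert.join=false;

-- The join then runs as an ordinary shuffle join on the cluster:
SELECT B.CARD_NBR AS CNT
FROM TENDER_TABLE A
JOIN LOYALTY_CARDS B
  ON A.CARD_NBR = B.CARD_NBR
LIMIT 10;
```

As for the heap: the local hash-table task runs inside the Hive client JVM, so its heap has to be raised on the client (e.g. via HADOOP_HEAPSIZE before starting the CLI). mapred.child.java.opts only governs cluster-side task JVMs, which is likely why setting it did not change the reported maximum memory.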




On Fri, Jul 18, 2014 at 11:12 AM, Clay McDonald <
stuart.mcdonald@bateswhite.com> wrote:

> I changed the hive.auto.convert.join.noconditionaltask = false in the hive
> site and that seemed to do the trick. Thanks!
>
>
> From: Edward Capriolo [mailto:edlinuxguru@gmail.com]
> Sent: Friday, July 18, 2014 10:57 AM
> To: user@hive.apache.org
> Subject: Re: Hive Join Running Out of Memory
>
> I believe that would be the one.
>
> On Fri, Jul 18, 2014 at 10:54 AM, Clay McDonald <
> stuart.mcdonald@bateswhite.com> wrote:
> Thank you. Would it be acceptable to use the following?
>
> SET hive.exec.mode.local.auto=false;
>
>
> From: Edward Capriolo [mailto:edlinuxguru@gmail.com]
> Sent: Friday, July 18, 2014 10:45 AM
> To: user@hive.apache.org
> Subject: Re: Hive Join Running Out of Memory
>
> This is a failed optimization: Hive is trying to build the lookup table
> locally, put it in the distributed cache, and then do a map join.
> Look through your hive site for the configuration to turn these auto map
> joins off. The variable names have been changed/deprecated across versions,
> so I can not tell you the exact ones without knowing yours.
>
> On Fri, Jul 18, 2014 at 10:35 AM, Clay McDonald <
> stuart.mcdonald@bateswhite.com> wrote:
> Hello everyone. I need some assistance. I have a join that fails with
>  return code 3. The query is:
>
> SELECT B.CARD_NBR AS CNT
> FROM TENDER_TABLE A
> JOIN  LOYALTY_CARDS B
> ON A.CARD_NBR = B.CARD_NBR
> LIMIT 10;
>
> -- Row Counts
> -- LOYALTY_CARDS =   43,876,938
> -- TENDER_TABLE = 1,412,228,333
>
> The query execution output starts with;
>
> 2014-07-18 10:30:17     Starting to launch local task to process map join;
>      maximum memory = 1065484288
>
> The last output is as follows;
>
> 2014-07-18 10:30:44     Processing rows:        3800000 Hashtable size:
> 3799999 Memory usage:   969531248       percentage:     0.91
>
> I ran SET mapred.child.java.opts=-Xmx4G; before the query, but that did not
> change the maximum memory. What am I not understanding, and how should I
> troubleshoot this issue?
>
>
> hive> SELECT B.CARD_NBR AS CNT
>     > FROM TENDER_TABLE A
>     > JOIN  LOYALTY_CARDS B
>     > ON A.CARD_NBR = B.CARD_NBR
>     > LIMIT 10;
> Query ID = root_20140718103030_df1e7af9-7d66-4ba5-8d73-2d0bf58bb474
> Total jobs = 1
> 14/07/18 10:30:17 WARN conf.Configuration:
> file:/tmp/root/hive_2014-07-18_10-30-15_081_1503496466695602651-1/-local-10006/jobconf.xml:an
> attempt to override final parameter:
> mapreduce.job.end-notification.max.retry.interval;  Ignoring.
> 14/07/18 10:30:17 WARN conf.Configuration:
> file:/tmp/root/hive_2014-07-18_10-30-15_081_1503496466695602651-1/-local-10006/jobconf.xml:an
> attempt to override final parameter:
> mapreduce.job.end-notification.max.attempts;  Ignoring.
> 14/07/18 10:30:17 INFO Configuration.deprecation: mapred.reduce.tasks is
> deprecated. Instead, use mapreduce.job.reduces
> 14/07/18 10:30:17 INFO Configuration.deprecation: mapred.min.split.size is
> deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
> 14/07/18 10:30:17 INFO Configuration.deprecation:
> mapred.reduce.tasks.speculative.execution is deprecated. Instead, use
> mapreduce.reduce.speculative
> 14/07/18 10:30:17 INFO Configuration.deprecation:
> mapred.min.split.size.per.node is deprecated. Instead, use
> mapreduce.input.fileinputformat.split.minsize.per.node
> 14/07/18 10:30:17 INFO Configuration.deprecation:
> mapred.input.dir.recursive is deprecated. Instead, use
> mapreduce.input.fileinputformat.input.dir.recursive
> 14/07/18 10:30:17 INFO Configuration.deprecation:
> mapred.min.split.size.per.rack is deprecated. Instead, use
> mapreduce.input.fileinputformat.split.minsize.per.rack
> 14/07/18 10:30:17 INFO Configuration.deprecation: mapred.max.split.size is
> deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
> 14/07/18 10:30:17 INFO Configuration.deprecation:
> mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use
> mapreduce.job.committer.setup.cleanup.needed
> Execution log at:
> /tmp/root/root_20140718103030_df1e7af9-7d66-4ba5-8d73-2d0bf58bb474.log
> 2014-07-18 10:30:17     Starting to launch local task to process map join;
>      maximum memory = 1065484288
> 2014-07-18 10:30:20     Processing rows:        200000  Hashtable size:
> 199999  Memory usage:   53829960        percentage:     0.051
> 2014-07-18 10:30:21     Processing rows:        300000  Hashtable size:
> 299999  Memory usage:   76926312        percentage:     0.072
> 2014-07-18 10:30:22     Processing rows:        400000  Hashtable size:
> 399999  Memory usage:   105119456       percentage:     0.099
> 2014-07-18 10:30:23     Processing rows:        500000  Hashtable size:
> 499999  Memory usage:   129079592       percentage:     0.121
> 2014-07-18 10:30:24     Processing rows:        600000  Hashtable size:
> 599999  Memory usage:   151469744       percentage:     0.142
> 2014-07-18 10:30:24     Processing rows:        700000  Hashtable size:
> 699999  Memory usage:   174968512       percentage:     0.164
> 2014-07-18 10:30:25     Processing rows:        800000  Hashtable size:
> 799999  Memory usage:   207735176       percentage:     0.195
> 2014-07-18 10:30:25     Processing rows:        900000  Hashtable size:
> 899999  Memory usage:   232306976       percentage:     0.218
> 2014-07-18 10:30:26     Processing rows:        1000000 Hashtable size:
> 999999  Memory usage:   255813784       percentage:     0.24
> 2014-07-18 10:30:27     Processing rows:        1100000 Hashtable size:
> 1099999 Memory usage:   280781448       percentage:     0.264
> 2014-07-18 10:30:27     Processing rows:        1200000 Hashtable size:
> 1199999 Memory usage:   305606024       percentage:     0.287
> 2014-07-18 10:30:28     Processing rows:        1300000 Hashtable size:
> 1299999 Memory usage:   323502504       percentage:     0.304
> 2014-07-18 10:30:28     Processing rows:        1400000 Hashtable size:
> 1399999 Memory usage:   347450792       percentage:     0.326
> 2014-07-18 10:30:29     Processing rows:        1500000 Hashtable size:
> 1499999 Memory usage:   372281800       percentage:     0.349
> 2014-07-18 10:30:30     Processing rows:        1600000 Hashtable size:
> 1599999 Memory usage:   413191040       percentage:     0.388
> 2014-07-18 10:30:30     Processing rows:        1700000 Hashtable size:
> 1699999 Memory usage:   438363432       percentage:     0.411
> 2014-07-18 10:30:31     Processing rows:        1800000 Hashtable size:
> 1799999 Memory usage:   462137696       percentage:     0.434
> 2014-07-18 10:30:32     Processing rows:        1900000 Hashtable size:
> 1899999 Memory usage:   486055520       percentage:     0.456
> 2014-07-18 10:30:32     Processing rows:        2000000 Hashtable size:
> 1999999 Memory usage:   509470528       percentage:     0.478
> 2014-07-18 10:30:33     Processing rows:        2100000 Hashtable size:
> 2099999 Memory usage:   532719808       percentage:     0.50
> 2014-07-18 10:30:34     Processing rows:        2200000 Hashtable size:
> 2199999 Memory usage:   559536600       percentage:     0.525
> 2014-07-18 10:30:34     Processing rows:        2300000 Hashtable size:
> 2299999 Memory usage:   582320848       percentage:     0.547
> 2014-07-18 10:30:34     Processing rows:        2400000 Hashtable size:
> 2399999 Memory usage:   605378000       percentage:     0.568
> 2014-07-18 10:30:34     Processing rows:        2500000 Hashtable size:
> 2499999 Memory usage:   631760096       percentage:     0.593
> 2014-07-18 10:30:37     Processing rows:        2600000 Hashtable size:
> 2599999 Memory usage:   644527288       percentage:     0.605
> 2014-07-18 10:30:37     Processing rows:        2700000 Hashtable size:
> 2699999 Memory usage:   670778416       percentage:     0.63
> 2014-07-18 10:30:37     Processing rows:        2800000 Hashtable size:
> 2799999 Memory usage:   692955384       percentage:     0.65
> 2014-07-18 10:30:37     Processing rows:        2900000 Hashtable size:
> 2899999 Memory usage:   719573912       percentage:     0.675
> 2014-07-18 10:30:37     Processing rows:        3000000 Hashtable size:
> 2999999 Memory usage:   741345744       percentage:     0.696
> 2014-07-18 10:30:37     Processing rows:        3100000 Hashtable size:
> 3099999 Memory usage:   768150432       percentage:     0.721
> 2014-07-18 10:30:40     Processing rows:        3200000 Hashtable size:
> 3199999 Memory usage:   821525128       percentage:     0.771
> 2014-07-18 10:30:40     Processing rows:        3300000 Hashtable size:
> 3299999 Memory usage:   848349472       percentage:     0.796
> 2014-07-18 10:30:41     Processing rows:        3400000 Hashtable size:
> 3399999 Memory usage:   875173824       percentage:     0.821
> 2014-07-18 10:30:41     Processing rows:        3500000 Hashtable size:
> 3499999 Memory usage:   895656992       percentage:     0.841
> 2014-07-18 10:30:41     Processing rows:        3600000 Hashtable size:
> 3599999 Memory usage:   922231528       percentage:     0.866
> 2014-07-18 10:30:44     Processing rows:        3700000 Hashtable size:
> 3699999 Memory usage:   943112616       percentage:     0.885
> 2014-07-18 10:30:44     Processing rows:        3800000 Hashtable size:
> 3799999 Memory usage:   969531248       percentage:     0.91
> Execution failed with exit status: 3
> Obtaining error information
>
> Task failed!
> Task ID:
>   Stage-4
>
> Logs:
>
> /tmp/root/hive.log
> FAILED: Execution Error, return code 3 from
> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
>
> Thanks, Clay
>
>
