hive-user mailing list archives

From Matouk IFTISSEN <matouk.iftis...@ysance.com>
Subject Re: Hive 0.13 map 100% reduce 100% and the reduce decreases to 75% (in join or lag function)
Date Wed, 25 Jun 2014 22:48:39 GMT
No task failed in the log. I suspect a skewed join problem (a skewed table used
with the lag function).
How can I avoid this (skewed data)?
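
For reference, the skew-related settings I could try first are below (my own
guess: hive.optimize.skewjoin targets skewed join keys, and I am not sure it
also covers the window/lag case; the threshold value is my choice):

set hive.optimize.skewjoin=true;
-- rows per join key above which the key is treated as skewed
set hive.skewjoin.key=100000;
-- two-phase aggregation that tolerates skewed group-by keys
set hive.groupby.skewindata=true;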
On 26 June 2014 at 00:40, "Stéphane Verlet" <kaweahsolutions@gmail.com> wrote:

> If the reduce percentage is decreasing, it probably means a task failed.
> It typically retries and the percentage goes back up.
>
>
> On Mon, Jun 23, 2014 at 3:15 AM, Matouk IFTISSEN <
> matouk.iftissen@ysance.com> wrote:
>
>>
>> My HDFS space is large, so I don't think that is the cause of the problem.
>> I will try increasing the Java heap memory in the hive-env.sh file.
>>
>> 2014-06-23 11:06 GMT+02:00 Nagarjuna Vissarapu <nagarjuna.viss@gmail.com>:
>>
>>> Can you please check your HDFS space first? If it is fine, please increase
>>> the Java heap memory in the hive-env.sh file.
>>>
>>>
>>> On Mon, Jun 23, 2014 at 2:00 AM, Dima Machlin <Dima.Machlin@pursway.com>
>>> wrote:
>>>
>>>>  I don't see how this is "the same" as, or even remotely related to, my issue.
>>>>
>>>> It would be better to send it as a separate mail with a different and
>>>> informative subject.
>>>>
>>>>
>>>>
>>>> *From:* Matouk IFTISSEN [mailto:matouk.iftissen@ysance.com]
>>>> *Sent:* Monday, June 23, 2014 11:49 AM
>>>> *To:* user@hive.apache.org
>>>> *Subject:* Re: Hive 0.12 Mapjoin and MapJoinMemoryExhaustionException
>>>>
>>>>
>>>>
>>>> Hello,
>>>>
>>>> I have the same problem, but it shows up differently:
>>>>
>>>> the map reaches 100%,
>>>>
>>>> the reduce reaches 100%, and then the reduce drops back to 75%!
>>>>
>>>> I use a lag function in Hive; the source table (my_first_table) has 15
>>>> million rows:
>>>>
>>>> INSERT INTO TABLE my_table
>>>> select *,
>>>>        case when nouvelle_tache = '1' then 'pas de rejeu'
>>>>             else if(lag(opg_id,1) OVER (PARTITION BY opg_par_id order by date_execution) is null,
>>>>                     opg_id,
>>>>                     lag(opg_id,1) OVER (PARTITION BY opg_par_id order by date_execution))
>>>>        end opg_par_id_1,
>>>>        others_columns
>>>> from my_first_table
>>>> -- (this WHERE clause is only there to limit the number of rows; I thought it
>>>> -- was a memory problem, but it is not, because I have a lot of free memory)
>>>> where column5 > '37123T0104-10510' and column5 <= '69191R0025-10162'
>>>> order by column5;
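>>>>
>>>> A side note on the expression itself: I believe the if(... is null, opg_id,
>>>> ...) branch can be written with coalesce(), which avoids spelling out the
>>>> lag() window twice. A sketch, assuming identical semantics:
>>>>
>>>> case when nouvelle_tache = '1' then 'pas de rejeu'
>>>>      else coalesce(lag(opg_id,1) OVER (PARTITION BY opg_par_id order by date_execution),
>>>>                    opg_id)
>>>> end opg_par_id_1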
>>>>
>>>>
>>>>
>>>> There is no error in the log. Please help, what is wrong?
>>>>
>>>>  Here is the detail from the tracker (full log):
>>>>
>>>>  Regards
>>>>
>>>>
>>>>
>>>> 2014-06-23 10:18 GMT+02:00 Dima Machlin <Dima.Machlin@pursway.com>:
>>>>
>>>>  Hello,
>>>>
>>>> We are running Hive 0.12 and using the hive.auto.convert.join feature with:
>>>>
>>>> hive.auto.convert.join.noconditionaltask.size = 50000000
>>>>
>>>> hive.mapjoin.followby.gby.localtask.max.memory.usage = 0.7
>>>>
>>>>
>>>>
>>>> The query is a map join followed by a group by, like so:
>>>>
>>>>
>>>>
>>>> select id, x, max(y)
>>>> from (
>>>>     select t1.id, t1.x, t2.y from tbl1 t1 join tbl2 t2 on (t1.id = t2.id)
>>>> ) z
>>>> group by id, x;
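>>>>
>>>> (To confirm the conversion to a map join actually happens, the plan can be
>>>> inspected with EXPLAIN; a sketch:
>>>>
>>>> EXPLAIN
>>>> select id, x, max(y)
>>>> from (select t1.id, t1.x, t2.y from tbl1 t1 join tbl2 t2 on (t1.id = t2.id)) z
>>>> group by id, x;
>>>>
>>>> A converted plan shows a Map Join Operator fed by a local work stage, rather
>>>> than a Join Operator running in the reducers.)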
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>  While executing a join against a table that has ~3M rows, we fail with:
>>>>
>>>>
>>>>
>>>> org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException:
>>>> 2014-06-10 04:42:21    Processing rows: 2500000  Hashtable size: 2499999  Memory usage: 704765184  percentage: 0.701
>>>>
>>>>         at org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionHandler.checkMemoryStatus(MapJoinMemoryExhaustionHandler.java:91)
>>>>
>>>>
>>>>
>>>> This is understood, as we pass the 70% limit.
>>>>
>>>> But the table only takes 35MB in HDFS, and somehow reading it into the hash
>>>> table increases its size drastically: in the end it fails after reaching
>>>> ~700MB.
>>>>
>>>>
>>>>
>>>> So this is the first question: why does the table take so much space in
>>>> memory?
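>>>>
>>>> (A rough back-of-the-envelope, assuming the blow-up is per-entry JVM object
>>>> overhead: 35MB on disk over ~3M rows is about 12 bytes per row, while ~700MB
>>>> over the 2.5M entries loaded at the point of failure is about 280 bytes per
>>>> entry, i.e. a 20-25x inflation from boxed keys, values and hash table
>>>> bookkeeping.)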
>>>>
>>>>
>>>>
>>>> Later, I tried increasing
>>>> hive.mapjoin.followby.gby.localtask.max.memory.usage to allow the map join
>>>> to finish. By doing so I ran into another problem.
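>>>>
>>>> (For reference, a sketch of the session settings; the first is the one named
>>>> above, the second is its no-group-by counterpart included as an assumption on
>>>> my part, and the exact values are my own choice:
>>>>
>>>> set hive.mapjoin.followby.gby.localtask.max.memory.usage=0.85;
>>>> set hive.mapjoin.localtask.max.memory.usage=0.90;
>>>> )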
>>>>
>>>> The table is in fact loaded into memory, as seen here:
>>>>
>>>>
>>>>
>>>> Processing rows: 2900000  Hashtable size: 2899999  Memory usage: 818590784  percentage: 0.815
>>>>
>>>> INFO exec.HashTableSinkOperator: 2014-05-28 12:16:42  Processing rows: 2900000  Hashtable size: 2899999  Memory usage: 818590784  percentage: 0.815
>>>>
>>>> INFO exec.TableScanOperator: 0 finished. closing...
>>>>
>>>> INFO exec.TableScanOperator: 0 forwarded 2946773 rows
>>>>
>>>> INFO exec.HashTableSinkOperator: 1 finished. closing...
>>>>
>>>> INFO exec.HashTableSinkOperator: Temp URI for side table: file:/tmp/hadoop/hive_2014-05-28_12-16-21_239_3089817264132856114-94/-local-10004/HashTable-Stage-2
>>>>
>>>> Dump the side-table into file: file:/tmp/hadoop/hive_2014-05-28_12-16-21_239_3089817264132856114-94/-local-10004/HashTable-Stage-2/MapJoin-mapfile691--.hashtable
>>>>
>>>> INFO exec.HashTableSinkOperator: 2014-05-28 12:16:42  Dump the side-table into file: file:/tmp/hadoop/hive_2014-05-28_12-16-21_239_3089817264132856114-94/-local-10004/HashTable-Stage-2/MapJoin-mapfile691--.hashtable
>>>>
>>>> Upload 1 File to: file:/tmp/hadoop/hive_2014-05-28_12-16-21_239_3089817264132856114-94/-local-10004/HashTable-Stage-2/MapJoin-mapfile691--.hashtable
>>>>
>>>> INFO exec.HashTableSinkOperator: 2014-05-28 12:16:45  Upload 1 File to: file:/tmp/hadoop/hive_2014-05-28_12-16-21_239_3089817264132856114-94/-local-10004/HashTable-Stage-2/MapJoin-mapfile691--.hashtable
>>>>
>>>> INFO exec.HashTableSinkOperator: 1 forwarded 0 rows
>>>>
>>>> INFO exec.TableScanOperator: 0 Close done
>>>>
>>>> End of local task; Time Taken: 10.745 sec.
>>>>
>>>>
>>>>
>>>> *But then the join stage hangs for a long time and fails with an OOM.*
>>>>
>>>>
>>>>
>>>> From the logs, I can see that it hangs on this line:
>>>>
>>>>
>>>>
>>>> 2014-05-28 12:16:58,229 INFO org.apache.hadoop.hive.ql.exec.MapJoinOperator: ******* Load from HashTable File: input : maprfs:/user/hadoop/tmp/hive/hive_2014-05-28_12-16-21_239_3089817264132856114-94/-mr-10003/000000_0
>>>>
>>>> 2014-05-28 12:16:58,230 INFO org.apache.hadoop.hive.ql.exec.MapJoinOperator:           Load back 1 hashtable file from tmp file uri:/tmp/mapr-hadoop/mapred/local/taskTracker/hadoop/distcache/-479500712399318067_367753608_1109273133/maprfs/user/hadoop/tmp/hive/hive_2014-05-28_12-16-21_239_3089817264132856114-94/-mr-10005/HashTable-Stage-2/Stage-2.tar.gz/MapJoin-mapfile691--.hashtable
>>>>
>>>> 2014-05-28 12:18:31,302 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 6 finished. closing...
>>>>
>>>>
>>>>
>>>> It hangs on "Load back 1 hashtable file from tmp" for 1 minute 33 seconds,
>>>> and then we get the exception:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> 2014-05-28 12:18:31,304 WARN org.apache.hadoop.ipc.Client: Unexpected error reading responses on connection Thread[IPC Client (47) connection to /127.0.0.1:48520 from job_201405191528_9910,5,main]
>>>>
>>>> java.lang.OutOfMemoryError: Java heap space
>>>>         at java.lang.StringBuffer.toString(StringBuffer.java:585)
>>>>         at org.apache.hadoop.io.UTF8.readString(UTF8.java:209)
>>>>         at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:179)
>>>>         at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:66)
>>>>         at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:829)
>>>>         at org.apache.hadoop.ipc.Client$Connection.run(Client.java:725)
>>>>
>>>> 2014-05-28 12:18:31,306 INFO org.apache.hadoop.mapred.Task: Communication exception: java.io.IOException: Call to /127.0.0.1:48520 failed on local exception: java.io.IOException: Error reading responses
>>>>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1136)
>>>>         at org.apache.hadoop.ipc.Client.call(Client.java:1098)
>>>>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:275)
>>>>         at $Proxy0.ping(Unknown Source)
>>>>         at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:680)
>>>>         at java.lang.Thread.run(Thread.java:662)
>>>> Caused by: java.io.IOException: Error reading responses
>>>>         at org.apache.hadoop.ipc.Client$Connection.run(Client.java:732)
>>>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>>>         at java.lang.StringBuffer.toString(StringBuffer.java:585)
>>>>         at org.apache.hadoop.io.UTF8.readString(UTF8.java:209)
>>>>         at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:179)
>>>>         at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:66)
>>>>         at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:829)
>>>>         at org.apache.hadoop.ipc.Client$Connection.run(Client.java:725)
>>>>
>>>>
>>>>
>>>> The address 127.0.0.1:48520 is the TaskTracker.
>>>>
>>>>
>>>>
>>>> The file the local stage uploaded, "MapJoin-mapfile691--.hashtable", is only 87MB.
>>>>
>>>> The archive in which it is located, "Stage-2.tar.gz", is only 23MB.
>>>>
>>>>
>>>>
>>>> *What's going on here? Why can't the join complete successfully?*
>>>>
>>>>
>>>>
>>>> Lastly, I tried removing the group by from the query. After doing so, the
>>>> query finishes with no problem (with
>>>> hive.mapjoin.followby.gby.localtask.max.memory.usage set above 0.82).
>>>>
>>>> No hangs or anything.
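>>>>
>>>> (If nothing else works, a fallback, which I have not verified on this exact
>>>> query, would be to skip the map-join conversion entirely and let it run as a
>>>> common reduce-side join:
>>>>
>>>> set hive.auto.convert.join=false;
>>>>
>>>> or to lower hive.auto.convert.join.noconditionaltask.size below the small
>>>> table's size so the conversion is not attempted.)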
>>>>
>>>>
>>>>
>>>> *How can the group by affect the "Load back 1 hashtable file from tmp"
>>>> step in any way?*
>>>>
>>>>
>>>>
>>>> Thanks in advance for any answers/comments.
>>>>
>>>> -----------------------------------------------
>>>>
>>>> Dima Machlin, Big Data Architect
>>>>
>>>> 15 Abba Eban Blvd. PO Box 4125, Herzliya 46140 IL
>>>>
>>>> P: +972-9-9518147 | M: +972-54-5671337 | F: +972-9-9584736
>>>>
>>>> Pursway.com <http://www.pursway.com/>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Matouk IFTISSEN | Consultant BI & Big Data
>>>> 24 rue du sentier - 75002 Paris - www.ysance.com
>>>> Fax : +33 1 73 72 97 26
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> With Thanks & Regards
>>> Nagarjuna Vissarapu
>>> 9052179339
>>>
>>>
>>
>>
>> --
>>
>> Matouk IFTISSEN | Consultant BI & Big Data
>> 24 rue du sentier - 75002 Paris - www.ysance.com
>> Fax : +33 1 73 72 97 26
>>
>
>
