Date: Sun, 25 Oct 2015 02:59:27 +0000 (UTC)
From: "Lefty Leverenz (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Updated] (HIVE-11540) Too many delta files during Compaction - OOM

     [ https://issues.apache.org/jira/browse/HIVE-11540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-11540:
----------------------------------
    Labels: TODOC2.0  (was: )

> Too many delta files during Compaction - OOM
> --------------------------------------------
>
>                 Key: HIVE-11540
>                 URL: https://issues.apache.org/jira/browse/HIVE-11540
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.0.0
>            Reporter: Nivin Mathew
>            Assignee: Eugene Koifman
>              Labels: TODOC2.0
>         Attachments: HIVE-11540.3.patch, HIVE-11540.4.patch, HIVE-11540.6.patch, HIVE-11540.patch
>
>
> Hello,
> I am streaming weblogs to Kafka and then to Flume 1.6 using a Hive sink, with an average of 20 million records a day. I have 5 compactors running at various times (30m/5m/5s); no matter what time I give, the compactors seem to run out of memory cleaning up a couple thousand delta files and ultimately fall behind on compacting/cleaning delta files. Any suggestions on what I can do to improve performance? Or can Hive streaming not handle this kind of load?
> I used this post as reference: http://henning.kropponline.de/2015/05/19/hivesink-for-flume/
> {noformat}
> 2015-08-12 15:05:01,197 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Direct buffer memory
> Max block location exceeded for split: CompactorInputSplit{base: hdfs://Dev01HWNameService/user/hive/warehouse/weblogs.db/dt=15-08-12/base_1056406, bucket: 0, length: 6493042, deltas: [delta_1056407_1056408, delta_1056409_1056410, delta_1056411_1056412, delta_1056413_1056414, delta_1056415_1056416, delta_1056417_1056418,…
> , delta_1074039_1074040, delta_1074041_1074042, delta_1074043_1074044, delta_1074045_1074046, delta_1074047_1074048, delta_1074049_1074050, delta_1074051_1074052]} splitsize: 8772 maxsize: 10
> 2015-08-12 15:34:25,271 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(198)) - number of splits:3
> 2015-08-12 15:34:25,367 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.JobSubmitter (JobSubmitter.java:printTokens(287)) - Submitting tokens for job: job_1439397150426_0068
> 2015-08-12 15:34:25,603 INFO [upladevhwd04v.researchnow.com-18]: impl.YarnClientImpl (YarnClientImpl.java:submitApplication(274)) - Submitted application application_1439397150426_0068
> 2015-08-12 15:34:25,610 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:submit(1294)) - The url to track the job: http://upladevhwd02v.researchnow.com:8088/proxy/application_1439397150426_0068/
> 2015-08-12 15:34:25,611 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:monitorAndPrintJob(1339)) - Running job: job_1439397150426_0068
> 2015-08-12 15:34:30,170 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:34:33,756 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:monitorAndPrintJob(1360)) - Job job_1439397150426_0068 running in uber mode : false
> 2015-08-12 15:34:33,757 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - map 0% reduce 0%
> 2015-08-12 15:34:35,147 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:34:40,155 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:34:45,184 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:34:50,201 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:34:55,256 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:00,205 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:02,975 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - map 33% reduce 0%
> 2015-08-12 15:35:02,982 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:printTaskEvents(1406)) - Task Id : attempt_1439397150426_0068_m_000000_0, Status : FAILED
> 2015-08-12 15:35:03,000 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:printTaskEvents(1406)) - Task Id : attempt_1439397150426_0068_m_000001_0, Status : FAILED
> 2015-08-12 15:35:04,008 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - map 0% reduce 0%
> 2015-08-12 15:35:05,132 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:10,206 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:15,228 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:20,207 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:25,148 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:28,154 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:printTaskEvents(1406)) - Task Id : attempt_1439397150426_0068_m_000000_1, Status : FAILED
> 2015-08-12 15:35:29,161 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:printTaskEvents(1406)) - Task Id : attempt_1439397150426_0068_m_000001_1, Status : FAILED
> 2015-08-12 15:35:30,142 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:35,140 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:40,170 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:45,153 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:50,150 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:35:52,268 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:printTaskEvents(1406)) - Task Id : attempt_1439397150426_0068_m_000000_2, Status : FAILED
> 2015-08-12 15:35:53,274 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:printTaskEvents(1406)) - Task Id : attempt_1439397150426_0068_m_000001_2, Status : FAILED
> 2015-08-12 15:35:55,149 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:36:00,160 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:36:05,145 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:36:10,155 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:36:15,158 INFO [Thread-7]: compactor.Initiator (Initiator.java:run(88)) - Checking to see if we should compact weblogs.vop_hs.dt=15-08-12
> 2015-08-12 15:36:17,397 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - map 100% reduce 0%
> 2015-08-12 15:36:18,409 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:monitorAndPrintJob(1380)) - Job job_1439397150426_0068 failed with state FAILED due to: Task failed task_1439397150426_0068_m_000000
> Job failed as tasks failed. failedMaps:1 failedReduces:0
> 2015-08-12 15:36:18,443 INFO [upladevhwd04v.researchnow.com-18]: mapreduce.Job (Job.java:monitorAndPrintJob(1385)) - Counters: 10
> 	Job Counters
> 		Failed map tasks=7
> 		Killed map tasks=1
> 		Launched map tasks=8
> 		Other local map tasks=6
> 		Data-local map tasks=2
> 		Total time spent by all maps in occupied slots (ms)=191960
> 		Total time spent by all reduces in occupied slots (ms)=0
> 		Total time spent by all map tasks (ms)=191960
> 		Total vcore-seconds taken by all map tasks=191960
> 		Total megabyte-seconds taken by all map tasks=884551680
> 2015-08-12 15:36:18,443 ERROR [upladevhwd04v.researchnow.com-18]: compactor.Worker (Worker.java:run(176)) - Caught exception while trying to compact weblogs.vop_hs.dt=15-08-12. Marking clean to avoid repeated failures, java.io.IOException: Job failed!
> 	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
> 	at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:186)
> 	at org.apache.hadoop.hive.ql.txn.compactor.Worker$1.run(Worker.java:169)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> 	at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:166)
> 2015-08-12 15:36:18,444 ERROR [upladevhwd04v.researchnow.com-18]: txn.CompactionTxnHandler (CompactionTxnHandler.java:markCleaned(327)) - Expected to remove at least one row from completed_txn_components when marking compaction entry as clean!
> ^C
> {noformat}
> [ngmathew@upladevhwd04v ~]$ tail -f /var/log/hive/hivemetastore.log
> 2015-08-12 15:36:18,443 ERROR [upladevhwd04v.researchnow.com-18]: compactor.Worker (Worker.java:run(176)) - Caught exception while trying to compact weblogs.vop_hs.dt=15-08-12. Marking clean to avoid repeated failures, java.io.IOException: Job failed!
> 	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
> 	at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:186)
> 	at org.apache.hadoop.hive.ql.txn.compactor.Worker$1.run(Worker.java:169)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> 	at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:166)
> Settings:
> hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
> hive.compactor.initiator.on = true
> hive.compactor.worker.threads = 5
> Table stored as ORC
> hive.vectorized.execution.enabled = false
> hive.input.format = org.apache.hadoop.hive.ql.io.HiveInputFormat

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)