Date: Sat, 26 May 2012 06:49:23 +0000 (UTC)
From: "Hudson (JIRA)"
To: hive-dev@hadoop.apache.org
Reply-To: dev@hive.apache.org
Message-ID: <1985154174.5363.1338014963722.JavaMail.jiratomcat@issues-vm>
In-Reply-To: <434513033.32708.1313140107292.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HIVE-2372) java.io.IOException: error=7, Argument list too long

    [ https://issues.apache.org/jira/browse/HIVE-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283924#comment-13283924 ]

Hudson commented on HIVE-2372:
------------------------------

Integrated in Hive-trunk-h0.21 #1450 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1450/])
    HIVE-2372 Argument list too long when streaming (Sergey Tryuber via egc) (Revision 1342841)

     Result = FAILURE
ecapriolo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1342841
Files :
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
* /hive/trunk/conf/hive-default.xml.template
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestOperators.java


> java.io.IOException: error=7, Argument list too long
> ----------------------------------------------------
>
>                 Key: HIVE-2372
>                 URL: https://issues.apache.org/jira/browse/HIVE-2372
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Sergey Tryuber
>            Priority: Critical
>             Fix For: 0.10.0
>
>         Attachments: HIVE-2372.1.patch.txt, HIVE-2372.2.patch.txt
>
>
> I executed a huge query on a table with a lot of 2-level partitions. The query uses a Perl reducer. The maps worked fine, but every reducer fails with the following exception:
> 2011-08-11 04:58:29,865 INFO org.apache.hadoop.hive.ql.exec.ScriptOperator: Executing [/usr/bin/perl, , ]
> 2011-08-11 04:58:29,866 INFO org.apache.hadoop.hive.ql.exec.ScriptOperator: tablename=null
> 2011-08-11 04:58:29,866 INFO org.apache.hadoop.hive.ql.exec.ScriptOperator: partname=null
> 2011-08-11 04:58:29,866 INFO org.apache.hadoop.hive.ql.exec.ScriptOperator: alias=null
> 2011-08-11 04:58:29,935 FATAL ExecReducer: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":129390185139228,"reducesinkkey1":"00008AF10000000063CA6F"},"value":{"_col0":"00008AF10000000063CA6F","_col1":"2011-07-27 22:48:52","_col2":129390185139228,"_col3":2006,"_col4":4100,"_col5":"10017388=6","_col6":1063,"_col7":"NULL","_col8":"address.com","_col9":"NULL","_col10":"NULL"},"alias":0}
> 	at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:256)
> 	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
> 	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Cannot initialize ScriptOperator
> 	at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:320)
> 	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:744)
> 	at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
> 	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:744)
> 	at org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45)
> 	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> 	at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:247)
> 	... 7 more
> Caused by: java.io.IOException: Cannot run program "/usr/bin/perl": java.io.IOException: error=7, Argument list too long
> 	at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
> 	at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:279)
> 	... 15 more
> Caused by: java.io.IOException: java.io.IOException: error=7, Argument list too long
> 	at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
> 	at java.lang.ProcessImpl.start(ProcessImpl.java:65)
> 	at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
> 	... 16 more
> I think I have found the cause. ScriptOperator.java passes a lot of configuration values as environment variables to the child reduce process.
> One of these variables is mapred.input.dir, which in my case is more than 150KB because it lists a huge number of input directories. In short, the problem is that Linux (before kernel version 2.6.23) limits the total size of the environment variables (together with the arguments) passed to a child process to 128KB (32 pages of 4KB). Upgrading the kernel lifts that total limit, but each individual environment string is still limited to 128KB, so such a huge variable does not work even on my home computer (2.6.32). You can read more at http://www.kernel.org/doc/man-pages/online/pages/man2/execve.2.html.
> For now all our work has stopped because of this problem and I cannot find a way around it. The only option that seems reasonable to me is to get rid of this variable in reducers.
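
The cause described in the report (a single oversized environment entry making the child process launch fail) can be illustrated with a small Java sketch. This is not the HIVE-2372 patch. The class ScriptEnvSketch, the method dropOversized, the variable names used in main, and the 128KB per-string limit (32 pages of 4KB) are all assumptions made for illustration. The sketch only filters the map that would be handed to ProcessBuilder; it does not start a process.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

class ScriptEnvSketch {
    // Assumed per-string limit: 32 pages of 4 KB each (Linux with 4 KB pages).
    static final int MAX_ENV_STRING_BYTES = 32 * 4096;

    /**
     * Returns a copy of vars without the entries whose "NAME=value" string
     * (plus its terminating NUL byte) would exceed maxBytes.
     */
    static Map<String, String> dropOversized(Map<String, String> vars, int maxBytes) {
        Map<String, String> safe = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : vars.entrySet()) {
            // The child process receives each entry as one "NAME=value" string.
            String entry = e.getKey() + "=" + e.getValue();
            int size = entry.getBytes(StandardCharsets.UTF_8).length + 1; // +1 for NUL
            if (size <= maxBytes) {
                safe.put(e.getKey(), e.getValue());
            }
        }
        return safe;
    }

    public static void main(String[] args) {
        // Hypothetical variables: one small value and one of about 150 KB,
        // standing in for a very long list of input directories.
        Map<String, String> vars = new LinkedHashMap<>();
        vars.put("small_var", "x");
        char[] big = new char[150 * 1024];
        Arrays.fill(big, 'd');
        vars.put("mapred_input_dir", new String(big));

        Map<String, String> safe = dropOversized(vars, MAX_ENV_STRING_BYTES);

        // Hand only the filtered entries to the child process builder.
        ProcessBuilder pb = new ProcessBuilder("/usr/bin/perl");
        pb.environment().putAll(safe);

        System.out.println(safe.keySet()); // prints [small_var]
    }
}
```

Dropping the entry is only one possible policy. Truncating the value to a smaller size would keep part of the information, at the cost of handing the script an incomplete list.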