Date: Fri, 1 Mar 2013 22:19:12 +0000 (UTC)
From: "Gopal V (JIRA)"
To: hive-dev@hadoop.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Commented] (HIVE-4104) Hive localtask does not buffer disk-writes or reads

    [ https://issues.apache.org/jira/browse/HIVE-4104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591008#comment-13591008 ]

Gopal V commented on HIVE-4104:
-------------------------------

Before

{code}
2013-03-01 05:15:13 Dump the hashtable into file: file:/tmp/root/hive_2013-03-01_17-14-59_468_442960319525994949/-local-10002/HashTable-Stage-1/MapJoin-customer_demographics-01--.hashtable
2013-03-01 05:15:27 Upload 1 File to: file:/tmp/root/hive_2013-03-01_17-14-59_468_442960319525994949/-local-10002/HashTable-Stage-1/MapJoin-customer_demographics-01--.hashtable
File size: 18426794
2013-03-01 05:15:27 End of local task; Time Taken: 22.314 sec.
{code}

After

{code}
2013-03-01 05:15:53 Dump the hashtable into file: file:/tmp/root/hive_2013-03-01_17-15-39_668_1531738824783900468/-local-10002/HashTable-Stage-1/MapJoin-demographics-01--.hashtable
2013-03-01 05:15:54 Upload 1 File to: file:/tmp/root/hive_2013-03-01_17-15-39_668_1531738824783900468/-local-10002/HashTable-Stage-1/MapJoin-demographics-01--.hashtable
File size: 18426794
2013-03-01 05:15:54 End of local task; Time Taken: 9.601 sec.
{code}

Savings are found on the map-side read as well.

Before

{code}
Job 0: Map: 4 Cumulative CPU: 64.79 sec HDFS Read: 300156 HDFS Write: 1682 SUCCESS
Total MapReduce CPU Time Spent: 1 minutes 4 seconds 790 msec
Time taken: 56.385 seconds, Fetched: 100 row(s)
{code}

After

{code}
Job 0: Map: 4 Cumulative CPU: 26.95 sec HDFS Read: 300156 HDFS Write: 1682 SUCCESS
Total MapReduce CPU Time Spent: 26 seconds 950 msec
Time taken: 38.173 seconds, Fetched: 100 row(s)
{code}

> Hive localtask does not buffer disk-writes or reads
> ---------------------------------------------------
>
>                 Key: HIVE-4104
>                 URL: https://issues.apache.org/jira/browse/HIVE-4104
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Gopal V
>            Assignee: Gopal V
>            Priority: Minor
>
> Hive's HashMapWrapper does not use any buffering in its File I/O, but operates sequentially for writes & reads.
> The strace logs show clearly that
> {code}
> 9495 write(222, "x", 1) = 1
> 9495 write(222, "sq\0~\0\5", 6) = 6
> 9495 write(222, "w\25", 2) = 2
> 9495 write(222, "\0\0\0\1\0\0\0\1\0\0\0\2\0\0\0\5\3\1M\1S", 21) = 21
> 9495 write(222, "x", 1) = 1
> 9495 write(222, "sq\0~\0\2", 6) = 6
> 9495 write(222, "w\t", 2) = 2
> 9495 write(222, "\0\0\0\5\1\215\r\325v", 9) = 9
> {code}
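
For readers who want to reproduce the effect outside Hive, here is a minimal sketch of the general technique the issue title points at: placing a BufferedOutputStream / BufferedInputStream between the Java object streams and the underlying file stream. It is not the HIVE-4104 patch and does not use HashMapWrapper. The class name BufferedDumpSketch, the dump and load helpers, the write-counting wrapper, and the 8 KiB buffer size are all assumptions made up for illustration.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; not part of Hive.
final class BufferedDumpSketch {

    /** Counts write calls that reach the wrapped stream (a stand-in for write(2) syscalls). */
    static final class CountingOutputStream extends FilterOutputStream {
        int writes = 0;

        CountingOutputStream(OutputStream out) {
            super(out);
        }

        @Override
        public void write(int b) throws IOException {
            writes++;
            out.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            writes++;
            out.write(b, off, len);
        }
    }

    /** Serializes the table to sink and returns how many writes reached the sink. */
    static int dump(Map<Integer, String> table, OutputStream sink, boolean buffered)
            throws IOException {
        CountingOutputStream counter = new CountingOutputStream(sink);
        // Assumed buffer size of 8 KiB; tune for the real workload.
        OutputStream target = buffered ? new BufferedOutputStream(counter, 8192) : counter;
        try (ObjectOutputStream oos = new ObjectOutputStream(target)) {
            oos.writeInt(table.size());
            for (Map.Entry<Integer, String> e : table.entrySet()) {
                oos.writeObject(e.getKey());
                oos.writeObject(e.getValue());
            }
        } // close() flushes the buffer before closing the sink
        return counter.writes;
    }

    /** Reads the table back through a buffered input stream. */
    static Map<Integer, String> load(InputStream source)
            throws IOException, ClassNotFoundException {
        Map<Integer, String> table = new HashMap<>();
        try (ObjectInputStream ois =
                 new ObjectInputStream(new BufferedInputStream(source, 8192))) {
            int n = ois.readInt();
            for (int i = 0; i < n; i++) {
                Integer key = (Integer) ois.readObject();
                String value = (String) ois.readObject();
                table.put(key, value);
            }
        }
        return table;
    }
}
```

Calling dump with buffered set to false and then true on the same map should show the count of writes reaching the sink drop sharply, which is the same pattern as the many tiny write() calls in the strace output quoted above. The serialized bytes are identical in both cases; only the number of calls to the underlying stream changes.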