hive-dev mailing list archives

From "Gopal V (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-4104) Hive localtask does not buffer disk-writes or reads
Date Fri, 01 Mar 2013 22:19:12 GMT

    [ https://issues.apache.org/jira/browse/HIVE-4104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591008#comment-13591008 ]

Gopal V commented on HIVE-4104:
-------------------------------

Before

{code}
2013-03-01 05:15:13	Dump the hashtable into file: file:/tmp/root/hive_2013-03-01_17-14-59_468_442960319525994949/-local-10002/HashTable-Stage-1/MapJoin-customer_demographics-01--.hashtable
2013-03-01 05:15:27	Upload 1 File to: file:/tmp/root/hive_2013-03-01_17-14-59_468_442960319525994949/-local-10002/HashTable-Stage-1/MapJoin-customer_demographics-01--.hashtable
File size: 18426794
2013-03-01 05:15:27	End of local task; Time Taken: 22.314 sec.
{code}

After

{code}
2013-03-01 05:15:53	Dump the hashtable into file: file:/tmp/root/hive_2013-03-01_17-15-39_668_1531738824783900468/-local-10002/HashTable-Stage-1/MapJoin-demographics-01--.hashtable
2013-03-01 05:15:54	Upload 1 File to: file:/tmp/root/hive_2013-03-01_17-15-39_668_1531738824783900468/-local-10002/HashTable-Stage-1/MapJoin-demographics-01--.hashtable
File size: 18426794
2013-03-01 05:15:54	End of local task; Time Taken: 9.601 sec.
{code}

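The map-side read is cheaper as well: cumulative CPU drops from 64.79 sec to 26.95 sec, and the query's wall-clock time from 56.385 to 38.173 seconds.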

Before

{code}
Job 0: Map: 4   Cumulative CPU: 64.79 sec   HDFS Read: 300156 HDFS Write: 1682 SUCCESS
Total MapReduce CPU Time Spent: 1 minutes 4 seconds 790 msec

Time taken: 56.385 seconds, Fetched: 100 row(s)
{code}

After

{code}
Job 0: Map: 4   Cumulative CPU: 26.95 sec   HDFS Read: 300156 HDFS Write: 1682 SUCCESS
Total MapReduce CPU Time Spent: 26 seconds 950 msec

Time taken: 38.173 seconds, Fetched: 100 row(s)
{code}
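
The effect behind numbers like these can be reproduced in isolation. The following is a minimal sketch, not the HIVE-4104 patch and not Hive's HashMapWrapper code: the class `BufferedDumpSketch`, its `CountingOutputStream` helper, the sample table, and the 8192-byte buffer size are all hypothetical names and values chosen for illustration. It assumes the hashtable is dumped with an `ObjectOutputStream`, which the `sq\0~` and `w` byte sequences in the strace excerpt from the issue description suggest but do not prove. The counter stands in for `write(2)` syscalls by counting how many write calls reach the underlying sink with and without a `BufferedOutputStream` in between.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the HIVE-4104 patch and not Hive's HashMapWrapper.
final class BufferedDumpSketch {

    // Counts the write calls that reach the sink, a stand-in for write(2) syscalls on a file.
    static final class CountingOutputStream extends OutputStream {
        private final OutputStream sink;
        long calls;

        CountingOutputStream(OutputStream sink) {
            this.sink = sink;
        }

        @Override
        public void write(int b) throws IOException {
            calls++;
            sink.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            calls++;
            sink.write(b, off, len);
        }
    }

    // Builds a small in-memory table to serialize (illustrative data only).
    static Map<Integer, String> sampleTable(int entries) {
        Map<Integer, String> table = new LinkedHashMap<>();
        for (int i = 0; i < entries; i++) {
            table.put(i, "row-" + i);
        }
        return table;
    }

    // Serializes every key and value, then returns how many write calls reached the sink.
    static long dump(Map<Integer, String> table, boolean buffered) throws IOException {
        CountingOutputStream counter = new CountingOutputStream(new ByteArrayOutputStream());
        // The only difference between the two modes is this wrapper.
        OutputStream target = buffered ? new BufferedOutputStream(counter, 8192) : counter;
        try (ObjectOutputStream out = new ObjectOutputStream(target)) {
            for (Map.Entry<Integer, String> e : table.entrySet()) {
                out.writeObject(e.getKey());
                out.writeObject(e.getValue());
            }
        }
        return counter.calls;
    }

    public static void main(String[] args) throws IOException {
        Map<Integer, String> table = sampleTable(1000);
        System.out.println("unbuffered write calls: " + dump(table, false));
        System.out.println("buffered write calls:   " + dump(table, true));
    }
}
```

Against a real file, the `ByteArrayOutputStream` would be a `FileOutputStream`, and the read side would presumably wrap its `FileInputStream` in a `BufferedInputStream` in the same way. The buffer size is a tuning choice, not a value taken from this issue.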
                
> Hive localtask does not buffer disk-writes or reads
> ---------------------------------------------------
>
>                 Key: HIVE-4104
>                 URL: https://issues.apache.org/jira/browse/HIVE-4104
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Gopal V
>            Assignee: Gopal V
>            Priority: Minor
>
> Hive's HashMapWrapper does not use any buffering in its file I/O; it issues its writes & reads one small operation at a time.
> The strace logs show this clearly: every few bytes go out as a separate write() call.
> {code}
> 9495  write(222, "x", 1)                = 1
> 9495  write(222, "sq\0~\0\5", 6)        = 6
> 9495  write(222, "w\25", 2)             = 2
> 9495  write(222, "\0\0\0\1\0\0\0\1\0\0\0\2\0\0\0\5\3\1M\1S", 21) = 21
> 9495  write(222, "x", 1)                = 1
> 9495  write(222, "sq\0~\0\2", 6)        = 6
> 9495  write(222, "w\t", 2)              = 2
> 9495  write(222, "\0\0\0\5\1\215\r\325v", 9) = 9
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
