hadoop-hive-dev mailing list archives

From "Ning Zhang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1326) RowContainer uses hard-coded '/tmp/' path for temporary files
Date Mon, 26 Apr 2010 19:08:36 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861054#action_12861054 ]

Ning Zhang commented on HIVE-1326:

The general idea of why we put temp files and directories under /tmp is twofold:
 1) It allows easy cleanup after any unexpected interruption of the JVM. On a regular exit, these temp files/directories should be removed by the JVM via deleteOnExit(), but if the JVM is killed or interrupted unexpectedly, these temp files can eventually take up all disk space and make the whole cluster hard to recover (imagine finding these temp files on thousands of machines). By putting them under /tmp, they are all removed automatically whenever the machine is restarted, or can be cleaned up manually by just removing everything in /tmp. That's much easier in terms of Hadoop administration.
 2) /tmp seems to be a universally writable mount point on all Unix-based systems, so it should be a safe place to put temporary files. But I agree that it would be a good idea to introduce a new Hive parameter to let the user choose a customizable tmp directory (a sketch follows below), which should address Edward's point.
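
To make that concrete, here is a minimal sketch of how such a parameter could be read. The property name "hive.exec.local.tmpdir" is purely hypothetical (it is not an existing Hive/Hadoop key), and the fallback keeps today's /tmp behavior when it is unset:

    import java.io.File;
    import org.apache.hadoop.conf.Configuration;

    public class LocalTmpDir {
      public static File resolve(Configuration conf) {
        // Hypothetical parameter name, shown only to illustrate lookup-with-fallback.
        String base = conf.get("hive.exec.local.tmpdir", "/tmp");
        File dir = new File(base, "hive-rowcontainer");
        if (!dir.exists() && !dir.mkdirs()) {
          throw new IllegalStateException("Cannot create local tmp dir " + dir);
        }
        return dir;
      }
    }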
I think we used File.createTempFile() previously, and Yongqiang added another directory structure (parentFile). I think we can use File.createTempFile() to create this directory as well (Yongqiang, please correct me if I'm wrong), but there are several questions about the patch:

1) File.createTempFile() will by default create temp files in java.io.tmpdir (/tmp or /var/tmp) according to the JDK. I suspect that in most cases the JVM on Linux will use /tmp as java.io.tmpdir, so your change won't achieve its purpose in most cases.
2) The random number generator is not needed if you use File.createTempFile() to create parentFile, because File.createTempFile() guarantees that the name is unique within this JVM. The only case you need to prevent is a collision between different JVMs. This is why there is a check on parentFile.mkdir(): it effectively acts as a write lock on the directory so that another JVM won't overwrite it (sketched below). So parentFile.delete() should not be added, because it may delete a directory created by another JVM that is still running.
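
A minimal sketch of the createTempFile()/mkdir() idea above (this is not the attached patch; the "hive-rowcontainer" prefix and the class/method names are illustrative only, and baseDir is assumed to be whatever tmp directory ends up being configured):

    import java.io.File;
    import java.io.IOException;

    public class ParentDirSketch {
      public static File createParentDir(File baseDir) throws IOException {
        // createTempFile picks a name that is unique within this JVM; called with
        // no directory argument it would fall back to java.io.tmpdir (usually /tmp).
        File marker = File.createTempFile("hive-rowcontainer", "", baseDir);
        File parentFile = new File(marker.getPath() + ".dir");
        marker.delete(); // only the unique name is needed, not the marker file itself
        // mkdir() returns false if the directory already exists, i.e. another JVM
        // got there first; that check is the cross-JVM "write lock" described above.
        if (!parentFile.mkdir()) {
          // Do NOT delete parentFile here: it may belong to another JVM that is
          // still running.
          throw new IOException("Temp directory already in use: " + parentFile);
        }
        parentFile.deleteOnExit(); // best-effort cleanup on a normal JVM exit
        return parentFile;
      }
    }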

> RowContainer uses hard-coded '/tmp/' path for temporary files
> -------------------------------------------------------------
>                 Key: HIVE-1326
>                 URL: https://issues.apache.org/jira/browse/HIVE-1326
>             Project: Hadoop Hive
>          Issue Type: Bug
>         Environment: Hadoop 0.19.2 with Hive trunk.  We're using FreeBSD 7.0, but that doesn't seem relevant.
>            Reporter: Michael Klatt
>         Attachments: rowcontainer.patch
> In our production hadoop environment, the "/tmp/" is actually pretty small, and we encountered a problem when a query used the RowContainer class and filled up the /tmp/ partition.  I tracked down the cause to the RowContainer class putting temporary files in the '/tmp/' path instead of using the configured Hadoop temporary path.  I've attached a patch to fix this.
> Here's the traceback:
> 2010-04-25 12:05:05,120 INFO org.apache.hadoop.hive.ql.exec.persistence.RowContainer: RowContainer created temp file /tmp/hive-rowcontainer-1244151903/RowContainer7816.tmp
> 2010-04-25 12:05:06,326 INFO ExecReducer: ExecReducer: processing 10000000 rows: used memory = 385520312
> 2010-04-25 12:05:08,513 INFO ExecReducer: ExecReducer: processing 11000000 rows: used memory = 341780472
> 2010-04-25 12:05:10,697 INFO ExecReducer: ExecReducer: processing 12000000 rows: used memory = 301446768
> 2010-04-25 12:05:12,837 INFO ExecReducer: ExecReducer: processing 13000000 rows: used memory = 399208768
> 2010-04-25 12:05:15,085 INFO ExecReducer: ExecReducer: processing 14000000 rows: used memory = 364507216
> 2010-04-25 12:05:17,260 INFO ExecReducer: ExecReducer: processing 15000000 rows: used memory = 332907280
> 2010-04-25 12:05:19,580 INFO ExecReducer: ExecReducer: processing 16000000 rows: used memory = 298774096
> 2010-04-25 12:05:21,629 INFO ExecReducer: ExecReducer: processing 17000000 rows: used memory = 396505408
> 2010-04-25 12:05:23,830 INFO ExecReducer: ExecReducer: processing 18000000 rows: used memory = 362477288
> 2010-04-25 12:05:25,914 INFO ExecReducer: ExecReducer: processing 19000000 rows: used memory = 327229744
> 2010-04-25 12:05:27,978 INFO ExecReducer: ExecReducer: processing 20000000 rows: used memory = 296051904
> 2010-04-25 12:05:28,155 FATAL ExecReducer: org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
> 	at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:199)
> 	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> 	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
> 	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
> 	at java.io.DataOutputStream.write(DataOutputStream.java:90)
> 	at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.writeChunk(ChecksumFileSystem.java:346)
> 	at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:150)
> 	at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
> 	at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:121)
> 	at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:112)
> 	at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
> 	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
> 	at java.io.DataOutputStream.write(DataOutputStream.java:90)
> 	at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1013)
> 	at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:977)
> 	at org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat$1.write(HiveSequenceFileOutputFormat.java:70)
> 	at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.spillBlock(RowContainer.java:343)
> 	at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.add(RowContainer.java:163)
> 	at org.apache.hadoop.hive.ql.exec.JoinOperator.processOp(JoinOperator.java:118)
> 	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:456)
> 	at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:244)
> 	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:436)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:158)
> Caused by: java.io.IOException: No space left on device
> 	at java.io.FileOutputStream.writeBytes(Native Method)
> 	at java.io.FileOutputStream.write(FileOutputStream.java:260)
> 	at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:197)
> 	... 22 more

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
