hadoop-hdfs-dev mailing list archives

From Benoit Perroud <ben...@noisette.ch>
Subject DFSOutputStream.Packet retention even if close() called when IOE encountered
Date Wed, 05 Sep 2012 07:28:57 GMT
Hi All,

I am seeing memory being retained while copying data into HDFS when an
IOException is thrown.

My use case is the following: I have multiple threads sharing a
FileSystem object, all uploading files. At some point the quota is
exceeded in one thread and I get a DSQuotaExceededException (a subclass
of IOException). I close the DFSOutputStream both in the regular case
and when such an exception is thrown.
But for a DFSOutputStream that encountered an IOException, the last
Packet is kept in memory until the FileSystem is closed, and I
rarely close the FileSystem.
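
The upload pattern described above can be sketched roughly as follows.
This is a minimal, self-contained illustration using only java.io and
java.util.concurrent. QuotaStore is a hypothetical stand-in for the shared
FileSystem and its quota, not a Hadoop class, and the sketch does not
reproduce the Packet retention itself. It only shows several threads
sharing one object, with every stream closed (via try-with-resources)
whether or not an IOException is thrown.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

class SharedUploadSketch {

    /** Hypothetical stand-in for the shared FileSystem: it only tracks a byte quota. */
    static final class QuotaStore {
        private final AtomicLong remaining;

        QuotaStore(long quotaBytes) {
            remaining = new AtomicLong(quotaBytes);
        }

        /** Returns a stream that throws IOException once the shared quota is used up. */
        OutputStream create() {
            final ByteArrayOutputStream sink = new ByteArrayOutputStream();
            return new OutputStream() {
                @Override
                public void write(int b) throws IOException {
                    if (remaining.decrementAndGet() < 0) {
                        throw new IOException("quota exceeded");
                    }
                    sink.write(b);
                }
            };
        }
    }

    /** Uploads one payload; the stream is closed on both the success and the failure path. */
    static boolean upload(QuotaStore store, byte[] payload) {
        try (OutputStream out = store.create()) {
            out.write(payload);
            return true;
        } catch (IOException e) {
            return false; // the stream has already been closed by try-with-resources
        }
    }

    /** Runs the uploads on a small thread pool sharing one store and counts the failures. */
    static int countFailures(long quotaBytes, int files, int bytesPerFile) {
        QuotaStore store = new QuotaStore(quotaBytes);
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < files; i++) {
                Callable<Boolean> task = () -> upload(store, new byte[bytesPerFile]);
                results.add(pool.submit(task));
            }
            int failures = 0;
            for (Future<Boolean> result : results) {
                if (!result.get()) {
                    failures++;
                }
            }
            return failures;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // With a quota of zero bytes, every upload hits the IOException path.
        System.out.println("failed uploads: " + countFailures(0, 4, 100));
    }
}
```

In the real setup the stream would come from the shared FileSystem rather
than from QuotaStore; the question is what stays referenced after the
close on the failure path.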

So my questions:

- Is this the expected behavior, and do I need to deal with it myself?
- Is there a way to properly close a DFSOutputStream (and free all
the retained memory) without closing the FileSystem?
- Is sharing one FileSystem across several threads recommended?

Attached is a simple test reproducing the behavior: a MiniDFSCluster is
launched, and a very small quota is set so that an IOException is thrown.
Random content is generated and uploaded to HDFS. The FileSystem is not
closed, so memory keeps growing until an OOM is thrown (don't blame me
for the @Test(expected = OutOfMemoryError.class) :)). Tested on Hadoop

Thanks in advance for your answers, pointers and advice.

