hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Kai Zheng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8562) HDFS Performance is impacted by FileInputStream Finalizer
Date Tue, 29 Dec 2015 01:08:50 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073332#comment-15073332
] 

Kai Zheng commented on HDFS-8562:
---------------------------------

Hi [~cmccabe], I may misunderstood you so to be clear, lets ref. the following codes:
{code}
  public static PassedFileChannel open(FileDescriptor fd,
                                       boolean writable) throws IOException {
    FileChannel channel = null;
    try {
      if (open5Args != null) {
        /**
         * Attempt this private method existing in some JDK:
         * public static FileChannel open(FileDescriptor fd, String path,
         *                  boolean readable, boolean writable, Object parent)
         * Note the path is expected to be nullable.
         */
        channel = (FileChannel) open5Args.invoke(null,
            fd, null, true, writable, null);
      } else if (open4Args != null) {
        /**
         * Attempt this private method existing in some JDK:
         * public static FileChannel open(FileDescriptor fd, boolean readable,
         *                                boolean writable, Object parent)
         */
        channel = (FileChannel) open4Args.invoke(null,
            fd, true, writable, null);
      }
    } catch (IllegalArgumentException e) {
      // Ignore any error due to unexpected JDKs, or JDK versions.
    } catch (InvocationTargetException e) {
      // Ignore
    } catch (IllegalAccessException e) {
      // Ignore
    }

    if (channel != null) {
      return new PassedFileChannel(channel, fd);
    }

    // All errors will fallback to this, as before using FileInputStream.
    FileInputStream fis = new FileInputStream(fd);
    return new PassedFileChannel(fis);
  }
{code}
The try block catches 3 kinds of exceptions like {{IllegalArgumentException}} and when either
of them happens, {{channel}} will be {{null}}. I might not be clear, but what I mean is, the
exceptions particularly {{IllegalArgumentException}} may happen, because in the {{open5Args}}
case, {{path}} is required but we always pass it null. This may work in some JDK versions,
but may be not for others. We bet it be {{nullable}} but we may be incorrect as the method
is a private one subject to implementation change. Sure if you'd insist that we don't worry
about this, I'm OK. You said we do not do any fallback, in somewhere or in the function at
all? I thought at least in the coding form we need to catch these exceptions for reflection
and when some happens we would need the fallback just in case. Would you please help clarify?
Thanks.

By the way, I'm handling some minor issues found here separately in HADOOP-12658 as I'm not
sure this will be in or not.

> HDFS Performance is impacted by FileInputStream Finalizer
> ---------------------------------------------------------
>
>                 Key: HDFS-8562
>                 URL: https://issues.apache.org/jira/browse/HDFS-8562
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, performance
>    Affects Versions: 2.5.0
>         Environment: Impact any application that uses HDFS
>            Reporter: Yanping Wang
>         Attachments: HDFS-8562.002b.patch, HDFS-8562.003a.patch, HDFS-8562.003b.patch,
HDFS-8562.004a.patch, HDFS-8562.01.patch
>
>
> While running HBase using HDFS as datanodes, we noticed excessive high GC pause spikes.
For example with jdk8 update 40 and G1 collector, we saw datanode GC pauses spiked toward
160 milliseconds while they should be around 20 milliseconds. 
> We tracked down to GC logs and found those long GC pauses were devoted to process high
number of final references. 
> For example, this Young GC:
> 2715.501: [GC pause (G1 Evacuation Pause) (young) 0.1529017 secs]
> 2715.572: [SoftReference, 0 refs, 0.0001034 secs]
> 2715.572: [WeakReference, 0 refs, 0.0000123 secs]
> 2715.572: [FinalReference, 8292 refs, 0.0748194 secs]
> 2715.647: [PhantomReference, 0 refs, 160 refs, 0.0001333 secs]
> 2715.647: [JNI Weak Reference, 0.0000140 secs]
> [Ref Proc: 122.3 ms]
> [Eden: 910.0M(910.0M)->0.0B(911.0M) Survivors: 11.0M->10.0M Heap: 951.1M(1536.0M)->40.2M(1536.0M)]
> [Times: user=0.47 sys=0.01, real=0.15 secs]
> This young GC took 152.9 milliseconds STW pause, while spent 122.3 milliseconds in Ref
Proc, which processed 8292 FinalReference in 74.8 milliseconds plus some overhead.
> We used JFR and JMAP with Memory Analyzer to track down and found those FinalReference
were all from FileInputStream.  We checked HDFS code and saw the use of the FileInputStream
in datanode:
> https://apache.googlesource.com/hadoop-common/+/refs/heads/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java
> {code}
> 1.	public static MappableBlock load(long length,
> 2.	FileInputStream blockIn, FileInputStream metaIn,
> 3.	String blockFileName) throws IOException {
> 4.	MappableBlock mappableBlock = null;
> 5.	MappedByteBuffer mmap = null;
> 6.	FileChannel blockChannel = null;
> 7.	try {
> 8.	blockChannel = blockIn.getChannel();
> 9.	if (blockChannel == null) {
> 10.	throw new IOException("Block InputStream has no FileChannel.");
> 11.	}
> 12.	mmap = blockChannel.map(MapMode.READ_ONLY, 0, length);
> 13.	NativeIO.POSIX.getCacheManipulator().mlock(blockFileName, mmap, length);
> 14.	verifyChecksum(length, metaIn, blockChannel, blockFileName);
> 15.	mappableBlock = new MappableBlock(mmap, length);
> 16.	} finally {
> 17.	IOUtils.closeQuietly(blockChannel);
> 18.	if (mappableBlock == null) {
> 19.	if (mmap != null) {
> 20.	NativeIO.POSIX.munmap(mmap); // unmapping also unlocks
> 21.	}
> 22.	}
> 23.	}
> 24.	return mappableBlock;
> 25.	}
> {code}
> We looked up https://docs.oracle.com/javase/7/docs/api/java/io/FileInputStream.html 
and
> http://hg.openjdk.java.net/jdk7/jdk7/jdk/file/23bdcede4e39/src/share/classes/java/io/FileInputStream.java
and noticed FileInputStream relies on the Finalizer to release its resource. 
> When a class that has a finalizer created, an entry for that class instance is put on
a queue in the JVM so the JVM knows it has a finalizer that needs to be executed.   
> The current issue is: even with programmers do call close() after using FileInputStream,
its finalize() method will still be called. In other words, still get the side effect of the
FinalReference being registered at FileInputStream allocation time, and also reference processing
to reclaim the FinalReference during GC (any GC solution has to deal with this). 
> We can imagine When running industry deployment HDFS, millions of files could be opened
and closed which resulted in a very large number of finalizers being registered and subsequently
being executed.  That could cause very long GC pause times.
> We tried to use Files.newInputStream() to replace FileInputStream, but it was clear we
could not replace FileInputStream in hdfs/server/datanode/fsdataset/impl/MappableBlock.java

> We notified Oracle JVM team of this performance issue that impacting all Big Data applications
using HDFS. We recommended the proper fix in Java SE FileInputStream. Because (1) it is really
nothing wrong to use FileInputStream in above datanode code, (2) as the object with a finalizer
is registered with finalizer list within the JVM at object allocation time, if someone makes
an explicit call to close or free the resources that are to be done in the finalizer, then
the finalizer should be pulled off the internal JVM’s finalizer list. That will release
the JVM from having to treat the object as special because it has a finalizer, i.e. no need
for GC to execute the finalizer as part of Reference Processing.  
> As the java fix involves both JVM code and Java SE code, it might take time for the full
solution to be available in future JDK releases. We would like to file his JIRA to notify
Big Data, HDFS community to aware this issue while using HDFS and while writing code using
FileInputStream 
> One alternative is to use Files.newInputStream() to substitute FileInputStream if it
is possible. File.newInputStream() will give an InputStream and do so in a manner that does
not include a finalizer.
> We welcome HDFS community to discuss this issue and see if there are additional ideas
to solve this problem. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message