hadoop-common-issues mailing list archives

From "zhihai xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12404) Disable caching for JarURLConnection to avoid sharing JarFile with other users when loading resource from URL in Configuration class.
Date Mon, 13 Feb 2017 03:23:42 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15863164#comment-15863164 ]

zhihai xu commented on HADOOP-12404:
------------------------------------

[~anishek] We see this issue in HiveServer2 logs; all queries for HiveServer2 share one JVM across different threads. The finalize method will also close the InputStream for the ZipFile
class, so it also depends on when garbage collection happens. Normally we see this issue happen several times per day in a HiveServer2 instance that runs thousands of queries per day.
I think loading jar files is usually very quick, which is why it happens rarely. After we disabled
caching, this issue didn't happen any more. Did you see this issue also? What is your environment?
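To make the workaround concrete, here is a minimal, self-contained sketch of disabling the cache at the JDK API level: instead of calling {{url.openStream}} (which goes through the shared JarFileFactory cache), open the connection explicitly and call {{setUseCaches(false)}} before reading. The class name and the throwaway jar entry are illustrative, not from the Hadoop source.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

public class NoCacheJarRead {

    /** Read a jar: URL without touching the JVM-wide JarFile cache. */
    public static String readEntryUncached(URL jarEntryUrl) throws IOException {
        // Open the connection ourselves instead of url.openStream(), so we
        // can disable caching for this read before the connection connects.
        URLConnection conn = jarEntryUrl.openConnection();
        conn.setUseCaches(false); // each caller gets its own JarFile
        try (InputStream in = conn.getInputStream();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toString("UTF-8");
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway jar with one entry to demonstrate the read path.
        File jar = File.createTempFile("demo", ".jar");
        jar.deleteOnExit();
        try (JarOutputStream jos = new JarOutputStream(new FileOutputStream(jar))) {
            jos.putNextEntry(new JarEntry("conf.xml"));
            jos.write("<configuration/>".getBytes("UTF-8"));
            jos.closeEntry();
        }
        URL entryUrl = new URL("jar:" + jar.toURI().toURL() + "!/conf.xml");
        System.out.println(readEntryUncached(entryUrl));
    }
}
```

Since the connection is never cached, closing the resulting stream cannot invalidate a JarFile that another thread is still reading from.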

> Disable caching for JarURLConnection to avoid sharing JarFile with other users when loading
resource from URL in Configuration class.
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-12404
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12404
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>            Priority: Minor
>             Fix For: 2.8.0, 3.0.0-alpha1
>
>         Attachments: HADOOP-12404.000.patch
>
>
> Disable caching for JarURLConnection to avoid sharing JarFile with other users when loading
resource from URL in Configuration class.
> Currently {{Configuration#parse}} calls {{url.openStream}} to get the InputStream for {{DocumentBuilder}} to parse.
> Based on the JDK source code, the calling sequence is:
> url.openStream => [handler.openConnection.getInputStream|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/sun/net/www/protocol/jar/Handler.java] => [new JarURLConnection|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/sun/net/www/protocol/jar/JarURLConnection.java#JarURLConnection] => JarURLConnection.connect => [factory.get(getJarFileURL(), getUseCaches())|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/sun/net/www/protocol/jar/JarFileFactory.java] => [URLJarFile.getInputStream|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/sun/net/www/protocol/jar/URLJarFile.java#URLJarFile.getJarFile%28java.net.URL%2Csun.net.www.protocol.jar.URLJarFile.URLJarFileCloseController%29] => [JarFile.getInputStream|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/jar/JarFile.java#JarFile.getInputStream%28java.util.zip.ZipEntry%29] => ZipFile.getInputStream
> If {{URLConnection#getUseCaches}} is true (the default), the URLJarFile will be shared for the same URL. If the shared URLJarFile is closed by another user, every InputStream returned by URLJarFile#getInputStream is closed as well, per the [documentation|http://docs.oracle.com/javase/7/docs/api/java/util/zip/ZipFile.html#getInputStream(java.util.zip.ZipEntry)].
> So, in rare situations on a heavily loaded system, we saw the following exception, which caused a Hive job to fail:
> {code}
> 2014-10-21 23:44:41,856 ERROR org.apache.hadoop.hive.ql.exec.Task: Ended Job = job_1413909398487_3696 with exception 'java.lang.RuntimeException(java.io.IOException: Stream closed)'
> java.lang.RuntimeException: java.io.IOException: Stream closed
>     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2484)
>     at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2337)
>     at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2254)
>     at org.apache.hadoop.conf.Configuration.get(Configuration.java:861)
>     at org.apache.hadoop.mapred.JobConf.checkAndWarnDeprecation(JobConf.java:2030)
>     at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:479)
>     at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:469)
>     at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:187)
>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:582)
>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:580)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>     at org.apache.hadoop.mapred.JobClient.getJobUsingCluster(JobClient.java:580)
>     at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:598)
>     at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:288)
>     at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:547)
>     at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:426)
>     at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1516)
>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1283)
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1101)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:924)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:919)
>     at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:145)
>     at org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:69)
>     at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:200)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>     at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:502)
>     at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:213)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Stream closed
>     at java.util.zip.InflaterInputStream.ensureOpen(InflaterInputStream.java:67)
>     at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:142)
>     at java.io.FilterInputStream.read(FilterInputStream.java:133)
>     at com.sun.org.apache.xerces.internal.impl.XMLEntityManager$RewindableInputStream.read(XMLEntityManager.java:2902)
>     at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:302)
>     at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1753)
>     at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipChar(XMLEntityScanner.java:1426)
>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2807)
>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
>     at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:117)
>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
>     at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
>     at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:243)
>     at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
>     at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
>     at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2325)
>     at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2313)
>     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2384)
> {code}
> Also, disabling [JarURLConnection's caches|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/sun/net/www/protocol/jar/JarFileFactory.java#JarFileFactory.getCachedJarFile%28java.net.URL%29] saves a little memory.
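The sharing described in the issue can be observed directly: with caching left at its default, two independent connections to the same jar: URL receive the very same JarFile instance, so close() through one connection invalidates any stream the other is still reading. A minimal sketch (the class name and throwaway jar entry are illustrative, not from the Hadoop or JDK source):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.net.JarURLConnection;
import java.net.URL;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

public class SharedJarFileDemo {

    /** True when two connections to the same jar: URL share one cached JarFile. */
    public static boolean isShared(URL jarEntryUrl) throws IOException {
        JarURLConnection c1 = (JarURLConnection) jarEntryUrl.openConnection();
        JarURLConnection c2 = (JarURLConnection) jarEntryUrl.openConnection();
        // getUseCaches() defaults to true, so both connections hit the
        // JVM-wide JarFileFactory cache and get the same JarFile object.
        return c1.getJarFile() == c2.getJarFile();
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway jar with one entry so the demo is self-contained.
        File jar = File.createTempFile("demo", ".jar");
        jar.deleteOnExit();
        try (JarOutputStream jos = new JarOutputStream(new FileOutputStream(jar))) {
            jos.putNextEntry(new JarEntry("a.txt"));
            jos.write("x".getBytes("UTF-8"));
            jos.closeEntry();
        }
        URL entryUrl = new URL("jar:" + jar.toURI().toURL() + "!/a.txt");
        System.out.println("shared JarFile: " + isShared(entryUrl));
        // If one user now calls close() on that shared JarFile, InputStreams
        // the other user obtained from it start failing with
        // java.io.IOException: Stream closed -- the failure mode in the
        // stack trace above.
    }
}
```

This is exactly why the fix switches {{Configuration}} to {{setUseCaches(false)}}: it opts each read out of this shared cache instead of racing other users of the same URL.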



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


