db-derby-dev mailing list archives

From "Rick Hillegas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-7006) Investigate putting generated classes under the engine module loader
Date Thu, 11 Oct 2018 17:02:00 GMT

    [ https://issues.apache.org/jira/browse/DERBY-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16646758#comment-16646758 ]

Rick Hillegas commented on DERBY-7006:
--------------------------------------

----

h2. Summary of Experiments

I have put some effort into trying to get Derby to load the generated classes into the engine
module. On core-libs-dev@openjdk.java.net, I started an email thread titled "generated code
and jigsaw modules". Two suggestions came back:

S1) Rémi Forax recommended that we try loading the generated bytes as follows:

{noformat}
java.lang.invoke.MethodHandles.lookup().defineClass(generatedClassBytes)
{noformat}

S2) Alan Bateman suggested that we study code from the java.xml module: http://hg.openjdk.java.net/jdk/jdk/raw-file/tip/src/java.xml/share/classes/com/sun/org/apache/xalan/internal/xsltc/trax/TemplatesImpl.java

I tried both approaches. See the attached patches: derby-7006-01-aa-remiForax.diff and derby-7006-01-ac-alanBateman.diff.

Both approaches solved some important problems:

P1) running simple DDL and queries

P2) running triggers

But neither approach solved the following problem:

P3) running functions which live inside jar files stored in the database

After posting these results to core-libs-dev@openjdk.java.net, I received more kind advice,
which, nevertheless, did not fix problem P3.

These are my conclusions:

C1) S1 is the simpler, more straightforward solution.

C2) S2 is more complicated to start out with. But I will play around with it more.

Better solutions may emerge as JPMS evolves. In the short term, we could also reduce the attack
surface of the exposed engine modules. See https://issues.apache.org/jira/browse/DERBY-7012

The following sections provide brief descriptions of the attached patches as well as the additional
context provided by my two posts to core-libs-dev@openjdk.java.net.

----

h2. First Solution: Use MethodHandles.lookup()

The main features of this solution are:

MH1) A stub class in the generated package is added to the engine codeline.

MH2) Generated classes are loaded into the engine module by calling support code in java.lang.invoke.MethodHandles.
The support code must be called from the stub class in order to anchor the generated classes
in the generated package:

{noformat}
  java.lang.invoke.MethodHandles.lookup().defineClass(generatedClassBytes)
{noformat}

MH3) An extra permission is required for the engine jar when running under a security manager:

{noformat}
  permission java.lang.RuntimePermission "defineClass";
{noformat}
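
The stub described in MH1 and MH2 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the attached patch: the class name GeneratedClassStub and the helper emptyClassBytes are hypothetical, and the hand-built, method-free class file only stands in for the bytes which Derby's byte code generator would produce. The sketch relies on Lookup.defineClass defining the new class in the same package, class loader, and module as the lookup class, which is why the call has to originate in the stub.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.invoke.MethodHandles;

// Hypothetical stub. In Derby it would live in the generated package of the engine module.
final class GeneratedClassStub {

    // Defines the class in the stub's own package, class loader, and module.
    static Class<?> define(byte[] generatedClassBytes) {
        try {
            return MethodHandles.lookup().defineClass(generatedClassBytes);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    // Stand-in for the byte code generator: a minimal class file with no fields or methods.
    // The class must be in the same package as the stub, or defineClass rejects it.
    static byte[] emptyClassBytes(String simpleName) {
        String pkg = GeneratedClassStub.class.getPackageName();
        String internalName = pkg.isEmpty() ? simpleName : pkg.replace('.', '/') + "/" + simpleName;
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(0xCAFEBABE);                          // magic
            out.writeShort(0);                                 // minor version
            out.writeShort(52);                                // major version (Java 8 format)
            out.writeShort(5);                                 // constant pool count (entries 1..4)
            out.writeByte(1); out.writeUTF(internalName);      // #1 Utf8: this class name
            out.writeByte(7); out.writeShort(1);               // #2 Class -> #1
            out.writeByte(1); out.writeUTF("java/lang/Object"); // #3 Utf8: superclass name
            out.writeByte(7); out.writeShort(3);               // #4 Class -> #3
            out.writeShort(0x0021);                            // ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                                 // this_class
            out.writeShort(4);                                 // super_class
            out.writeShort(0);                                 // interfaces
            out.writeShort(0);                                 // fields
            out.writeShort(0);                                 // methods
            out.writeShort(0);                                 // attributes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        Class<?> plan = define(emptyClassBytes("SketchPlan"));
        System.out.println(plan.getName() + " in " + plan.getModule());
    }
}
```

Because the new class always lands in the lookup class's own class loader, this sketch gives no way to choose a different loader for the generated class.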

----

h2. Second Solution: Study TemplatesImpl

The main features of this solution are:

TI1) A separate module is created for each generated class.

TI2) The engine module exports the necessary packages to the generated module.

TI3) An extra permission is required for the engine jar when running under a security manager:

{noformat}
  permission java.lang.RuntimePermission "getProtectionDomain";
{noformat}
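
TI1 and TI2 can be sketched along the lines of the TemplatesImpl pattern. This is a rough illustration, not the attached patch: the class name GeneratedModuleSketch, the method defineGeneratedModule, and the module and package names are all hypothetical. As a further assumption, the synthetic module here requires only java.base so that the sketch runs outside Derby. A real generated module would presumably also have to require the engine module by name.

```java
import java.lang.module.Configuration;
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReader;
import java.lang.module.ModuleReference;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper which creates one synthetic module per generated class (TI1).
final class GeneratedModuleSketch {

    static Module defineGeneratedModule(String moduleName, String packageName, ClassLoader loader) {
        ModuleDescriptor descriptor = ModuleDescriptor
            .newModule(moduleName, Set.of(ModuleDescriptor.Modifier.SYNTHETIC))
            .requires("java.base")
            .packages(Set.of(packageName))
            .build();

        // The module has no backing artifact, so it has no location and cannot be opened.
        ModuleReference reference = new ModuleReference(descriptor, null) {
            @Override
            public ModuleReader open() {
                throw new UnsupportedOperationException("classes are defined directly, not read");
            }
        };

        // A finder which knows only about the one synthetic module.
        ModuleFinder finder = new ModuleFinder() {
            @Override
            public Optional<ModuleReference> find(String name) {
                return name.equals(moduleName) ? Optional.of(reference) : Optional.empty();
            }

            @Override
            public Set<ModuleReference> findAll() {
                return Set.of(reference);
            }
        };

        // Resolve the module against the boot layer and map it to the supplied loader.
        ModuleLayer boot = ModuleLayer.boot();
        Configuration configuration =
            boot.configuration().resolve(finder, ModuleFinder.of(), Set.of(moduleName));
        ModuleLayer layer = boot.defineModules(configuration, name -> loader);
        return layer.findModule(moduleName).orElseThrow();
    }

    public static void main(String[] args) {
        ClassLoader loader = new ClassLoader(GeneratedModuleSketch.class.getClassLoader()) {};
        Module generated =
            defineGeneratedModule("sketch.generated.m1", "sketch.generated.p1", loader);

        // TI2: the engine exports a package to the generated module. Module.addExports is
        // caller-sensitive and has no effect when the exporting module is the unnamed module,
        // as it is when this sketch runs from the class path.
        Module engine = GeneratedModuleSketch.class.getModule();
        engine.addExports(GeneratedModuleSketch.class.getPackageName(), generated);

        System.out.println(generated.getName() + " defined to " + generated.getClassLoader());
    }
}
```

Each call creates a new ModuleLayer in addition to the module itself.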

I need to solve the following problems:

TIP1) I don't know how to get the module names of jar files which are loaded into the database.
This could be solved if we divined the module names when the jars are loaded into the database
by SQLJ.INSTALL_JAR/SQLJ.REPLACE_JAR and then stored the names in SYS.SYSTABLES.

TIP2) I think that there is a slow memory leak when the generated class is garbage-collected
and the generated module and extra exports are orphaned. I don't know how to remove this extra
garbage. But maybe we could plug the leak by generating only one module for each database.
We would need to make sure that all of this machinery is unloaded when the embedded driver
is de-registered.

----

h2. First Message to core-libs-dev@openjdk.java.net

I am looking for advice about how to tighten up module encapsulation while generating byte
code on the fly. I ask this question on behalf of Apache Derby, a pure-Java relational database
whose original code dates back to Java 1.2. I want to reduce Derby's attack-surface when running
with a module path.

First a little context: A relational database is an interpreter for the SQL language. It converts
SQL queries into byte code which then runs on a virtual machine embedded in the interpreter.
In Derby's case, the virtual machine is the Java VM and the byte code is simply Java byte
code. That is, a Derby query plan is a class whose byte code is generated on the fly at run
time.

I have converted the Apache Derby codeline into a set of jigsaw modules: https://issues.apache.org/jira/browse/DERBY-6945.
Unfortunately, I had to punch holes in the encapsulation of the main Derby module so that
the generated query plans could call back into the Derby engine. That is because, by default,
generated query plans load into the catch-all, unnamed module. Note that all of these generated
classes live in a single package which does not belong to any named module.

1) Is it possible to load generated code into a named module?

2) Alternatively, can someone recommend another approach for preserving module encapsulation
while generating classes on the fly?

I would appreciate any advice or examples which you can recommend.

Thanks,
-Rick

----

h2. Second Message to core-libs-dev@openjdk.java.net

Thanks again to Rémi and Alan for their advice. Unfortunately, I have not been able to make
either approach work, given another complexity of Derby's class loading. Let me explain that
additional issue.

Derby lets users load jar files into the database. There they live as named blobs of bytes.
The jar files contain user-defined data types, functions, procedures, and aggregators, which
are coded in Java and can be used in SQL statements. Derby lets users wire these jar files
into a custom classpath which drives a custom ClassLoader at query-execution time. I have
not been able to make this custom ClassLoader work with either Rémi's or Alan's approach. Note
that a Derby engine manages many databases and each database can have its own custom ClassLoader.
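
To make the shape of such a loader concrete, here is a minimal sketch of a ClassLoader which resolves classes from jar files held as byte arrays. It is not Derby's actual loader: the class name JarBlobClassLoader is hypothetical, and the sketch assumes that the jar blobs have already been fetched from the database and are small enough to index in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;

// Hypothetical loader which serves classes out of jar files stored as blobs of bytes.
final class JarBlobClassLoader extends ClassLoader {

    // Binary class name -> class file bytes, indexed in classpath order.
    private final Map<String, byte[]> classBytes = new HashMap<>();

    JarBlobClassLoader(ClassLoader parent, byte[]... jarBlobs) {
        super(parent);
        for (byte[] blob : jarBlobs) {
            try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(blob))) {
                for (JarEntry entry = jar.getNextJarEntry(); entry != null; entry = jar.getNextJarEntry()) {
                    String name = entry.getName();
                    if (name.endsWith(".class")) {
                        String className =
                            name.substring(0, name.length() - ".class".length()).replace('/', '.');
                        // The first jar on the classpath wins, as with an ordinary classpath.
                        classBytes.putIfAbsent(className, jar.readAllBytes());
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Called by loadClass only after the parent loader has failed to find the class.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] bytes = classBytes.get(name);
        if (bytes == null) {
            throw new ClassNotFoundException(name);
        }
        return defineClass(name, bytes, 0, bytes.length);
    }
}
```

Classes defined this way belong to the unnamed module of the custom loader. A class defined in another loader can resolve them only if that loader delegates to this one.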

I like the simplicity of Rémi's approach:

  java.lang.invoke.MethodHandles.lookup().defineClass(generatedClassBytes)

This approach does indeed put the generated class where I want it: inside the Derby engine
module. Unfortunately, the ClassLoader of the generated class is the application class loader.
I can't figure out how to force the generated class to use the custom ClassLoader instead.
As a consequence, the generated class cannot resolve user-defined functions which live inside
jar files in the database. Poking the custom ClassLoader into the thread's context class
loader before calling MethodHandles.lookup() doesn't work.

Alan's approach is a bit more complicated. It follows the pattern in com.sun.org.apache.xalan.internal.xsltc.trax.TemplatesImpl:
generate a temporary module for each generated class and then add more export
directives to the engine module so that the generated module can call back into the engine.
I have to say I'm a little confused about the implications of slow memory leaks with this
approach. I don't know what happens to these generated modules and export directives when
the generated class is garbage-collected.

More immediately, however, I am up against the same problem which plagues Rémi's approach:
how do I get the generated module to resolve classes in the custom ClassLoader? More specifically,
I am stuck trying to get the generated module to require the user-written modules, that is,
the user-written jar files. What I am missing is the ability to retrieve the module names
of these jar files so that I can craft requires directives. The only way I know to get a module
name is to use ModuleFinder.of(Path...). Unfortunately, the Path interface is an abstraction
for file systems and is not a good fit for locating a blob of bytes stored inside a database.
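
One possible way to obtain a module name without a Path, offered only as a sketch and not tried against Derby, is to scan the jar bytes directly. ModuleDescriptor.read(InputStream) can parse a module-info.class entry, and the Automatic-Module-Name manifest attribute supplies the name of an automatic module. The class name JarBlobModuleNames and its two methods are hypothetical. The sketch ignores multi-release jars and jars which have neither a descriptor nor the manifest attribute (ModuleFinder derives those names from the jar file name).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.module.ModuleDescriptor;
import java.util.Optional;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Hypothetical helper which finds the module name of a jar file held as a blob of bytes.
final class JarBlobModuleNames {

    static Optional<String> moduleNameOf(byte[] jarBytes) {
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            // An explicit module declares its name in a top-level module-info.class.
            for (JarEntry entry = jar.getNextJarEntry(); entry != null; entry = jar.getNextJarEntry()) {
                if (entry.getName().equals("module-info.class")) {
                    return Optional.of(ModuleDescriptor.read(jar).name());
                }
            }
            // Otherwise fall back to the manifest attribute used by automatic modules.
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                return Optional.empty();
            }
            return Optional.ofNullable(
                manifest.getMainAttributes().getValue("Automatic-Module-Name"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a tiny in-memory jar, containing only a manifest, for demonstration.
    static byte[] automaticModuleJar(String moduleName) {
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        manifest.getMainAttributes().putValue("Automatic-Module-Name", moduleName);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
            // The constructor has already written META-INF/MANIFEST.MF.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] blob = automaticModuleJar("com.example.udf");
        System.out.println(moduleNameOf(blob).orElse("(no module name found)"));
    }
}
```

Knowing the name would only help with crafting requires directives. It does not by itself make the classes in the custom ClassLoader visible to the generated module.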

I would appreciate any further advice about how to get over these speed bumps.

Thanks,
-Rick


> Investigate putting generated classes under the engine module loader
> --------------------------------------------------------------------
>
>                 Key: DERBY-7006
>                 URL: https://issues.apache.org/jira/browse/DERBY-7006
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.15.0.0
>            Reporter: Rick Hillegas
>            Priority: Major
>         Attachments: derby-7006-01-aa-remiForax.diff, derby-7006-01-ac-alanBateman.diff
>
>
> Right now, the generated query plans are compiled into the catch-all unnamed module.
> This forces us to grant reflective access to several engine packages. It would be nice to
> encapsulate the generated classes inside the engine module loader.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
