commons-user mailing list archives

From Dmitri Plotnikov <dmi...@apache.org>
Subject Re: [jxpath] Java code from JXpath expression
Date Wed, 08 Mar 2006 15:58:00 GMT
Let me add a few observations based on my admittedly fading knowledge of JXPath.

1. I believe that byte code generation would improve JXPath performance in most cases. JXPath
has the overhead of handling heterogeneous object models.  It needs to allocate and maintain
NodePointers and such, which is very expensive.  And, of course, it uses reflection (and yes,
it caches as much of it as it can).  XML-based systems like XSLT don't have such overhead,
so they are likely to perform better.  I am sure much of JXPath's overhead could be avoided
through code generation.  Not all, though: you would still need to fall back on generic code
in unpredictable cases like untyped Maps, Lists, properties of type "Object", containers,
custom object models, etc.

<aside>
I was astonished to see that converting a static HashMap to a highly optimized generated tree
of switch-statements does not meet expectations.  I built such a generator and discovered
that even after factoring out the code generation itself, performance of the generated code
was several times worse than that of the HashMap class.  

How about a challenge: given an arbitrary list of string-key/value pairs, build byte-code
that will consistently beat the corresponding HashMap object.  If somebody succeeds at this,
it could be generally useful for the community.  I'll be happy to post my code generator so
you could poke holes in it.
</aside>
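
For readers who want to try the challenge, here is a minimal sketch of the comparison being described. It is not the generator mentioned in the aside: the class name, the three key/value pairs, and the hand-written switch (standing in for generated code) are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class SwitchVsHashMap {
    // Hypothetical key/value pairs, chosen only for illustration.
    static final Map<String, Integer> TABLE = new HashMap<>();
    static {
        TABLE.put("alpha", 1);
        TABLE.put("beta", 2);
        TABLE.put("gamma", 3);
    }

    // Hand-written stand-in for what a generator might emit:
    // switch on the key length first, then compare the remaining candidates.
    static Integer generatedLookup(String key) {
        switch (key.length()) {
            case 4:
                if (key.equals("beta")) return 2;
                return null;
            case 5:
                if (key.equals("alpha")) return 1;
                if (key.equals("gamma")) return 3;
                return null;
            default:
                return null;
        }
    }

    public static void main(String[] args) {
        // Both lookups should agree for present and absent keys.
        for (String k : new String[] {"alpha", "beta", "gamma", "delta"}) {
            System.out.println(k + " -> " + TABLE.get(k) + " / " + generatedLookup(k));
        }
    }
}
```

Any timing comparison between the two would need a proper benchmark harness with warm-up; the sketch only shows the two shapes of lookup being compared.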

2. It is certainly _possible_ to build a compiler from XPath to byte code.  It will be more
complex than the one in XSLT, because it will have to do what XSLT does plus handle non-XML
based object models.  All I am saying is that it would be very difficult to build such a
code generator and would require a team of developers.  So far I've been handling JXPath by
myself, but this task is way beyond the time I could personally contribute to JXPath.

3. Even with a compiler you can always write a poorly performing XPath.  The issue is the
same as with databases: if your query requires scans through large collections of objects,
you'll get poor performance, compiler or no compiler.  So, always make sure that the XPath
itself is optimized.  Use the map() and id() functions as much as you can.  You can also use
custom extension functions that would take care of the expensive steps.
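
The database analogy in point 3 can be illustrated in plain Java, without any JXPath API. This is only a sketch under stated assumptions: the Item class, its ids, and the collection size are hypothetical. A predicate that tests every element behaves like findByScan, while a keyed lookup of the kind id() is meant to provide behaves like the prebuilt index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ScanVsIndex {
    // Hypothetical domain object with a unique id.
    static final class Item {
        final String id;
        final String label;

        Item(String id, String label) {
            this.id = id;
            this.label = label;
        }
    }

    // Linear scan: every lookup walks the collection until it finds a match.
    static Item findByScan(List<Item> items, String id) {
        for (Item item : items) {
            if (item.id.equals(id)) {
                return item;
            }
        }
        return null;
    }

    // Build the index once; each later lookup is a single hash probe.
    static Map<String, Item> buildIndex(List<Item> items) {
        Map<String, Item> index = new HashMap<>();
        for (Item item : items) {
            index.put(item.id, item);
        }
        return index;
    }

    public static void main(String[] args) {
        List<Item> items = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            items.add(new Item("id" + i, "item " + i));
        }
        Map<String, Item> index = buildIndex(items);

        Item viaScan = findByScan(items, "id99999");  // walks the whole list
        Item viaIndex = index.get("id99999");         // one hash lookup
        System.out.println(viaScan == viaIndex);
    }
}
```

No compiler, byte code or otherwise, turns the first shape into the second; that choice has to be made in the expression or in the data model.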

All that said, a JXPath-to-bytecode compiler sounds feasible, though difficult, and would
certainly make an exciting project.  Looking for volunteers!

Cheers,
- Dmitri

----- Original Message ----
From: Torsten Curdt <tcurdt@apache.org>
To: Jakarta Commons Users List <commons-user@jakarta.apache.org>
Sent: Tuesday, March 7, 2006 11:07:46 PM
Subject: Re: [jxpath] Java code from JXpath expression


On 08.03.2006, at 14:31, Dmitri Plotnikov wrote:

> Hi,
>
> I have thought about using this type of compilation to byte code.  
> Unfortunately, I got mired in the tremendous _potential_ complexity  
> of data models that JXPath works with.   Imagine that you have an  
> expression that needs to traverse a path through a graph with the  
> root in JavaBean, followed by a property that is declared to be of  
> type "Object". What type is it really? We don't know 'till the  
> runtime, at which point we discover that it really is a Map that  
> contains a List that contains an object handled by a custom  
> NodeFactory.  That NodeFactory takes us to a Container that  
> resolves into a DOM object. And then we do a few steps through the  
> DOM tree.  That last step kind-of resembles XPath :-)
>
> I just could not figure out how to translate all of this complexity  
> to Java or byte code reliably.

Well, if you can generate source code that, once compiled, really improves
performance, then surely you can also generate the byte code directly
instead ...no idea what the generated code needs to look like though.
TBH I don't have much of a clue how jxpath does its magic ;)

>>> IMHO that's an ugly approach ...rather I would try to improve  
>>> jxpath.
>>
>> You are certainly entitled to your opinion :-)

Thanks ;)

>> Maybe because my sense of aesthetics is skewed I fail to see the  
>> ugliness you speak of, but I'm more than willing to be educated.

It's just unnecessary overhead. Besides, I am not a big fan of source
code generation in general (mostly because it's a one-way street). You
should ask yourself "who will ever look at the code"? ...if no one
does, it does not make much sense to pipe it through a compiler IMO.

>>> When caching reflection you can get the same speed as native (at  
>>> least
>>> in some areas).
>>
>> Could you please elaborate on this...  I don't understand what  
>> exactly should be cached.
>> The result of the evaluation? Maybe I'm missing something but if  
>> the Context object is always changing is there a point in "caching  
>> reflection"?

Obtaining the Method and Field objects is expensive.
Once you have them - keep them! ...and you will save quite some cpu  
cycles. If jxpath already does that I am not sure the source code  
generation detour will give you the speed improvement you are looking  
for.
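
A minimal sketch of what "keep them" could look like with the standard reflection API. This is not how jxpath does its caching; the class name, the cache key format, and the Person bean are assumptions made for illustration.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CachedGetters {
    // Cache of resolved getter Methods. Keying on class name plus property is a
    // simplification: it ignores classes of the same name from different class loaders.
    private static final Map<String, Method> CACHE = new ConcurrentHashMap<>();

    // Resolve the getter once per (class, property); later calls reuse the Method.
    static Method getter(Class<?> type, String property) {
        String key = type.getName() + '#' + property;
        return CACHE.computeIfAbsent(key, k -> {
            String name = "get" + Character.toUpperCase(property.charAt(0))
                    + property.substring(1);
            try {
                return type.getMethod(name);
            } catch (NoSuchMethodException e) {
                throw new IllegalArgumentException("No getter " + name + " on " + type, e);
            }
        });
    }

    // Read a property through the cached Method; only invoke() is paid per call.
    static Object read(Object bean, String property) {
        try {
            return getter(bean.getClass(), property).invoke(bean);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot read " + property, e);
        }
    }

    // Hypothetical bean used only to exercise the cache.
    public static final class Person {
        private final String name;

        public Person(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }
    }

    public static void main(String[] args) {
        Person p = new Person("Ada");
        System.out.println(read(p, "name"));
        // The second lookup returns the same cached Method instance.
        System.out.println(getter(Person.class, "name") == getter(Person.class, "name"));
    }
}
```

The lookup cost disappears after the first call, but Method.invoke() itself is still a reflective call, which is the part that generated code would replace.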

>>  Another option would be to generate byte code internally
>>> (like XSLTC does)
>>
>> Ok... that sure is a possibility,  any pointers on how/where to  
>> start?

First you need to find out what the code needs to look like! Compare  
and see whether it really makes a huge difference - or is possible at  
all.

Then have a look at BCEL, ASM and of course XSLTC

>> Although I'm a heavy user of JXPath I'm ashamed to admit that I  
>> barely know its inner workings

Same here ... Dmitri just did too good a job :)

>>  Going through the source code stage just to avoid
>>> reflection speed sounds like the wrong approach to me.
>>
>> It's just a possibility, what I really want is not to use reflection.

...if that is really possible at all.

cheers
--
Torsten


