geronimo-dev mailing list archives

From Dain Sundstrom <d...@coredevelopers.net>
Subject Re: Geronimo Deployment Descriptors -- and premature optimisation
Date Tue, 09 Sep 2003 20:22:47 GMT
Alex, in an embedded system you may only have a small PROM available to  
boot from. Specifically, you may want to ditch the 1 MB an XML parser  
takes. You may also have a small amount of memory, and XML parsers are  
known for being memory pigs (which is fine in a normal server).

Also, the XML document is simply one persistent form of the data, and  
we need to be open to other persistent forms.

-dain

On Tuesday, September 9, 2003, at 01:53 PM, Alex Blewitt wrote:

> On Tuesday, Sep 9, 2003, at 17:44 Europe/London, Jeremy Boynes wrote:
>
>>> However, it doesn't necessarily mean that it can't generate the XML,
>>> rather than a binary-compatible format that Jeremy was suggesting. An
>>> XML document will always be more portable between versions than a
>>> generated bunch of code, because when bugs are fixed in the latter  
>>> you
>>> have to regenerate, whereas with an XML file you don't.
>>
>> Please do not think I believe binary is the only way to go - that  
>> notion
>> was discarded back in EJB 1.0 days. What I want is to have it as an  
>> option.
>
> Can I make a few observations here:
>
> o Assumption: large XML files take a long time to parse, therefore the  
> server will be slow to start up.
> o Assumption: the way to solve that is with the deploy tool, and  
> possibly a combined XML+binary format.
>
> I think there are other solutions to the problem than just these.  
> Whilst it is true that XML file parsing can take some time, it's not  
> actually likely to be where most of the server's startup time goes.  
> If we had metrics to prove it, I'd shut up, but we don't.
>
> I'd postulate that we would be able to fire up the server faster if we  
> used different optimisations; for example, a multi-threaded startup  
> (like that provided by Avalon) instead of a single-threaded model; an  
> on-the-fly parse of the XML file instead of reading it into a  
> DOM/POJO; ditching the JMX layer and using Java method calls; and so  
> on.
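[The "on-the-fly parse" idea above maps onto SAX: handle elements as they stream past instead of materialising a DOM tree. A minimal sketch, with hypothetical class and element names, assuming the standard JAXP SAX API:]

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class StreamingDescriptorParse {
    // Counts <session> elements as the parser streams through the
    // document, without ever building a DOM tree in memory.
    static int countSessionBeans(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        final int[] count = {0};
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes attrs) {
                if ("session".equals(qName)) count[0]++;
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String descriptor = "<ejb-jar><enterprise-beans>"
                + "<session><ejb-name>A</ejb-name></session>"
                + "<session><ejb-name>B</ejb-name></session>"
                + "</enterprise-beans></ejb-jar>";
        System.out.println(countSessionBeans(descriptor)); // prints 2
    }
}
```

[The memory saving is the point for Dain's embedded case: SAX holds only the current element context, while a DOM holds the whole document at once.]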
>
> But we don't *know* that this is where the bottleneck is. It may be,  
> and we can run tests to show that in a simple scenario, option A is  
> faster than option B, but that doesn't mean that that's where the  
> bottleneck will be in the server.
>
> But if it takes (say) 10 or 100 times as long to dynamically create  
> the bean, we are solving the wrong problem. Don't get me wrong, I  
> don't know how much time it takes to create a bean -- but we don't  
> seem to have any profiling to suggest the various options. It could  
> even be the case that a more optimised XML parser would solve the  
> problem, or a different way of creating the POJOs.
>
> I'd also like to disagree that this optimisation should be done by the  
> deployer. Why not have it done by the server when the code is  
> deployed? Sure, you wouldn't want it to happen every time the server  
> starts (like compiling JSPs) -- so dump out a binary representation on  
> the server side, and drop that cache when the application gets  
> redeployed. That way, you still get the fast startup (2nd time  
> onwards) whilst maintaining portability and without pushing the  
> decision onto the developer.
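[The redeploy-aware server-side cache described above can be sketched with plain Java serialization and a timestamp check. The class name is hypothetical and the parse step is a stand-in function, not Geronimo's actual deployment code:]

```java
import java.io.*;
import java.util.function.Function;

public class DescriptorCache {
    // Returns the cached parsed form if the cache file is at least as
    // new as the descriptor; otherwise re-parses and rewrites the cache
    // -- the same first-hit-pays model as JSP compilation. Redeploying
    // touches the descriptor, so the stale cache is ignored automatically.
    static Serializable load(File descriptor, File cache,
                             Function<File, Serializable> parse) throws Exception {
        if (cache.exists() && cache.lastModified() >= descriptor.lastModified()) {
            try (ObjectInputStream in =
                     new ObjectInputStream(new FileInputStream(cache))) {
                return (Serializable) in.readObject();
            }
        }
        Serializable parsed = parse.apply(descriptor); // slow XML parse goes here
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(cache))) {
            out.writeObject(parsed);
        }
        return parsed;
    }

    public static void main(String[] args) throws Exception {
        File xml = File.createTempFile("ejb-jar", ".xml");
        File ser = new File(xml.getPath() + ".ser");
        ser.deleteOnExit();
        System.out.println(load(xml, ser, f -> "parsed"));  // cold: parses
        System.out.println(load(xml, ser, f -> "IGNORED")); // warm: cache hit
    }
}
```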
>
>> For example, parsing the XML with
>> full schema validation is a dog - on my machine even a simple file  
>> takes a
>> couple of seconds and a couple of MB and I am concerned about a) large
>> applications with hundreds of modules taking forever to start, and b)
>> applications trying to run with constrained resources. And yes, we do  
>> need
>> to consider these things :-)
>
> But if you had that large an application, how long would you expect it  
> to take? Realistically, what is the largest size of app you've had  
> to deal with? Most web-apps have just a single servlet these days (à  
> la Struts), so the only issue is with EJBs, and with 1000 EJBs you're  
> still looking at 1 KB of data per EJB to make a 1 MB file. That's a  
> hell of a lot. And do we know how long it takes to deploy 1000 EJBs  
> once the XML file has loaded? Are we seriously saying that we expect  
> that part of the process to take dramatically less than 2s? If not,  
> then the bottleneck isn't going to be at the XML parsing stage.
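[Jeremy's "couple of seconds" is specifically the cost of full schema validation, which is a JAXP toggle rather than an intrinsic cost of XML. A minimal sketch of the non-validating path (hypothetical class name, assuming the standard JAXP DOM API):]

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ValidationCost {
    // Parses a descriptor with validation switched off. Full schema
    // validation (the expensive path Jeremy measures) would additionally
    // set the JAXP schemaLanguage attribute on the factory.
    static String rootName(String xml) throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true);
        f.setValidating(false); // skip validation entirely
        Document doc = f.newDocumentBuilder()
                        .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTagName();
    }

    public static void main(String[] args) throws Exception {
        long t0 = System.nanoTime();
        String root = rootName("<ejb-jar version=\"2.1\"/>");
        long micros = (System.nanoTime() - t0) / 1000;
        System.out.println(root + " parsed in ~" + micros + " us");
    }
}
```

[Timing it this way on a small descriptor would give the kind of metric Alex is asking for before deciding where the bottleneck actually is.]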
>
>> We have also had proposals for storing configuration information in  
>> LDAP
>> repositories and relational databases, neither of which would allow
>> vi-style access to the XML. A binary format may well be a better  
>> option for
>> them.
>
> IMHO, 'vi'-style access is not the sole reason to use XML files. I am  
> personally more a fan of storing the configuration in LDAP, which will  
> be slower still than having it in XML files. But I wanted to raise a  
> big 'no' to a binary file format, including any serialized concepts of  
> MBeans, which would then be really difficult to interpret if we ever  
> managed to break away from JMX. No, I don't think that will happen  
> soon, but I can hope :-) See Elliotte's comments on XML and binary at  
> http://www.cafeconleche.org/books/effectivexml/chapters/50.html (or  
> the cached version at  
> http://216.239.41.104/search?q=cache:oxknzyhXE9MJ:www.cafeconleche.org/books/effectivexml/chapters/50.html+%22Compress+if+space+is+a+problem%22&hl=en&ie=UTF-8  
> since I couldn't see it on the former).
>
>> Think of it like JSP: some people want to pre-compile, and this is  
>> *very*
>> common in production environments.
>
> I don't see the two being that comparable. A site may have many  
> hundreds of JSPs with several KB of data each, and they take  
> (relatively speaking) a long time to parse, translate, and then  
> compile. I don't see parsing an ejb-jar.xml file as being in the same  
> order of magnitude.
>
> I don't disagree that we can cache an internal form to speed things  
> up; I just don't think it should be anything the deployment tool  
> should use. Same with JSPs: we can upload them into Geronimo, and then  
> a background process can pre-compile them when resources are  
> available. I don't think we should force the developer to decide  
> between the two. [What other JSP engines get wrong is assuming it's  
> necessary to precompile all JSPs before deployment. It's not; they  
> just need to be compiled before the user sees them. The process should  
> be: deploy -> run app -> precompile all possible next JSPs that the  
> user can move to.]
>
> Premature optimisation is the root of all evil.
>
> Alex.
>
>

/*************************
  * Dain Sundstrom
  * Partner
  * Core Developers Network
  *************************/

