commons-user mailing list archives

From Nestor Urquiza <nest...@yahoo.com>
Subject Re: [scxml] issues for large sims?
Date Sun, 23 Apr 2006 11:39:11 GMT
About XML parsing: a huge state machine could mean huge
XML files, so you should consider using a native XML
database.

There are some good free ones, such as eXist [1] and
Xindice [2]. In my experience, eXist handles huge XML
files better than Xindice.

There are also plenty of commercial ones, such as
Tamino [3] from Software AG.

[1] http://exist.sourceforge.net/
[2] http://xml.apache.org/xindice/
[3] http://www.softwareag.com/corporate/products/tamino/default.asp

--- Rahul Akolkar <rahul.akolkar@gmail.com> wrote:

> On 4/21/06, Wait, David L PWR <David.Wait@pwr.utc.com> wrote:
> >
> > Thanks Rahul for all your responses to questions.
> >
> > I have been watching SCXML evolve at the w3c site and the scxml commons
> > project with much interest.
> >
> <snip/>
> 
> Great, IMO, such overlap is useful at both ends.
> 
> 
> > Here's the usecase we are interested in.  We are exploring improving our
> > time-domain simulations of power generation and distribution networks
> > having very many interacting objects (at least thousands).  Imagine a
> > simulation very similar to a scaled-up version of the stop-watch
> > example; maybe a dozen different kinds of "stop-watches", each having
> > thousands of instances interacting with each other in a way that depends
> > on the current status of their finite-state-machines.  Other kinds of
> > classes would be designed to solve a network of physics-based equations
> > where the "knowns" depend on the current statuses of states and the
> > "unknowns" are other properties evolving over time.  Most events are
> > triggered by the objects themselves; others may be triggered through
> > UIs.
> >
> <snap/>
> 
> Quite interesting. I believe that the value of using a well-defined
> state chart notation like SCXML grows with the size and complexity of
> the system being modeled.
> 
> 
> > Given my limited description, do you foresee any issues to watch out for
> > in this kind of application?
> <snip/>
> 
> You ask hard questions ;-)
> 
> SHORT ANSWER:
> 
> I haven't done any profiling at a scale even close to what you are
> talking about. You're probably aware that until about last week, the
> [SCXML] component in Jakarta Commons was considered to be a sandbox
> component (it was promoted out of the sandbox earlier in the week).
> While efforts were of course made to write efficient library code, the
> primary focus up until now has been correctness, and will probably
> continue to be that way at least for a while. I suggest doing some
> experiments on a smaller scale so you can judge the scalability of the
> library for yourself. We would very much appreciate it if you reported
> back any inefficiencies you discover.
> 
> LONG ANSWER (may contain obvious statements, sorry
> about that):
> 
> Scalability is affected by many factors; efficiency of the underlying
> library is only one of them. When dealing with the orders of magnitude
> you mention above, some of the assumptions we have to make so we can
> focus on [SCXML] are:
> 
>  * You have hardware to match
>  * You have middleware to match, and it is "configured for efficiency"
>  * The application code is well-written
> 
> Therefore, a suitable path to using the Commons SCXML implementation
> for your endeavor would probably be:
> 
>  (a) "Quickly" design a prototype system with only a few flavors (say,
> a couple of smaller state machines instead of a dozen larger ones) and
> fewer instances (say, a hundred or two active state machine instances)
>  (b) Employ good application authoring practices (such as creating an
> executor instance only when needed, and disposing of an instance once
> it runs to completion, etc.)
>  (c) Simulate, test performance, profile if unsatisfactory, report
> findings here, submit patches to dev list etc.
>  (d) Probably iterate (a) through (c) a few times
> 
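The authoring practice in (b) can be sketched roughly as follows. This is a minimal illustration only, not Commons SCXML code: ToyExecutor, ExecutorRegistry and LifecycleDemo are hypothetical names, and the "done after N events" rule is an assumption standing in for a real state machine reaching a final state.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical executor: treated as complete after a fixed number of events. */
final class ToyExecutor {
    private int remaining;

    ToyExecutor(int eventsUntilDone) {
        this.remaining = eventsUntilDone;
    }

    /** Consumes one event; the event name is ignored in this toy model. */
    void fire(String event) {
        if (remaining > 0) {
            remaining--;
        }
    }

    boolean isDone() {
        return remaining == 0;
    }
}

/** Creates executors lazily and drops them as soon as they run to completion. */
final class ExecutorRegistry {
    private final Map<String, ToyExecutor> active = new HashMap<>();
    private final int eventsUntilDone;
    private int created = 0;

    ExecutorRegistry(int eventsUntilDone) {
        this.eventsUntilDone = eventsUntilDone;
    }

    void dispatch(String machineId, String event) {
        ToyExecutor ex = active.get(machineId);
        if (ex == null) {
            // Create the executor only when the first event for this id arrives.
            ex = new ToyExecutor(eventsUntilDone);
            active.put(machineId, ex);
            created++;
        }
        ex.fire(event);
        if (ex.isDone()) {
            // Dispose of the instance once it has run to completion.
            active.remove(machineId);
        }
    }

    int activeCount() {
        return active.size();
    }

    int createdCount() {
        return created;
    }
}

final class LifecycleDemo {
    public static void main(String[] args) {
        ExecutorRegistry registry = new ExecutorRegistry(2);
        registry.dispatch("breaker-1", "open");
        registry.dispatch("breaker-2", "open");
        registry.dispatch("breaker-1", "close"); // breaker-1 completes here
        System.out.println(registry.activeCount() + " active, "
                + registry.createdCount() + " created");
        // prints "1 active, 2 created"
    }
}
```

With thousands of instances, keeping only the live executors in the map bounds memory by the number of machines currently running rather than the number ever started.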
> This would actually be very helpful for the
> community, since this has
> to be done only so often, and benefits everyone.
> 
> That brings us to what we do know about Commons SCXML today from a
> performance PoV. I have done some minimal profiling, and nothing has
> really stood out as alarming so far. I took this opportunity to post
> some CPU times on a 1.4.2 HotSpot JVM [1] for the standalone command
> line class StandaloneJexlExpressions [2] running the microwave-01 [3]
> sample (since you've probably seen this in the W3C WD) through a
> couple of "cook cycles".
> 
> The results are pretty much as expected, IMO. Some
> commentary:
> 
>  * The org.apache.{crimson,xalan,commons.{beanutils,digester,scxml.io}}.*
> packages have to do with SCXML IO; parsing and serialization are
> expensive. However, the Commons SCXML model is now stateless (thanks
> to Tim O'Brien for the timely nudge), meaning that in your above
> use case of 12 types of state machines, each having 1000 instances,
> this cost is incurred only 12 times instead of 12000. Thus, we've gone
> from paying a linear price to a (low) constant price. I suspect many of
> the String operations we see are also tied to the SCXML IO bits, and
> therefore have similar constant costs.
> 
>  * The org.apache.commons.jexl.* packages have to do
> with expression
> evaluation (in this document, we're using JEXL [4]
> expressions).
> Expression language parsing is also expensive, but
> there is not much
> Commons SCXML can do about it.
> 
>  * The {org.apache.commons.logging.*,java.util.logging.*} packages
> are logging overheads. This particular test class uses extensive
> logging, including a simple (purely logs callbacks) SCXMLListener,
> EventDispatcher and Tracer. Adding these logging bits is an
> application-dependent choice, though I don't think the logging
> overheads are significant enough to lose their value-add in any case.
> 
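The "12 parses instead of 12000" point about a stateless model can be illustrated with a small sketch. It does not use the Commons SCXML API: ParsedModel, MachineInstance, ModelCache and SharedModelDemo are hypothetical names, and the parse step is an assumed placeholder for reading and digesting an SCXML document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a parsed, stateless state machine model. */
final class ParsedModel {
    final String type;

    ParsedModel(String type) {
        this.type = type;
    }
}

/** Hypothetical per-instance runtime state; it only references the shared model. */
final class MachineInstance {
    final ParsedModel model;
    String currentState = "initial";

    MachineInstance(ParsedModel model) {
        this.model = model;
    }
}

/** Parses each machine type at most once and counts how often parsing happens. */
final class ModelCache {
    private final Map<String, ParsedModel> models = new HashMap<>();
    private int parseCount = 0;

    ParsedModel get(String type) {
        ParsedModel model = models.get(type);
        if (model == null) {
            model = parse(type);
            models.put(type, model);
        }
        return model;
    }

    /** Placeholder for the expensive XML parsing step (assumption). */
    private ParsedModel parse(String type) {
        parseCount++;
        return new ParsedModel(type);
    }

    int parseCount() {
        return parseCount;
    }
}

final class SharedModelDemo {
    /** Builds perType instances for each of the given number of machine types. */
    static List<MachineInstance> build(ModelCache cache, int types, int perType) {
        List<MachineInstance> instances = new ArrayList<>();
        for (int t = 0; t < types; t++) {
            for (int i = 0; i < perType; i++) {
                instances.add(new MachineInstance(cache.get("type-" + t)));
            }
        }
        return instances;
    }

    public static void main(String[] args) {
        ModelCache cache = new ModelCache();
        List<MachineInstance> all = build(cache, 12, 1000);
        System.out.println(all.size() + " instances, " + cache.parseCount() + " parses");
        // prints "12000 instances, 12 parses"
    }
}
```

The sharing is only safe because the model holds no per-instance state; everything that changes at runtime lives in the instance object.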
> Those are pretty much the relevant entries from an
> [SCXML]
> perspective, IMO. Therefore, I haven't felt the urge
> to dig any
> deeper.
> 
> -Rahul
> 
> (long, possibly fragmented URLs below)
> 
> [1] http://people.apache.org/~rahul/commons/scxml/cpu-times.txt
> [2] http://svn.apache.org/viewcvs.cgi/jakarta/commons/sandbox/scxml/trunk/src/main/java/org/apache/commons/scxml/test/StandaloneJexlExpressions.java?view=markup
> [3] http://svn.apache.org/repos/asf/jakarta/commons/sandbox/scxml/trunk/src/test/java/org/apache/commons/scxml/env/jexl/microwave-01.xml
> [4] http://jakarta.apache.org/commons/jexl/
> 
> 
> >
> > --Dave
> >
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-user-help@jakarta.apache.org
> 
> 

