commons-user mailing list archives

From "Rahul Akolkar" <rahul.akol...@gmail.com>
Subject Re: [scxml] issues for large sims?
Date Sat, 22 Apr 2006 17:26:02 GMT
On 4/21/06, Wait, David L PWR <David.Wait@pwr.utc.com> wrote:
>
> Thanks Rahul for all your responses to questions.
>
> I have been watching SCXML evolve at the w3c site and the scxml commons
> project with much interest.
>
<snip/>

Great. IMO, such overlap is useful at both ends.


> Here's the usecase we are interested in.  We are exploring improving our
> time-domain simulations of power generation and distribution networks
> having very many interacting objects (at least thousands).  Imagine a
> simulation very similar to a scaled-up version of the stop-watch
> example; maybe a dozen different kinds of "stop-watches", each having
> thousands of instances interacting with each other in a way that depends
> on the current status of their finite-state-machines.  Other kinds of
> classes would be designed to solve a network of physics-based equations
> where the "knowns" depend on the current statuses of states and the
> "unknowns" are other properties evolving over time.  Most events are
> triggered by the objects themselves; others may be triggered through
> UIs.
>
<snap/>

Quite interesting. I believe that the value of using a well-defined
state chart notation like SCXML grows with the size and complexity of
the system being modeled.


> Given my limited description, do you foresee any issues to watch out for
> in this kind of application?
<snip/>

You ask hard questions ;-)

SHORT ANSWER:

I haven't done any profiling at a scale anywhere close to what you
are talking about. You're probably aware that until about last week,
the [SCXML] component in Jakarta Commons was considered to be a
sandbox component (it was promoted out of sandbox earlier in the
week). While efforts were of course made to write efficient library
code, the primary focus up until now has been correctness, and will
probably continue to be that way at least for a while. I suggest doing
some experiments on a smaller scale so you can judge the scalability
of the library for yourself. We would very much appreciate it if you
reported back any inefficiencies you discover.

LONG ANSWER (may contain obvious statements, sorry about that):

Scalability is affected by many factors; efficiency of the underlying
library is only one of them. When dealing with the orders of magnitude
you mention above, some of the assumptions we have to make so we can
focus on [SCXML] are:

 * You have hardware to match
 * You have middleware to match, and it is "configured for efficiency"
 * The application code is well-written

Therefore, a suitable path to using the Commons SCXML implementation
for your endeavor would probably be:

 (a) "Quickly" design a prototype system with only a few flavors (say
a couple of smaller state machines, instead of a dozen larger ones) and
fewer instances (say a hundred or two active state machine instances)
 (b) Employ good application authoring practices (such as creating an
executor instance only when needed, and disposing of an instance once
it runs to completion, etc.)
 (c) Simulate, test performance, profile if unsatisfactory, report
findings here, submit patches to dev list etc.
 (d) Probably iterate (a) through (c) a few times

This would actually be very helpful for the community, since this has
to be done only so often, and benefits everyone.
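
As a rough illustration of steps (a) through (c), here is a minimal
sketch of a prototype harness. It does not use the Commons SCXML API at
all: ToyMachine, its three states, the "start" and "stop" events and
runBatch are hypothetical stand-ins invented for this example. The
point is only the shape of the experiment: create a modest number of
instances, drive them with events, drop each instance once it reaches a
final state, and time the batch at a few sizes.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy prototype harness; NOT the Commons SCXML API (all names are hypothetical). */
class PrototypeHarness {

    /** Hypothetical states of a simplified stop-watch. */
    enum State { RESET, RUNNING, STOPPED }

    /** One lightweight machine instance; it holds only its own current state. */
    static final class ToyMachine {
        private State current = State.RESET;

        /** Applies one event; events that do not match leave the state unchanged. */
        void fire(String event) {
            if (current == State.RESET && event.equals("start")) {
                current = State.RUNNING;
            } else if (current == State.RUNNING && event.equals("stop")) {
                current = State.STOPPED;
            }
        }

        boolean isFinal() {
            return current == State.STOPPED;
        }
    }

    /**
     * Creates the given number of instances, sends every event to every
     * active instance, and discards instances that have run to completion.
     * Returns how many instances completed.
     */
    static int runBatch(int instances, String[] events) {
        List<ToyMachine> active = new ArrayList<>();
        for (int i = 0; i < instances; i++) {
            active.add(new ToyMachine());
        }
        int completed = 0;
        for (String event : events) {
            for (ToyMachine machine : active) {
                machine.fire(event);
            }
            int before = active.size();
            active.removeIf(ToyMachine::isFinal); // dispose of finished instances
            completed += before - active.size();
        }
        return completed;
    }

    public static void main(String[] args) {
        int[] sizes = {100, 200, 1000};
        String[] events = {"start", "stop"};
        for (int n : sizes) {
            long t0 = System.nanoTime();
            int done = runBatch(n, events);
            long micros = (System.nanoTime() - t0) / 1000;
            System.out.println(n + " instances, " + done + " completed, " + micros + " us");
        }
    }
}
```

In a real experiment the toy machine would be replaced by the actual
executor and state machine documents, and a single wall-clock timing
like this is only a first signal; a profiler gives the per-package
breakdown that is worth reporting back.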

That brings us to what we do know about Commons SCXML today from a
performance PoV. I have done some minimal profiling, and nothing
has really stood out as alarming so far. I took this opportunity to
post some CPU times on a 1.4.2 HotSpot JVM [1] for the standalone
command line class StandaloneJexlExpressions [2] running the
microwave-01 [3] sample (since you've probably seen this in the W3C
WD) through a couple of "cook cycles".

The results are pretty much as expected, IMO. Some commentary:

 * The org.apache.{crimson,xalan,commons.{beanutils,digester,scxml.io}}.*
packages have to do with SCXML IO; parsing and serialization are
expensive. However, the Commons SCXML model is now stateless (thanks
to Tim O'Brien for the timely nudge), meaning in your above usecase of
12 types of state machines each having 1000 instances, this cost is
incurred only 12 times, instead of 12000. Thus, we've gone from paying
a price linear in the number of instances to a (low) constant price
per state machine type. I suspect many of the String operations we see
are also tied to the SCXML IO bits, and therefore have similar
constant costs.

 * The org.apache.commons.jexl.* packages have to do with expression
evaluation (in this document, we're using JEXL [4] expressions).
Expression language parsing is also expensive, but there is not much
Commons SCXML can do about it.

 * The {org.apache.commons.logging.*,java.util.logging.*} packages are
logging overhead. This particular test class uses extensive logging,
including a simple (purely logging callbacks) SCXMLListener,
EventDispatcher and Tracer. Adding these logging bits is an
application-dependent choice, though I don't think the logging
overheads are significant enough to lose their value-add in any case.

Those are pretty much the relevant entries from an [SCXML]
perspective, IMO. Therefore, I haven't felt the urge to dig any
deeper.
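
To make the "12 times, instead of 12000" arithmetic from the first
bullet concrete, here is a minimal sketch of the parse-once, share-many
pattern that a stateless model allows. It is not Commons SCXML code:
Model, Instance, parse and the "type-N" keys are hypothetical stand-ins,
and the counter exists only to show how often the expensive step runs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Parse-once sketch; NOT the Commons SCXML API (all names are hypothetical). */
class SharedModelSketch {

    /** Immutable stand-in for a parsed state machine definition. */
    static final class Model {
        final String type;

        Model(String type) {
            this.type = type;
        }
    }

    /** Per-instance runtime state; it only references the shared model. */
    static final class Instance {
        final Model model;
        String currentState = "initial";

        Instance(Model model) {
            this.model = model;
        }
    }

    private final Map<String, Model> cache = new ConcurrentHashMap<>();
    private final AtomicInteger parseCount = new AtomicInteger();

    /** Stand-in for the expensive parse; counts how often it is invoked. */
    private Model parse(String type) {
        parseCount.incrementAndGet();
        return new Model(type);
    }

    /** Reuses the cached model for a type and parses only on first use. */
    Instance newInstance(String type) {
        Model model = cache.computeIfAbsent(type, this::parse);
        return new Instance(model);
    }

    int parses() {
        return parseCount.get();
    }

    public static void main(String[] args) {
        SharedModelSketch factory = new SharedModelSketch();
        int created = 0;
        for (int t = 0; t < 12; t++) {
            for (int i = 0; i < 1000; i++) {
                factory.newInstance("type-" + t);
                created++;
            }
        }
        // prints "12000 instances, 12 parses"
        System.out.println(created + " instances, " + factory.parses() + " parses");
    }
}
```

Sharing is only safe because the model carries no per-instance state;
anything that changes while a machine runs (here, currentState) has to
live in the instance, not in the model.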

-Rahul

(long, possibly fragmented URLs below)

[1] http://people.apache.org/~rahul/commons/scxml/cpu-times.txt
[2] http://svn.apache.org/viewcvs.cgi/jakarta/commons/sandbox/scxml/trunk/src/main/java/org/apache/commons/scxml/test/StandaloneJexlExpressions.java?view=markup
[3] http://svn.apache.org/repos/asf/jakarta/commons/sandbox/scxml/trunk/src/test/java/org/apache/commons/scxml/env/jexl/microwave-01.xml
[4] http://jakarta.apache.org/commons/jexl/


>
> --Dave
>
