river-dev mailing list archives

From MICHAEL MCGRADY <mmcgr...@topiatechnology.com>
Subject Re: datastructure classes
Date Thu, 16 Dec 2010 20:48:22 GMT
Just a few thoughts below, advanced to add to the conversation, not to detract from it.

On Dec 16, 2010, at 8:01 AM, Patricia Shanahan wrote:

> Mike McGrady wrote:
>> Intense network systems like our clients cannot work with database
>> calls.  They are too slow and do not scale.  They either reach a cost
>> or a performance ceiling.
> I feel we are mixing requirements and implementation.

That may be correct, Patricia.  And the distinction is important.

On requirements: they can be seen as non-functional and functional.  The former drive the
latter, I think we will all agree, but maybe not.  I definitely do not think we can ask whether
something will work without first considering whether, even if it works, it would be fairly
useless except in ho-hum cases.  To ho-hum business users, that may be taken as a slight.
It is not.  I am just interested because my clients are interested in high performance.

So, I think the non-functional requirements and related technologies, e.g., clustering, in-memory
access, etc., are primary.  For example, when scaling is at issue, what matters is not that
300,000,000 transactions can be handled in 10 hours, say, but that we can start at 100 transactions
with economies of scale and then use the system to scale to 300,000,000 with the same performance.
This is particularly important in a cloud-based economy, using economies of scale in equipment,
development time, etc.  That said, let me say a few things; I offer them not as written in stone
but as something to think about and maybe agree on, or disagree on.

First, I should qualify this.  I am speaking off the top of my head, and it appears I must
be more guarded and/or more considerate of the time people have to read.  I am a toss-it-out-and-discuss-it
guy, and some people do not have the time for that.  I believe in measure ten
times and cut once, but people have different drivers.  I did not mean we do not do databases.
 Of course we do.  Doesn't everyone?  However, the primary data model and structures have
to be in-memory because we cannot tolerate the time database calls take (in-memory is approximately
10,000 times faster).  I think this requirement is not sui generis, not mine alone, but shared
by a dominant part of the industry that will consider using Outrigger.  Also, unless
we want to give up and succumb to Brewer's Theorem, a multi-tiered architecture with asynchronous
writes to a database with eventual consistency will not do. This does not mean that Outrigger
must not write to a database.  It must, of course, but there are other considerations, as
we all know.
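To make the in-memory-primary idea above concrete, here is a minimal write-behind sketch in plain Java.  All names here (`WriteBehindStore`, `persist`) are illustrative, not Outrigger code, and `persist` merely stands in for a real database write; this is a sketch of the pattern, not a claim about how Outrigger should be implemented:

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

/** In-memory primary store with asynchronous write-behind to a slower backing store. */
class WriteBehindStore {
    private final Map<String, String> memory = new ConcurrentHashMap<>();
    private final BlockingQueue<String> pendingWrites = new LinkedBlockingQueue<>();

    WriteBehindStore() {
        Thread flusher = new Thread(() -> {
            try {
                while (true) {
                    String key = pendingWrites.take();  // blocks until a write is queued
                    persist(key, memory.get(key));      // slow database call, off the hot path
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        flusher.setDaemon(true);
        flusher.start();
    }

    /** Writes hit memory and return immediately; the database write happens later. */
    public void put(String key, String value) {
        memory.put(key, value);    // in-memory write completes at memory speed
        pendingWrites.offer(key);  // database write is deferred to the flusher thread
    }

    /** Reads are served from memory with no database round trip. */
    public String get(String key) {
        return memory.get(key);
    }

    /** Placeholder for a real database write, e.g. a JDBC INSERT/UPDATE. */
    private void persist(String key, String value) {
        // intentionally a no-op in this sketch
    }
}
```

The trade-off is exactly the one the paragraph above names: readers always see the in-memory state, while the database is eventually consistent with it.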

Second, that said, scaling as a requirement is roughly the capacity to keep adding stressors
to the system (users, connections, memory, CPU cycles) and have the performance stay the same.
If the performance drops, that does not scale, by this definition, which is the one I am used
to.  What scaling does not mean to me is the ability to handle one given humongous load, one
humongous set of stressors.
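That definition can be probed mechanically: drive the same fixed-duration workload at increasing client counts and watch whether throughput per client stays flat.  A minimal JDK-only sketch, using a `ConcurrentHashMap` as a stand-in for a space (`ScalingProbe` is a hypothetical name, not part of any River API):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.LongAdder;

/** Drives a fixed-duration read/write workload with N concurrent clients.
 *  Under the definition above, the system "scales" if the per-client
 *  throughput returned here stays roughly flat as N grows. */
class ScalingProbe {
    public static double opsPerClient(int clients, long millis) {
        ConcurrentHashMap<Long, Long> space = new ConcurrentHashMap<>();
        LongAdder ops = new LongAdder();
        CountDownLatch done = new CountDownLatch(clients);
        for (int i = 0; i < clients; i++) {
            new Thread(() -> {
                long deadline = System.currentTimeMillis() + millis;
                while (System.currentTimeMillis() < deadline) {
                    long k = Thread.currentThread().getId();
                    space.put(k, k);   // stand-in for a space write
                    space.get(k);      // stand-in for a space read
                    ops.increment();
                }
                done.countDown();
            }).start();
        }
        try {
            done.await();              // wait for all clients to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ops.doubleValue() / clients;  // per-client throughput
    }
}
```

Comparing `opsPerClient(1, t)` against `opsPerClient(100, t)` is the shape of the measurement; a real probe would of course hit an actual JavaSpace rather than a map.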

Third, I am not sure at this point where we want to go with Outrigger.  I am only interested
in, and only have time for, building a system that will not knock on the door but rather knock
the door down.  My clients expect no less.  And, ultimately, I am not here for academic interests.
Others might have different interests.  I respect that, of course.  I am just stating what
I need.

Fourth, I tend to work primarily as an architect and a designer of systems-of-systems from the
non-functional values, the QCC or the "ilities".  From my perspective, these are the goal, and
functional requirements in my mind are defined by and flow from these non-functional requirements.
So, I might have a different perspective, but ultimately we would agree, I think, on the details.

> Response time, throughput, ACID properties and the like are requirements. What Outrigger
> uses as persistent storage for tuple space is implementation.
> I would like to get more data on your performance requirements - specifics about e.g.
> x% of JavaSpace reads under these conditions must complete within y seconds.

I can answer this in a few days, but it really would be as fast as possible, up to the point
where other values are compromised.  Certainly I can say that I work in complicated and critical
systems and have to use as little time in my piece as possible.  When a radar has to be read,
sign-on security passed, multilevel security ensured, upgraded or downgraded structural data
handled, and integration logic decoupled, whether at the application or virtual machine level,
etc., and we have to get to a screen in 1.0 seconds, the more invisible we are the better,
except in some respects.

> The most useful and specific way of presenting requirements would be a scalable benchmark
> - something where we can simulate a larger number of machines doing real work by having a
> smaller number dedicated to driving transactions at a single JavaSpace and measuring the results.

I do not know if you are covering this in what you say (you might be, and you probably are),
but do you cover the case where the replication of the data is transactional?  Also, could
you expand on this?  I think you have a great idea here, but I am not sure that you have expressed
the whole of it.
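By "replication of the data is transactional" I mean roughly this: a write lands on every replica or on none.  A toy prepare/commit sketch in plain Java, with all names illustrative; it deliberately omits the hard parts of real two-phase commit (coordinator failure, logging, recovery), so it shows the shape of the guarantee rather than an implementation:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy transactional replication: a write is committed on all replicas or aborted. */
class ReplicatedWrite {
    static class Replica {
        final Map<String, String> data = new HashMap<>();
        boolean healthy = true;

        /** Phase 1: vote on whether this replica can accept the write. */
        boolean prepare(String key, String value) { return healthy; }

        /** Phase 2: apply the write, only reached if every replica voted yes. */
        void commit(String key, String value) { data.put(key, value); }
    }

    /** Returns true only if the write was committed on every replica. */
    public static boolean write(List<Replica> replicas, String key, String value) {
        for (Replica r : replicas) {
            if (!r.prepare(key, value)) return false;  // any "no" vote aborts; nothing applied
        }
        for (Replica r : replicas) r.commit(key, value);
        return true;
    }
}
```

The benchmark would then need two modes: driving writes that carry this all-or-nothing guarantee across replicas, and driving plain single-space writes, since the costs differ sharply.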

> Measurements from a small to medium sized environment would be useful input for estimating
> performance at higher loads. Without that sort of data, I cannot even guess whether an achievable
> Outrigger implementation will be able to meet all your requirements.

I know that neither an asynchronous write to a database (which sacrifices consistency) nor
a synchronous one (which sacrifices performance) will work if that is the be-all and end-all,
for a number of reasons.  If you want me to quantify that, I can, but the fact that we can
spend millions and do it will not be adequate.  The cost of machines and development time is
important, crucial.  Scaling is necessary to achieve a consistent cost between high and low
levels of stressors to a system, especially in this age of cloud-based economies of scale.

> Patricia

Michael McGrady
Chief Architect
Topia Technology, Inc.
Cell 1.253.720.3365
Work 1.253.572.9712 extension 2037
