river-dev mailing list archives

From Patricia Shanahan <p...@acm.org>
Subject Re: datastructure classes
Date Tue, 14 Dec 2010 21:40:18 GMT
On 12/14/2010 8:37 AM, Gregg Wonderly wrote:
> On 12/14/2010 1:36 AM, MICHAEL MCGRADY wrote:
>> I would say that, in addition to just being a fast data structure, the
>> data structure must be fast and accommodate synchronous and
>> asynchronous backups, partitions, and transactions.
>
> This is an important issue from the perspective that there are two
> scenarios that used to be supported by outrigger. A persistent and a
> non-persistent version used to exist. The persistent version used PSE
> for serialization to disk. That was a simple yet powerful mechanism. Due
> to licensing (Sun paid for a distribution license), it was, in a sense,
> deprecated by the time River was started.
> For those who don't know about PSE, it used a post-compilation bytecode
> manipulator that looked for calls to a "start transaction" method, and
> then found modification assignments to associated data structures, and
> modified the bytecode to set a "modified bit" on the associated data.
> When "end transaction" was encountered, it stopped.
> I think it would be a good idea to focus on the performance of the
> in-memory (messaging-only type of application) version. The persistent
> version is a completely different animal and requires some fairly
> advanced features for managing all of the appropriate control points.
> Making one code path do both can be somewhat challenging from an all-out
> performance perspective.

Thanks for the useful background information.

There is one slim hope I can see for a common code path, but it is a
very long way off.

My prejudice, subject to being convinced that another approach would be
better, would be to try to map a persistent version to a relational
database through SQL. Relational databases deal with transactions, ACID,
distribution, and performance issues. There are a lot of options for
users, more than for OO databases, at all price points starting at free.

The way outrigger uses its FastList looks rather like a sort of
simplified relational database, with each FastList instance representing
a table and selects being done by linear scan of the table.
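
To make the analogy concrete, here is a minimal sketch of a list that
behaves like a table queried by linear scan. ScanTable and its insert and
select methods are hypothetical names chosen for illustration; this is
not outrigger's actual FastList, whose API and concurrency handling
differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-in for a FastList-like "table"; not outrigger's class. */
public class ScanTable<T> {
    // All rows live in one list, in insertion order.
    private final List<T> rows = new ArrayList<>();

    /** Append a row, roughly analogous to an SQL INSERT. */
    public synchronized void insert(T row) {
        rows.add(row);
    }

    /** Visit every row and keep the matches, analogous to SELECT ... WHERE. */
    public synchronized List<T> select(Predicate<T> where) {
        List<T> result = new ArrayList<>();
        for (T row : rows) {
            if (where.test(row)) {
                result.add(row);
            }
        }
        return result;
    }
}
```

The cost of each select grows with the number of rows, which is the
property a relational database would normally address with an index.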

If we made a persistent version use a relational database to represent
the space, we could then experiment with performance run-offs between
our best shot at an ad-hoc in-memory implementation, and what we get
from the persistent version if we drop in an in-memory database
implementation. If they come close, we could drop the ad-hoc
implementation and focus all effort on the relational database version.
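
One way such a run-off might be set up is sketched below, assuming both
candidates can sit behind a common interface. EntryStore, ListStore, and
the time method are hypothetical names, not anything in River; a
database-backed implementation would be a second class implementing the
same interface and is left out here. The timing is deliberately naive
(no warm-up, single thread), so it shows the shape of the comparison
rather than a trustworthy benchmark.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class RunOff {
    /** Hypothetical common interface that each candidate back end implements. */
    public interface EntryStore {
        void write(String entry);
        List<String> read(Predicate<String> template);
    }

    /** Ad-hoc in-memory candidate: a list searched by linear scan. */
    public static class ListStore implements EntryStore {
        private final List<String> entries = new ArrayList<>();

        public synchronized void write(String entry) {
            entries.add(entry);
        }

        public synchronized List<String> read(Predicate<String> template) {
            List<String> hits = new ArrayList<>();
            for (String e : entries) {
                if (template.test(e)) {
                    hits.add(e);
                }
            }
            return hits;
        }
    }

    /** Run n writes then n reads against one candidate; returns elapsed nanoseconds. */
    public static long time(EntryStore store, int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            store.write("entry-" + i);
        }
        for (int i = 0; i < n; i++) {
            String wanted = "entry-" + i;
            store.read(wanted::equals);
        }
        return System.nanoTime() - start;
    }
}
```

Calling RunOff.time with the same n on each candidate gives two numbers
to compare under an identical workload.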

It is a slim hope. Often, a custom-tuned data structure will outperform
a specialization of a general data structure. In any case, I agree with
working first on the in-memory version.
