On Mon, Jan 26, 2009 at 1:04 PM, Emmanuel Lecharny <elecharny@gmail.com> wrote:
(Sent by mistake. Restarting ...)

On Mon, Jan 26, 2009 at 5:55 PM, Alex Karasulu <akarasulu@gmail.com> wrote:
> A transaction journal (a.k.a. transaction log) is used to rapidly persist
> write operations arriving at the server before being processed.  Local
> transactions for these operations are opened to apply the change. This
> allows the server to replay operations which did not complete due to some
> interruption during processing. It also allows the server eventually to
> apply changes and their side effects (like those from triggers), in the same
> transaction which allows all to rollback together or to succeed together.
> We need these things eventually even though they may not be critical right
> now for replication.

It may be a way to implement local transactions in ADS. The TxLog would
be seen as a 'buffer', or temporary storage, until the transaction is
considered terminated (either committed or rolled back). Until then,
the backend is not updated. We still have to manage the cache, though,
so it's not an easy path.
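To make the buffering idea concrete, here is a minimal sketch of a TxLog that holds operations back from the backend until commit. All names (`TxLog`, `Backend`, the string-based operation format) are illustrative assumptions, not ADS APIs, and the cache-management problem mentioned above is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the TxLog buffers write operations until the
// transaction terminates; only a commit applies them to the backend.
class TxLogSketch {
    interface Backend { void apply(String op); }

    static class TxLog {
        private final List<String> buffered = new ArrayList<>();
        private final Backend backend;

        TxLog(Backend backend) { this.backend = backend; }

        // Record an operation without touching the backend.
        void record(String op) { buffered.add(op); }

        // Commit: apply every buffered operation, then clear the buffer.
        void commit() {
            for (String op : buffered) backend.apply(op);
            buffered.clear();
        }

        // Rollback: the backend was never updated, so just discard.
        void rollback() { buffered.clear(); }
    }
}
```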


> Journal file formats are simple, with indices into these files to
> distinguish operations that have completed from those that have not.
> Journals are not ideal for a searchable history retrieval system to be
> used for auditing and snapshotting.  The history CL requires many more
> indices and its information needs to be more structured.  Conversely, a
> searchable CL is not ideal as a transaction log, since organizing the
> information and updating all these indices requires disk operations
> which take too much time.
>
> We have to be clear on what we want as a set of requirements.  If we're
> going to implement a transaction journal/log here's what I'd like to see:
>
> 0). Very fast write of operation information to disk including any
> information needed to rollback an operation.

The most important part is to have all the operations written to disk
as fast as possible, even if we haven't built the indices yet. We may
have a burst of modifications, and we want to handle this burst
smoothly, without being slowed down by any other operation, like index
maintenance.

Every write must be flushed to disk, and that could be a real
bottleneck. Howl might be of interest here (David may tell us if this
is really the best tool for such a purpose).
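As a rough sketch of why the flush is the bottleneck: each record must be appended and forced to disk before the write is acknowledged. The length-prefixed record format below is an assumption for illustration only; HOWL or the real ADS format would look different.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Minimal sketch of the durability requirement: append each operation
// record and force it to disk before acknowledging.
class JournalAppender {
    private final FileChannel channel;

    JournalAppender(Path file) throws IOException {
        channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
    }

    void append(byte[] record) throws IOException {
        // Length-prefixed record: 4-byte length, then the payload.
        ByteBuffer buf = ByteBuffer.allocate(4 + record.length);
        buf.putInt(record.length).put(record).flip();
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
        channel.force(false); // the fsync per write is the bottleneck
    }
}
```

Batching several records per `force()` call (group commit) is the usual way tools like HOWL amortize that cost.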

> 1). The journal should be the basis for implementing local transactions and
> indices into it should be minimized for performance sake.

I would say that transaction management is not my #1 priority atm.
However, I want to keep doors open. Now, regarding indices, I don't
really care how long they take to build, as long as building them is
not a bottleneck. I think (but this is just my gut speaking) that
indices can be built by a separate thread, as long as the other
operations (namely search, etc.) are not slowed down by it. Replication
can be impacted, but in an MMR system, where you have no guarantee that
the full system will remain consistent anyway, that's not such a big
deal. (Again, I'm not reasoning here; I'm just describing how I feel
about these things.)
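The separate-thread idea could look something like the sketch below: writes enqueue entries and return immediately, while a background thread drains the queue and updates the index. The names and the map-as-index are illustrative assumptions, nothing more.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: index maintenance runs in its own thread so that writes
// (and searches) are never blocked by it.
class AsyncIndexer {
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    final ConcurrentHashMap<String, Boolean> index = new ConcurrentHashMap<>();

    AsyncIndexer() {
        Thread indexer = new Thread(() -> {
            try {
                while (true) {
                    // Drain the queue and update the index lazily.
                    index.put(pending.take(), Boolean.TRUE);
                }
            } catch (InterruptedException ignored) {
                // shutdown
            }
        });
        indexer.setDaemon(true);
        indexer.start();
    }

    // Returns immediately; the index catches up in the background.
    void write(String entry) { pending.add(entry); }
}
```

The trade-off is exactly the one described above: searches may briefly see a stale index, which is tolerable if, as in an MMR system, strict consistency is not guaranteed anyway.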

> 2). The transaction log should be pruned asynchronously removing operations
> that have been processed.  These operations can then be pumped into the CL,
> for audit history and snapshotting.

So we have:
- a transaction log, used for transactions and replication
- a change log, used to revert operations on demand
- a journal, used for the RDS only
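Point 2 above (pruning completed operations out of the txn log and pumping them into the CL) could be sketched as below. The asynchrony is omitted for clarity, and all names are illustrative assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch: completed operations are pruned from the head of the txn log
// and moved into the change log for audit history and snapshotting.
class LogPruner {
    static class Op {
        final String data;
        boolean done;       // set once the operation has been processed
        Op(String data) { this.data = data; }
    }

    final Deque<Op> txnLog = new ArrayDeque<>();
    final List<String> changeLog = new ArrayList<>();

    void prune() {
        // Remove leading completed ops; stop at the first unfinished one
        // so the txn log keeps everything still needed for replay.
        while (!txnLog.isEmpty() && txnLog.peekFirst().done) {
            changeLog.add(txnLog.removeFirst().data);
        }
    }
}
```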

The txn log and the journal are the same thing, no?

Alex