db-derby-dev mailing list archives

From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: RowLocation validation, for holdable SUR
Date Mon, 27 Feb 2006 17:05:01 GMT

Andreas Korneliussen wrote:
> Mike Matrigali wrote:
>> The SUR should not know anything about the underlying implementation
>> of the access method getting the row, so having it "read a timestamp"
>> from the page does not work. If the timestamp is not in the rowlocation,
>> we could add a call to get a timestamp for the row at this rowlocation, but
>> forcing two trips to the store for every row is an overhead.  Rather than discuss
>> implementation it would be nice to understand the minimum necessary
>> services needed to be provided by the access method.  Do the same 
>> interfaces need to be provided by VTIs?  At least
>> for your use I think the timestamp need only be guaranteed to differ,
>> after a truncate, from the previous version on the page.
>> Since you are ok with invalidating the SUR in the case of offline 
>> compress, what about invalidating the SUR in the case of online
>> compress also?  One way to do this is for the system catalogs to
>> maintain a table version number, which would be guaranteed to not
>> change while any sort of table intent lock was present.  Any operation
>> which either copied rows to another container or truncated the
>> table would bump the version number.  And holdable cursors would need
>> to recheck the version number after losing the lock at commit time.
> I think I could go for the following solution to invalidate the SUR in 
> case of online compress:
> - A sequence number is associated with each Container
> - The sequence number is updated when doing truncate
> A holdable cursor will need to reopen the controller after a commit, 
> since the controllers get closed at the end of the transaction (in 
> closeForEndTransaction(..)).
> When reopening a controller, one may check that the sequence number has 
> not changed since it was initially opened. If it has changed, one 
> can conclude that there has been an online compress, so updates cannot 
> be safely executed, and we may reject the reopen.
> Any attempt to update through a non-reopened controller will fail, and a 
> warning will be given (cursor operation conflict).
> This solution does not have the downside of requiring any changes to the 
> page layout, or RowLocation. It also does not have a cost per row. The 
> downside is that an online compress will prevent the cursor from 
> doing any update, even for rows which are unaffected by the truncate.
> Note: the ScrollInsensitiveResultSet does not need to know anything 
> about the sequence number.
> Andreas
This sounds like a good direction.

I was suggesting that the sequence number be maintained in the system
catalogs and owned by upper layer of the system.  It seems like you are
proposing the sequence number be owned by store.  If owned by store
I think I would describe the sequence number something like:
     An implementation specific long which will be changed to a never
     previously used number if the table undergoes a change which
     results in the possibility of a RowLocation which was previously
     allocated being reused in a container which was built requesting
     no RowLocation reuse.
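
To make that description concrete, here is a minimal sketch of a store-owned
sequence number and the check a holdable cursor could make when it reopens
after a commit. All class and method names (SketchContainer,
SketchHoldableCursor, reopenAfterCommit, and so on) are hypothetical and are
not existing Derby interfaces. The sketch also assumes an in-memory counter,
whereas a real implementation would have to persist the value.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a store container; not a Derby class. */
final class SketchContainer {
    // Only ever incremented, so each new value has never been used before.
    private final AtomicLong reuseSequence = new AtomicLong(0L);

    /** Value a cursor records when it first opens the container. */
    long currentReuseSequence() {
        return reuseSequence.get();
    }

    /** Models a truncate that could allow a RowLocation to be reused. */
    void truncate() {
        reuseSequence.incrementAndGet();
    }
}

/** Hypothetical holdable cursor that remembers the sequence number seen at open. */
final class SketchHoldableCursor {
    private final SketchContainer container;
    private final long sequenceAtOpen;
    private boolean updatable = true;

    SketchHoldableCursor(SketchContainer container) {
        this.container = container;
        this.sequenceAtOpen = container.currentReuseSequence();
    }

    /**
     * Called after a commit. Returns false if the container's sequence
     * number has changed, meaning saved RowLocations can no longer be trusted.
     * Once invalidated, the cursor stays invalidated.
     */
    boolean reopenAfterCommit() {
        updatable = updatable
                && container.currentReuseSequence() == sequenceAtOpen;
        return updatable;
    }

    boolean isUpdatable() {
        return updatable;
    }
}
```

In this sketch the check costs one comparison per reopen rather than anything
per row, at the price of invalidating the cursor for every row in the
container once a truncate has happened.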

Can you explain at what point, and in which part of the code, the
system checks that the sequence number has changed and then fails the
SUR?  If the check is made only for SUR, then SUR will have to query the
store after every commit.  If it is made only in store, then the closing
will affect existing holdable cursors.

>> The downside is that some SUR's are invalidated that didn't need to be,
>> but compress kicking in on a holdable cursor in the time between a 
>> commit and the next operation in the cursor is going to be a
>> rare event.  The upside is that there is no extra per row overhead in
>> the system for the normal case.
>> There already exists a DDL invalidation scheme for invalidating query
>> plans, maybe this existing structure could be used to invalidate
>> SUR's after the commit?
>> Andreas Korneliussen wrote:
>>> I will modify the suggestion somewhat. I think first, that offline 
>>> compress is not a problem, even for the holdable SUR. Since offline 
>>> compress moves the records to another container, the SUR cursors 
>>> should detect that the container they use is no longer valid when 
>>> renavigating to the row.
>>> If a client of store moves a row by deleting and inserting it 
>>> somewhere else, the SUR should not find the row when trying to 
>>> renavigate to it for update or delete, and can give an error.
>>> Our problem is the case where a row is inserted into the 
>>> container, and it gets the same RowLocation as a row which we have 
>>> read into the SUR. The row which we had previously read into the SUR, 
>>> must have been deleted and purged for this to happen.
>>> In addition, as far as I can see, for a new row to get the same 
>>> RowLocation as a row previously deleted and purged, the page for the 
>>> row, must have been truncated, and recreated.
>>> So then how can we detect that a page has been recreated? We could, 
>>> e.g., use a timestamp on the create/recreate time of the page. This 
>>> timestamp could be read by the SUR as it reads the RowLocation (so we 
>>> do not need to change the impl. of RowLocation), and again, we would 
>>> probably need to change the header for the page, so that we can store 
>>> the timestamp.
>>> Andreas
>>> Mike Matrigali wrote:
>>>> Some questions:
>>>> o row locations are stored in every index row.  Are you proposing a 
>>>> data level upgrade of every row in all databases?
>>>> o What is your proposal in the case of soft upgrade (note I believe not
>>>>   supporting "holdable" SUR in soft upgrade is an option).
>>>> o The hard case is the compress case that removes pages from a file, in
>>>>   this case there is no place to store the version number that you
>>>>   are relying on (the same problem in the current system why truncate 
>>>> can't support non-reusable rowlocations).
>>>> o Is it worth the on disk and in memory overhead to every row 
>>>> location to support holdable SUR?
>>>> I believe one of the operations you are trying to address is when a 
>>>> client of store moves a record by deleting and inserting it.  This is
>>>> what compress does today.  So if we start with row loc A pointing at
>>>> row A, and compress deletes row A and inserts it at row loc B.  In both
>>>> the current and new system access to A will return an error, but neither
>>>> will "know" that the row has been moved to a new ID.  Is this ok?
>>>> If the current system always supported non-reusable row id's, even in
>>>> the truncate case do you have what you need?  Again this will not 
>>>> prevent clients of store from moving a row by inserting and deleting
>>>> it somewhere else.
>>>> Andreas Korneliussen wrote:
>>>>> Following is a proposal to ensure that a client of store can verify 
>>>>> the validity of a RowLocation.  A RowLocation has become invalid if 
>>>>> a store operation has caused it to point to another row or to a 
>>>>> non-existent position (deleted row or non-existing page/record-id).
>>>>> I think we need a mechanism to detect that a RowLocation has become 
>>>>> invalid in order to implement *holdable* SUR.
>>>>> To do this, I would propose:
>>>>> - The RowLocation object should contain a version number for the page.
>>>>> - A version number should be stored in the header for a Page
>>>>> - Whenever an operation which may invalidate row-locations is 
>>>>> executed, the version number for the page is updated. These 
>>>>> operations include online/offline compress.
>>>>> - When navigating to a RowLocation which has an invalid version 
>>>>> number, the store may fail (i.e. return false)
>>>>> The page header for a stored page, currently has a number of fields 
>>>>> which are intended for future use, and it seems that it is possible 
>>>>> to use these fields without breaking backward compatibility.
>>>>> I noticed one of the fields in the header is named "generation" 
>>>>> (from StoredPage.java):
>>>>>      *  4 bytes integer    generation      generation number of this page(FUTURE USE)
>>>>>      *  4 bytes integer    prevGeneration  previous generation of page (FUTURE USE)
>>>>> Could I use the generation field for this, or has it been reserved 
>>>>> for something else? Alternatively, I could use one of the other 
>>>>> long fields reserved for future use.
>>>>> Any comments?
>>>>> Thanks
>>>>> --Andreas
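
For comparison with the container-level approach, the original per-page
proposal quoted above can be sketched as follows. The names
(VersionedRowLocation, SketchPageStore, invalidatePage, isValid) are
hypothetical and do not correspond to Derby's RowLocation or StoredPage code.
The sketch also keeps page versions in an in-memory map, which sidesteps the
hard case raised in this thread: when compress removes a page from the file,
there is no page header left in which to keep the version.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical row location carrying the page version seen when the row was read. */
final class VersionedRowLocation {
    final long pageNumber;
    final int recordId;
    final int pageVersion;

    VersionedRowLocation(long pageNumber, int recordId, int pageVersion) {
        this.pageNumber = pageNumber;
        this.recordId = recordId;
        this.pageVersion = pageVersion;
    }
}

/** Hypothetical store that tracks one version number per page (assumption: in memory). */
final class SketchPageStore {
    private final Map<Long, Integer> pageVersions = new HashMap<>();

    /** Builds a location stamped with the page's current version (0 if never bumped). */
    VersionedRowLocation locate(long pageNumber, int recordId) {
        return new VersionedRowLocation(
                pageNumber, recordId, pageVersions.getOrDefault(pageNumber, 0));
    }

    /** Models an operation, such as compress, that may invalidate locations on a page. */
    void invalidatePage(long pageNumber) {
        pageVersions.merge(pageNumber, 1, Integer::sum);
    }

    /** Navigation fails (returns false) when the stamped version is out of date. */
    boolean isValid(VersionedRowLocation location) {
        return pageVersions.getOrDefault(location.pageNumber, 0)
                == location.pageVersion;
    }
}
```

Unlike the container sequence number, this variant only invalidates locations
on the affected page, but it adds a version field to every row location, which
is the per-row overhead questioned earlier in the thread.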
