db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Kristian Waagan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DERBY-5367) Stale data retrieved when using new collation=TERRITORY_BASED:PRIMARY feature
Date Thu, 22 Sep 2011 14:10:27 GMT

     [ https://issues.apache.org/jira/browse/DERBY-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Kristian Waagan updated DERBY-5367:
-----------------------------------

    Attachment: derby-5367-4b-fix_with_optimization_improved.diff

Mike,

My replies below, to your comments above.

 o I've changed this a bit again, see patch 4b.
   I now detect if there are columns with other collations than UCS BASIC
   when the conglomerate is created or opened. By tacking on to
   ConglomerateUtil.readCollationIdArray this comes almost for free (need
   one more boolean field in the conglom classes though), and we just have
   to iterate through the ids once upon creation.
   Derby always has to do an if on hasCollatedTypes(). This could maybe be
   made final with some refactoring (not sure if that would matter though).

 o Yes, the new test I wrote, based on the bug report, will be included.
   It is now part of patch 4b.
   There are no other tests stressing this piece of code in suites.All.

 o Please report this. It looked suspicious to me the first time I saw it,
   but I don't have the knowledge to add any more context.
   FYI, the test jdbciapi.SURQueryMixTest triggers that piece of code.
   (SUR = scrollable updatable resultset)

 o Noted. It would be good to log a JIRA for this if not already done.

Now that it looks like there might be another RC, I'd like to get a fix for
this issue included. Unfortunately, I'll be away next week. I'll commit this
tomorrow if I don't get any pushback before then.
If I commit and things go awry, back out the fix (it will be a single commit only).

> Stale data retrieved when using new collation=TERRITORY_BASED:PRIMARY feature
> -----------------------------------------------------------------------------
>
>                 Key: DERBY-5367
>                 URL: https://issues.apache.org/jira/browse/DERBY-5367
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.8.1.2, 10.9.0.0
>         Environment: Mac OS X, Windows
>            Reporter: Brett Wooldridge
>            Assignee: Kristian Waagan
>            Priority: Critical
>         Attachments: derby-5367-1a-update_row_fully.diff, derby-5367-1b-update_row_fully.diff,
derby-5367-1b-update_row_fully.stat, derby-5367-2a-minimal_fix.diff, derby-5367-3a-update_field_by_field_preview.diff,
derby-5367-4a-fix_with_optimization_improved.diff, derby-5367-4b-fix_with_optimization_improved.diff
>
>
> Our product recently upgraded to version 10.8.1.2 in order to take advantage of the new
'case-insensitive' mode offered by Derby in the form of the "collation=TERRITORY_BASED:PRIMARY"
connection parameter.
> Unfortunately, we have run into an issue whereby stale data appears to be retrieved from
an index, even though the data in the table itself has changed.
> You can see this issue in the IJ session below.  The database in question was created
using this Java parameter when invoking IJ:
> -Dij.database=jdbc:derby:test;create=true;collation=TERRITORY_BASED:PRIMARY
> Here is the IJ session:
> CONNECTION0* - 	jdbc:derby:test
> * = current connection
> ij> CREATE TABLE tag (
>     tag_id INTEGER GENERATED BY DEFAULT AS IDENTITY NOT NULL,
>     tag VARCHAR(255) NOT NULL,
>     CONSTRAINT tag_pk PRIMARY KEY (tag_id),
>     CONSTRAINT tag_tag_unique UNIQUE (tag)
> );
> 0 rows inserted/updated/deleted
> ij> -- first insert a value 'Test', note the upper-case 'T' in 'Test'
> ij> INSERT INTO tag (tag) VALUES ('Test');
> 1 row inserted/updated/deleted
> ij> SELECT * FROM tag;
> TAG_ID     |TAG                                                                     
                                                       
> --------------------------------------------------------------------------------------------------------------------------------------------
> 1          |Test                                                                    
                                                       
> 1 row selected
> ij> -- Now delete the row
> ij> DELETE FROM tag WHERE tag='Test';
> 1 row inserted/updated/deleted
> ij> -- You could run another SELECT here to verify it is gone, but it is.
> ij> -- Now insert a new value 'test', note the lower-case 't' in 'test'
> ij> INSERT INTO tag (tag) VALUES ('test');
> 1 row inserted/updated/deleted
> ij> -- Now verify that the table contains only the lower-case version: 'test'
> ij> SELECT * FROM tag;
> TAG_ID     |TAG                                                                     
                                                       
> --------------------------------------------------------------------------------------------------------------------------------------------
> 2          |test                                                                    
                                                       
> 1 row selected
> ij> -- Now, here is the bug.
> ij> SELECT tag FROM tag;
> TAG                                                                                 
                                           
> --------------------------------------------------------------------------------------------------------------------------------
> Test                                                                                
                                           
> 1 row selected
> ij> 
> Note in the last SELECT we specify the 'tag' column specifically.  When we 'SELECT *',
Derby performs a table-scan and the result is correct.  However, when we 'SELECT tag', Derby
appears to use the index created for the 'tag_tag_unique' unique constraint.  As an optimization
Derby, like many databases, will use values directly from the index in the case where the
index covers all requested columns.
> The bigger question is, why doesn't the DELETE action cause the entry in the tag_tag_unique
index to be deleted?  Is this a further optimization?  If so, it is imperative that the index
at least be updated when the new value is inserted.
> This is rather a severe bug for us that causes stale data to be returned.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message