db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From <de...@segel.com>
Subject RE: "generated by default" question
Date Thu, 29 Jun 2006 20:06:22 GMT

> -----Original Message-----
> From: Daniel John Debrunner [mailto:djd@apache.org]
> Sent: Thursday, June 29, 2006 1:37 PM
> To: Derby Discussion
> Subject: Re: "generated by default" question
>  --- derby@segel.com wrote:
> >>
> >> There is a very simple solution.
> [solution details snipped]
> >>
> >> This is the simplest solution, lowest cost solution,
> >> and doesn't violate the
> >> definition of an SQL sequence.
> Tthis solution creates a couple different
> problems that are probably unexpected by the user.
Not really.

You need to think this through...

First, lets all be clear that we're talking about creating a solution that
will work in the general case. No matter what solution is implemented,
someone, somewhere at some time will figure out a way to break it. Having
said that... the goal is to find a solution that will work in the general
case under reasonable circumstances.

The current implementation of an Identity column in Derby is really more
problematic in that for any insert on a table that has an identity column,
you have to then trap for the specific exception of a non unique value in
the identity index and then figure out your next move....
(Not a good idea.)
What are you going to do? 
In each program, write the logic to reset the sequence number? (NOT GOOD)
In each program, loop until you can successfully insert a row? (NOT GOOD)

> 1) An insert with a non-generated value near or at the
> end of the range of generated values will exhaust the
> generator much earlier, leading to failed inserts that
> would succeed under the current scheme.
> E.g. if the range of of the generator was 1-10
> and the CBV was 3, then inserting a non-generated
> value of 10 means that no more inserts with generated
> values would work. In the current scheme six more
> inserts would succed (4-9).
[mjs] Ok, I was working under the assumption that you couldn't set the max
value of the sequence within Derby. A simple fix would be to check for the
condition "if the inserted value is larger than the maximum allowable
sequence number, then ignore..." 

Thus in your example, nothing happens to the sequence if you insert a row
with a value > 10. 

But your case does bring up an important point.
The simple solution to guarantee a unique value will only work for the first
cycle.  This is true of IBM's IDS too. Using a Serial8, if you insert a row
with a value of 2^63 -1, then you've gone through the first cycle.

In your example, if you inserted a row at 10, then you've hit the last value
in the sequence and if you reset the sequence number, you're going to have

So in the real world, if you have an integer value for your sequence, you
will have the ability to insert a lot of rows prior to having a problem.
Sure, you can have a joker who throws out the 2^63 -1 number, but that is
not the norm....

Remember we're looking for a solution that works best in most cases.

> 2) An insert with a a non-generated value that is
> subsequently deleted or updated before any insert with a
> generated value affects the generator when under the
> current scheme it would not. This will reduce the
> range of available values when there is no need to.
> E.g. if the range of of the generator was 1-10
> and the CBV was 3, then inserting a non-generated
> value of 10 means that no more inserts with generated
> values would work, even if the row with 10 is deleted
> before any other inserts occur. In the current scheme
> seven more inserts would succed (4-10).

Already solved your first example. Just reset the sequence if the row has a
value in the identity column between CBV and MAX. 
Oops! Guess that blows away your theory.

Of course you did point out another potential problem. What happens if you
update the value in the identity column. Again there you'd have to perform
the same check.

> So solved one scenario and caused (at least) two new ones.
> Dan.

Must be that "new math" that was hyped in the 80's and 90's at work...

The current implementation fails miserably. 

Suppose you create a table with an identity column and it has the "GENERATE
BY DEFAULT" option selected.

You insert rows 1... 10 using the default sequence generator. Next you have
someone inserting data from another source that already has values in the
identity column. Lets say starting at 100 and going all the way through

What happens when you want to use the sequence generator?
Well sure, you can now insert rows 11 through 99.
But then look what happens.
You then have to trap for the exception and retry to insert the row.
This has to be written in every single app that hits your database.

So if you have only 1 app hitting your database, it will be stuck trying to
insert a row in to the database for the 10,000 times trying to find a row.
(Now think about multiple apps. Lots of unnecessary overhead.)

Now that's really good logic at work.

Oh and what about having your app reset the sequence number after a failure?
Well, you run in to the same issue. You need to find the next available row.
There are a couple of ways, however my posted solution is the simplest and
fastest method. 

Hey don't take my word. Check out how other databases which are also SQL
compliant handle this.

The bottom line: If I can't trust Derby to auto generate a value and
successfully insert a row, then I can't rely on this function will not use

Seems to me, you're just too darn lazy to fix this bug.
(Note: IBM does sell support contracts for Derby/Cloudscape. If there is
anyone who has Cloudscape on their PPA, feel free to log a call with support
and make it a sev 1. ;-)

But hey! What do I know? 
I've been called a troll over this one. ;-)
Initial problem still exists. Proposed solution (with that mod) still holds


View raw message