Bernt,
I believe that if you look at the spec, you'll find that the current
implementation within Derby is indeed a bug.
Its your interpretation of 9.21 that I'm calling in to question...
See comments below...
-Mike
> -----Original Message-----
> From: Bernt.Johnsen@Sun.COM [mailto:Bernt.Johnsen@Sun.COM]
> Sent: Monday, June 12, 2006 5:22 AM
> To: Derby Discussion; msegel@segel.com
I would hope to clarify something...
> The identity column as specified in the standard does not guarantee a
> unique value. It gives you a value according to a sequence
> generator. If used properly, then the user may get unique values, but
> he may also generate duplicates or violate constraints and get an
> exception.
>
[mjs]
I should clarify something...
If the user is attempting to write something that would break that would be
one thing. I'm talking about trying something that one should expect to
work....
> You may even write a cyclical identity column like this
>
> GENERATED BY DEFAULT AS IDENTITY (START WITH 0
> INCREMENT BY 1
> MAXVALUE 2
> CYCLE)
>
> Here the generated values for the identity column will be
> 0,1,2,0,1,2,0,1,2,0,1,2....... (better not have a UNIQUE constraint
> on the identity column :-)
>
[mjs]
Ok, That's fine. So after you hit a sequence value of 2, then you can no
longer insert a row in to the table. Depending on your ability to handle the
constraint and the attempt to insert a row, you would throw an exception.
But I'm talking about a situation where if a user inserts a row in to a
table and it has an identity value > than the current sequence, you'll throw
an exception as you increment the sequence and you'll rely on the user to
trap for the exception and to continue to try to insert the row.
That is to say, if the user inserts rows 10 through 10,000 and the sequence
is currently at 9, then the user's application will cycle 9990 times before
a successful insert.
So for every insert in to a table where the column could be a potential
identity column, you'd have a lot of coding to do.
> (Note: Derby does not support MAXVALUE and CYCLE. Derby does also
> require a comma befor INCREMENT).
>
[mjs]
That's fine, however, your example is a good one.
However, that doesn't mean that Derby doesn't have a design defect, and that
there are a couple of solutions that could be implemented in Derby to
attempt to perform the insert and if still impossible, then throw the
exception.
You just don't throw the exception when you first hit a row where the
sequence generates a number that is already in use.... (We'll get to that in
a second...)
> Chapter 4.14.17:
>
> The columns of a base table BT can optionally include not more than
> one identity column. The declared type of an identity column is either
> an exact numeric type with scale 0 (zero), INTEGER for example, or a
> distinct type whose source type is an exact numeric type with scale 0
> (zero). An identity column has a start value, an increment, a maximum
> value, a minimum value, and a cycle option. An identity column is
> associated with an internal sequence generator SG. Let IC be the
> identity column of BT. When a row R is presented for insertion into
> BT, if R does not contain a column corresponding to IC, then the value
> V for IC in the row inserted into BT is obtained by applying the
> General Rules of Subclause 9.21, "Generation of the next value of a
> sequence generator", to SG. The definition of an identity column may
> specify GENERATED ALWAYS or GENERATED BY DEFAULT.
> SG.
>
>
> And chapter 9.21:
>
> 9.21 Generation of the next value of a sequence generator
>
> Function
>
> Generate and return the next value of a sequence generator.
>
> Syntax Rules
>
> None.
>
> Access Rules
>
> None.
>
> General Rules
>
> 1) Let SEQ be the SEQUENCE specified in an application of this
> Subclause.
>
> 2) Let DT, CBV, INC, SMAX, and SMIN be the data type, current base
> value, increment, maximum value and minimum value, respectively, of
> SEQ.
>
> 3) If there exists a non-negative integer N such that SMIN <= CBV + N
> * INC <= SMAX and the value (CBV + N * INC) has not already been
> returned in the current cycle, then let V1 be (CBV + N *
> INC). Otherwise,
>
[mjs] BINGO!
> Case:
>
> a) If the cycle option of SEQ is NO CYCLE, then an exception
> condition is raised: data exception - sequence generator limit
> exceeded.
>
> b) Otherwise, a new cycle is initiated.
>
> Case:
>
> i) If SEQ is an ascending sequence generator, then let V1 be
> SMIN.
>
> ii) Otherwise, let V1 be SMAX.
>
> 4) Case:
>
> a) If SEQ is an ascending sequence generator, the current base
> value of SEQ is set to the value of the lowest non-issued value
> in the cycle.
>
> b) Otherwise, the current base value of SEQ is set to the highest
> non-issued value in the cycle.
>
> 5) V1 is returned as the RESULT.
>
> Conformance Rules
>
> None.
>
>
[mjs]
I believe the problem is in how you're interpreting clause 3):
"
3) If there exists a non-negative integer N such that SMIN <= CBV + N
* INC <= SMAX and the value (CBV + N * INC) has not already been
returned in the current cycle, then let V1 be (CBV + N *
INC). Otherwise, ...
"
-=-
It doesn't say what N is. That is to say...
Suppose you have a sequence 0,1,2,3,4 inserted so that the next number
should be 5. Yet suppose someone inserts a row with 5. Thus when you try to
use 5, you generate an error. In sub-section 3), N could =6 or any number >6
but less than the MAX value of an integer.
It can be interpreted that the sequence should attempt to generate N such
that it doesn't fail on the insert....
This is where the idea of selecting the MAX() value in the identity column
and incrementing it by one for the failed insert.
And that would be a compliant solution.
[SNIP]
> --
> Bernt Marius Johnsen, Database Technology Group,
> Staff Engineer, Technical Lead Derby/Java DB
> Sun Microsystems, Trondheim, Norway
[mjs]
-Mikey
PS,
Note: There are a couple of ways of implementing this logic so that you
don't have to trap for an insert error.
Hint: You could check on the insert and if the row contains a value in the
identity column, ICVal and ICVal > CBV, then after insert, CBV= ICVal;
This would ensure that if you ever manually insert a row and it is
successful, then the CBV (Current Base Value) will always be larger than the
max row.
Of course I'm assuming that the insert is atomic, however you could put a
mutex lock on the sequence while you're inserting a row where ICVal is >
than CBV ... that would solve the problem...
This actually would be the most "elegant" solution.
But hey! What do I know?
BTW, am I showing my age when I ask where you were when the Morris Worm hit
the net? ;-)