db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael Segel <mse...@segel.com>
Subject Re: "generated by default" question
Date Tue, 13 Jun 2006 19:07:35 GMT
On Tuesday 13 June 2006 1:01 pm, Craig L Russell wrote:
> Hi Mikey,
>
[SNIP]
> > [mjs] Hi Craig,
> >
> > Errr. No.
> > In short, the sequence generation is outside of the transaction,
> > therefore
> > its possible to get a jump in the sequence number due to transactions
> > rolling back or individual inserts failing due to additional
> > constraints.
>
> Not my point.
>
> > Please note that I'm assuming an incremental value of 1 to prove
> > the point.
> > If the sequence use a different value for its increment, then you
> > will have
> > to adjust the logic a bit.
> >
> > Here's the section 4.21 language:"
> >     When a row R is presented for insertion into
> >     BT, if R does not contain a column corresponding to IC, then
> > the value
> >     V for IC in the row inserted into BT is obtained by applying the
> >     General Rules of Subclause 9.21, "Generation of the next value
> > of a
> >     sequence generator", to SG."
>
> All this defines is the "default" instead of "always" behavior.
>
Yes, of course. What would happen if you inserted a row with your own value 
and the identity column is using the *always* behavior? Either you get an 
exception thrown,  or you ignore the input value and just generates the 
value.
If you're always generating the value, you can still get gaps, however, you'll 
never generate a value that is already in use unless you cycle.

> > Ok so you're looking at what they term is a "general rule". (Not
> > going to
> > define what *that* means.
> >
> > However looking at 9.21,
> >
> > 9.21 defines a mathematical statement. And the solution under 9.21
> > is really
> > a solution set. There a potential of (MAX - CBV)/INC elements (and
> > the empty
> > set) that could be applied where CBV < MAX.
>
> ok.
>
> > Your suggestion violates this because the number of possible values
> > for N
> > will be >= 1 iff (MAX-CBV)/INC >1.
>
> no. What I'm saying is that the sequence generator is defined to
> return all values between the start and max value without skipping
> any. And at the risk of repeating myself, this is behavior of the
> sequence generator not the constraint on the column values.
>
Uhm, no.
9.21 doesn't say that.
Look at the math.  
"3) If there exists a non-negative integer N such that SMIN <= CBV + N
       * INC <= SMAX and the value (CBV + N * INC) has not already been
       returned in the current cycle, then let V1 be (CBV + N *
       INC). Otherwise,
"

This is kind of important. It implies that you can not guarantee the order of 
V1 being returned. 

> > IF you return a sequence value N that meets the criteria, then
> > you're not
> > violating 9.21.
> >
> > Note again that this is a generalization and that implementations
> > of these
> > rules can vary.
> >
> > The first piece of the fix would be to do the following logic:
> >
> > 	If (row inserted contains a value R that corresponds to IC) {
> >             After Insert:
> >                 If ( R > CBV) Then CBV=R; // Or R+1;
> >       } else {
> >             CBV += INC; // INC = incremental value
> >       }
> >
> > Sorry for the bad pseudo code.
> >
> > The idea is that you'll always be setting CBV to the max value when
> > someone
> > inserts a row with their own identity value that is greater than
> > the current
> > base value.
>
> You still haven't responded to my main point:
> As I read this part of the specification, it refers to the generation
> of the sequence number, and not to the usage. The trick phrase is
> "the value (CBV + N * INC) has not already been returned in the
> current cycle". As I understand "returned" it means returned by the
> sequence generator, and nothing to do with the usage as a column value.
>
Right.
But my point is that N can be any number between CBV and SMAX (Assuming INC is 
1).
So you can get jumps in the value of the current cycle.

But yes, to your point, return does mean from the sequence generation.
(This is a gap in the spec. Unless there is more in the sections following 
4.21)

One of my earlier points in a post is that there are gaps on how to implement 
identity columns and sequences. So error correction is left to the 
implementor.

However, this is why I called this a bug/product defect/ design defect.

As I showed in my other post, Informix will not have a problem within the 
first cycle. It does however have a problem in additional cycles.

My take is that you have to create an elegant solution that would not have a 
negative impact on performance. Again, I believe that if you apply the 
solution I outlined earlier, this will help in the first cycle. And its a 
relatively quick fix. If you're going to allow for repeated cycles, then 
you're going to have to try some additional steps.


> Simply put, the sequence generator must return all legal values of
> SMIN < return-value < SMAX before reusing a return-value.
>
> > This is what Informix appears to be doing with their Serial value.
> >
> > Of course when you cycle over, then you have to worry about the
> > issue what
> > happens if the row already has a value in it. And that's the next
> > issue.
> >
> > Sorry, but this is still a bug. The current version of Derby does
> > violate
> > the specs as stated here.
>
> Sorry, not convinced.
>
Uhm, OK This is why databases will never be a commodity. ;-) 
(Its good that not everyone sees things the same way...)

I would suggest you get a mathematician to look at the language used.

If there exists a value N such that you can return V1 that doesn't throw an 
exception due to the unique constraint on the identity column, then the 
sequence should return that number. Or rather Derby should trap for this and 
determine how to find a value N if it exists. If N does not exist, then you 
should indicate that the table is "full".

Failing to do so would mean that there is a design defect that needs to be 
corrected.

I believe that we both agree that there are gaps in the spec.

The truly sad thing is that as a developer, unless I use auto generated 
identity columns, I can not count on Derby to correctly handle an insert.

Also note that I'm trying to be practical in that I'm not looking beyond the 
first cycle....

> Craig
>
> > Look at it this way. As long as a value N exists between the bounds
> > of CBV
> > and MAX, you will always have a sequence number.
> >
> > By following this logic, and going through a single cycle, you will
> > always
> > have either a value or an empty set when MAX - CBV < INC.
> >
> > -G
>
> Craig Russell
> Architect, Sun Java Enterprise System http://java.sun.com/products/jdo
> 408 276-5638 mailto:Craig.Russell@sun.com
> P.S. A good JDO? O, Gasp!

Mime
View raw message