From derby-user-return-4409-apmail-db-derby-user-archive=db.apache.org@db.apache.org Tue Jun 13 19:03:18 2006
Return-Path:
Delivered-To: apmail-db-derby-user-archive@www.apache.org
Received: (qmail 95911 invoked from network); 13 Jun 2006 19:03:18 -0000
Received: from hermes.apache.org (HELO mail.apache.org) (209.237.227.199)
by minotaur.apache.org with SMTP; 13 Jun 2006 19:03:18 -0000
Received: (qmail 83734 invoked by uid 500); 13 Jun 2006 19:03:17 -0000
Delivered-To: apmail-db-derby-user-archive@db.apache.org
Received: (qmail 83574 invoked by uid 500); 13 Jun 2006 19:03:16 -0000
Mailing-List: contact derby-user-help@db.apache.org; run by ezmlm
Precedence: bulk
list-help:
list-unsubscribe:
List-Post:
List-Id:
Reply-To: "Derby Discussion"
Delivered-To: mailing list derby-user@db.apache.org
Received: (qmail 83563 invoked by uid 99); 13 Jun 2006 19:03:16 -0000
Received: from asf.osuosl.org (HELO asf.osuosl.org) (140.211.166.49)
by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 13 Jun 2006 12:03:16 -0700
X-ASF-Spam-Status: No, hits=0.0 required=10.0
tests=
X-Spam-Check-By: apache.org
Received-SPF: pass (asf.osuosl.org: local policy)
Received: from [65.195.181.55] (HELO dbrack01.segel.com) (65.195.181.55)
by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 13 Jun 2006 12:03:15 -0700
Received: from dbrack01.segel.com (dbrack01.segel.com [65.195.181.55])
by dbrack01.segel.com (Postfix - We shoot spammers on site.) with ESMTP id EDAAA11F9A
for ; Tue, 13 Jun 2006 14:07:35 -0500 (CDT)
From: Michael Segel
Reply-To: msegel@segel.com
Organization: MSCC
To: "Derby Discussion"
Subject: Re: "generated by default" question
Date: Tue, 13 Jun 2006 14:07:35 -0500
User-Agent: KMail/1.8.2
References: <20060613174919.A814225181@dbrack01.segel.com> <57FCE110-EE5E-4261-91A7-9D80ACAB17DE@SUN.com>
In-Reply-To: <57FCE110-EE5E-4261-91A7-9D80ACAB17DE@SUN.com>
MIME-Version: 1.0
Content-Type: text/plain;
charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
Message-Id: <200606131407.35720.msegel@segel.com>
X-Virus-Checked: Checked by ClamAV on apache.org
X-Spam-Rating: minotaur.apache.org 1.6.2 0/1000/N
On Tuesday 13 June 2006 1:01 pm, Craig L Russell wrote:
> Hi Mikey,
>
[SNIP]
> > [mjs] Hi Craig,
> >
> > Errr. No.
> > In short, the sequence generation is outside of the transaction,
> > therefore
> > its possible to get a jump in the sequence number due to transactions
> > rolling back or individual inserts failing due to additional
> > constraints.
>
> Not my point.
>
> > Please note that I'm assuming an incremental value of 1 to prove
> > the point.
> > If the sequence use a different value for its increment, then you
> > will have
> > to adjust the logic a bit.
> >
> > Here's the section 4.21 language:"
> > When a row R is presented for insertion into
> > BT, if R does not contain a column corresponding to IC, then
> > the value
> > V for IC in the row inserted into BT is obtained by applying the
> > General Rules of Subclause 9.21, "Generation of the next value
> > of a
> > sequence generator", to SG."
>
> All this defines is the "default" instead of "always" behavior.
>
Yes, of course. What would happen if you inserted a row with your own value=
=20
and the identity column is using the *always* behavior? Either you get an=20
exception thrown, or you ignore the input value and just generates the=20
value.
If you're always generating the value, you can still get gaps, however, you=
'll=20
never generate a value that is already in use unless you cycle.
> > Ok so you're looking at what they term is a "general rule". (Not
> > going to
> > define what *that* means.
> >
> > However looking at 9.21,
> >
> > 9.21 defines a mathematical statement. And the solution under 9.21
> > is really
> > a solution set. There a potential of (MAX - CBV)/INC elements (and
> > the empty
> > set) that could be applied where CBV < MAX.
>
> ok.
>
> > Your suggestion violates this because the number of possible values
> > for N
> > will be >=3D 1 iff (MAX-CBV)/INC >1.
>
> no. What I'm saying is that the sequence generator is defined to
> return all values between the start and max value without skipping
> any. And at the risk of repeating myself, this is behavior of the
> sequence generator not the constraint on the column values.
>
Uhm, no.
9.21 doesn't say that.
Look at the math. =20
"3) If there exists a non-negative integer N such that SMIN <=3D CBV + N
=C2=A0 =C2=A0 =C2=A0 =C2=A0* INC <=3D SMAX and the value (CBV + N * INC) ha=
s not already been
=C2=A0 =C2=A0 =C2=A0 =C2=A0returned in the current cycle, then let V1 be (C=
BV + N *
=C2=A0 =C2=A0 =C2=A0 =C2=A0INC). Otherwise,
"
This is kind of important. It implies that you can not guarantee the order =
of=20
V1 being returned.=20
> > IF you return a sequence value N that meets the criteria, then
> > you're not
> > violating 9.21.
> >
> > Note again that this is a generalization and that implementations
> > of these
> > rules can vary.
> >
> > The first piece of the fix would be to do the following logic:
> >
> > If (row inserted contains a value R that corresponds to IC) {
> > After Insert:
> > If ( R > CBV) Then CBV=3DR; // Or R+1;
> > } else {
> > CBV +=3D INC; // INC =3D incremental value
> > }
> >
> > Sorry for the bad pseudo code.
> >
> > The idea is that you'll always be setting CBV to the max value when
> > someone
> > inserts a row with their own identity value that is greater than
> > the current
> > base value.
>
> You still haven't responded to my main point:
> As I read this part of the specification, it refers to the generation
> of the sequence number, and not to the usage. The trick phrase is
> "the value (CBV + N * INC) has not already been returned in the
> current cycle". As I understand "returned" it means returned by the
> sequence generator, and nothing to do with the usage as a column value.
>
Right.
But my point is that N can be any number between CBV and SMAX (Assuming INC=
is=20
1).
So you can get jumps in the value of the current cycle.
But yes, to your point, return does mean from the sequence generation.
(This is a gap in the spec. Unless there is more in the sections following=
=20
4.21)
One of my earlier points in a post is that there are gaps on how to impleme=
nt=20
identity columns and sequences. So error correction is left to the=20
implementor.
However, this is why I called this a bug/product defect/ design defect.
As I showed in my other post, Informix will not have a problem within the=20
first cycle. It does however have a problem in additional cycles.
My take is that you have to create an elegant solution that would not have =
a=20
negative impact on performance. Again, I believe that if you apply the=20
solution I outlined earlier, this will help in the first cycle. And its a=20
relatively quick fix. If you're going to allow for repeated cycles, then=20
you're going to have to try some additional steps.
> Simply put, the sequence generator must return all legal values of
> SMIN < return-value < SMAX before reusing a return-value.
>
> > This is what Informix appears to be doing with their Serial value.
> >
> > Of course when you cycle over, then you have to worry about the
> > issue what
> > happens if the row already has a value in it. And that's the next
> > issue.
> >
> > Sorry, but this is still a bug. The current version of Derby does
> > violate
> > the specs as stated here.
>
> Sorry, not convinced.
>
Uhm, OK This is why databases will never be a commodity. ;-)=20
(Its good that not everyone sees things the same way...)
I would suggest you get a mathematician to look at the language used.
If there exists a value N such that you can return V1 that doesn't throw an=
=20
exception due to the unique constraint on the identity column, then the=20
sequence should return that number. Or rather Derby should trap for this an=
d=20
determine how to find a value N if it exists. If N does not exist, then you=
=20
should indicate that the table is "full".
=46ailing to do so would mean that there is a design defect that needs to b=
e=20
corrected.
I believe that we both agree that there are gaps in the spec.
The truly sad thing is that as a developer, unless I use auto generated=20
identity columns, I can not count on Derby to correctly handle an insert.
Also note that I'm trying to be practical in that I'm not looking beyond th=
e=20
first cycle....
> Craig
>
> > Look at it this way. As long as a value N exists between the bounds
> > of CBV
> > and MAX, you will always have a sequence number.
> >
> > By following this logic, and going through a single cycle, you will
> > always
> > have either a value or an empty set when MAX - CBV < INC.
> >
> > -G
>
> Craig Russell
> Architect, Sun Java Enterprise System http://java.sun.com/products/jdo
> 408 276-5638 mailto:Craig.Russell@sun.com
> P.S. A good JDO? O, Gasp!