ofbiz-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From James McGill <james.mcg...@ableengineering.com>
Subject Re: QuoteId and OrderId sequence, possible race condition?
Date Fri, 21 May 2010 17:44:34 GMT
On Fri, May 21, 2010 at 1:03 AM, Matt Warnock <
mwarnock@ridgecrestherbals.com> wrote:

> Adrian:
> I had been following it (with some interest) from the start, but went
> back and read it again.  Also reviewed the JIRA issue, which I had not
> previously.  I have also reviewed the patch you submitted, though I am
> not altogether certain what "special" invoice order sequencing means in
> this context, and in any event, I am no XML or ofbiz expert (yet).  But
> is seems that IF "special" sequencing is in effect, then this race
> condition occurs.

I will try to explain what we mean by "special" here.

The Entity delegator has a facility to do sequence generation, per entity.
That utility may or may not have a concurrency problem in distributed
environments.  In any case, that's not being used here.

Consider a scenario where you have two companies that you are doing invoices
for.  Call them Company "AA" and company "BB", and let these be the

Let us introduce a business rule that requires invoices to be numbered in
the sequence "AA1, AA2 ..." and "BB1, BB2 ...".

This rule is implemented by OFBiz:  In PartyAcctgPreference, keyed by
PartyId, there is a flag "INVSQ_ENF_SEQ".  I think that stands for "Invoice
Sequence Enforce Sequential".  In that PartyAcctgPreference, the sequence
number for that Party is stored as "lastInvoiceNumber".

Let's walk through the single-thread scenario for createInvoice.

The service opens a transaction.  The first thing it does is call
createInvoiceId.  createInvoiceId also opens a transaction.  It looks up
PartyAcctgPreference for "AA".  It checks INVSQ_ENF_SEQ.  Seeing "true", it
then reads lastInvoiceNumber, increments it by one, writes it back to the
Entity, returns this value to createInvoice, committing its transaction.
createInvoiceId writes the Invoice and commits its transaction.   Everything
is fine in a single thread, of course.

In multiple threads, we see two threads  reading the same lastInvoiceNumber
for a party, even though createInvoiceId and createInvoice each runs under a
transaction.  To add to the puzzle, we see this happen even when those two
threads are more than a few seconds apart (an eternity).

I'm still not convinced that putting createInvoiceId in a synchronized
section will solve this problem, because the service already uses a
transaction, which means it is already treated as a critical section.   I
think that createInvoice itself should be exclusive to the thread as well,
since it looks like two threads enter createInvoice, then they both get the
same invoiceId from createInvoiceId even though that shouldn't be possible
because of the transaction.

Once this happens, it becomes impossible to use the system until
lastInvoiceNumber is incremented for the party.  I'm trying to write a test
case that triggers the bug so that we can show it's fixed by adding
synchronized to the minilang.

Now, MOST businesses will require that every invoice number is unique,
> and MANY will further require that all order/invoice/check numbers be
> accounted for, even on prenumbered forms (no "gaps") for auditing
> purposes.  Is that "special" to ofbiz?

No.  At the risk of being repetitive, consider the case with multiple
parties, each of which needs sequential invoices.  If the PK of Invoice were
simply a serial number, the normal sequence generator would work. But if we
need a serial number *per party*, then we need to generate a sequence per
party and use the partyId as a prefix.  (At least this is the way OFBiz does
it).   So this way, company AA can have invoiceId AA1, and company BB can
have invoiceId BB1, and these can each be used as a PK in the same column.

Alternatively, some businesses I have seen enforce a particular format
> on the invoice/order/document number (like OOOYYYYMMDDSSS, where OOO is
> the office number, followed by a date, and SSS is a sequence number
> within the day), and maybe that is where the trouble lies.  In the US,
> Social Security numbers are generated on a "special" pattern like this.

You would probably use "referenceNumber" for something like this.  Your
"SSS" component of your field would get you to the same place as the bug
we're discussing, given a partyId of "OOO".

You need an thread-safe/atomic method of generating SSS.  Even though
getInvoiceId runs in a transaction, two threads get the same SSS.

> In either event, I remain convinced, and on reviewing the thread, David
> Jones seems to agree, that this problem cannot be easily solved except
> with atomic SQL transactions.  Is there some reason they are not being
> used in this situation?  Or are they in use, and is the OP using a
> REALLY broken JDBC?


> Or are they in use, but a record write has
> inadvertently snuck outside the transaction?

That's what I'm wondering.

At our company, we have two people putting orders in all morning (and we
> are a very small company), and the chance of a collision in your
> scenario is NOT rare, but quite high indeed, as the OP has already
> pointed out, even on a single server with multiple users.

If you only generate them for a single PartyID and you don't have
you won't ever hit this part of the service.

> The issue is not that the sequence generator is "slow" to store the new
> invoice number, but rather that "last ID" read, the "new ID" generator,
> and the "new ID" write are not ALL encased inside a transaction that
> will ensure that the database is ALWAYS internally consistent.  The
> "race" occurs because there is no enclosing transaction, and the "race"
> will never be completely controlled by being "fast" enough.

But there supposedly is an enclosing transaction.  The services run in a
transaction by default, and just to be sure, I've explicitly set
use-transaction="true" for both.  So it's like this:


It looks like the right idea to me, but it ends up not working, and I
haven't been smart enough (so far) to figure out why.

> If there is a long lag between read and write (as where a user is
> entering order line items) then you either have to 1) do the
> read/increment/write at the END of the data entry process, or 2) face
> the high likelihood that two entry processes will collide, and one of
> them will have to roll back and then be renumbered.

I don't see a way to do try/retry looping in minilang.   I think you'd need
an ECA to handle the error condition, but I think there's nothing wrong with
Java (to sum up how I feel about minilang in general.)

> Most software I have seen is not very bright about this, and it reads
> and generates the new order number first, displaying it on the input
> screen as data is entered.

Please look at createInvoice and how it gets the Id.  There's no UI event
involved, and since services run in a transaction by default, it doesn't
really look like the design is wrong.  I might not even be right about the
source of the problem.  But it breaks about once a week, on both Quote and
Invoice (which use the same approach) and the log definitely shows two
threads getting the same lastId.

Whether you use a SQL standard "sequence" or not (or even the mysql
> "autoincrement" abomination), and regardless of the primary key status
> of the orderID (any SQL "unique" declaration could have the same effect,
> and a well designed order table should at LEAST have a unique orderID,
> whether it is part of the primary key or not), SQL transactions should
> resolve this issue easily and scalably.
> On the other hand, using locking or semaphores in the application code
> is reinventing the SQL wheel and fraught with possible error.  As David
> suggested, there may be multiple front ends talking to a single
> database, so a SQL solution is best.  I am merely respectfully
> suggesting that it would be better to let SQL do what it does best, and
> atomic transactions are the essence of good SQL.
> Hope this helps explain where I am coming from.

I hope you will look at the code in question and I hope you can explain why
transactions haven't worked.
MySQL5,  isolation-level="ReadCommitted".  I believe OFBiz is already taking
the approach you suggest, relying on SQL transactions to protect a critical
section, and it's not working for some reason.

Any or all of my assumptions could be wrong.  I just need to fix this so it
stops shutting us down.

James McGill
Phoenix AZ

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message