QuoteId and OrderId sequence, possible race condition?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
25 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: QuoteId and OrderId sequence, possible race condition?

Adrian Crum-2
--- On Fri, 5/21/10, Matt Warnock <[hidden email]> wrote:

> Adrian:
>
> I had been following it (with some interest) from the
> start, but went
> back and read it again.  Also reviewed the JIRA issue,
> which I had not
> previously.  I have also reviewed the patch you
> submitted, though I am
> not altogether certain what "special" invoice order
> sequencing means in
> this context, and in any event, I am no XML or ofbiz expert
> (yet).  But
> is seems that IF "special" sequencing is in effect, then
> this race
> condition occurs.

The special sequencing can be found in the simple methods that are changed in the patch.

> Now, MOST businesses will require that every invoice number
> is unique,
> and MANY will further require that all order/invoice/check
> numbers be
> accounted for, even on prenumbered forms (no "gaps") for
> auditing
> purposes.  Is that "special" to ofbiz? 
>
> Alternatively, some businesses I have seen enforce a
> particular format
> on the invoice/order/document number (like OOOYYYYMMDDSSS,
> where OOO is
> the office number, followed by a date, and SSS is a
> sequence number
> within the day), and maybe that is where the trouble
> lies.  In the US,
> Social Security numbers are generated on a "special"
> pattern like this.
>
> In either event, I remain convinced, and on reviewing the
> thread, David
> Jones seems to agree, that this problem cannot be easily
> solved except
> with atomic SQL transactions.  Is there some reason
> they are not being
> used in this situation?  Or are they in use, and is
> the OP using a
> REALLY broken JDBC? Or are they in use, but a record write
> has
> inadvertently snuck outside the transaction?

No, atomic transactions are not being used in this case. The invoice/order ID generation code is in a simple method, and simple methods default to having their own transaction. So, the ID can be generated successfully, but the invoice/order creation could fail.

> At our company, we have two people putting orders in all
> morning (and we
> are a very small company), and the chance of a collision in
> your
> scenario is NOT rare, but quite high indeed, as the OP has
> already
> pointed out, even on a single server with multiple
> users.  But we would
> only find this out after rollout, not during single-user
> testing, and it
> would be an apparently random system lockup, which would be
> frustrating
> to say the least.  Kudos to the OP for spotting the
> real nature of the
> problem-- race conditions are often hard to identify. 
> :-)
>
> The issue is not that the sequence generator is "slow" to
> store the new
> invoice number, but rather that "last ID" read, the "new
> ID" generator,
> and the "new ID" write are not ALL encased inside a
> transaction that
> will ensure that the database is ALWAYS internally
> consistent.  The
> "race" occurs because there is no enclosing transaction,
> and the "race"
> will never be completely controlled by being "fast"
> enough.
>
> If there is a long lag between read and write (as where a
> user is
> entering order line items) then you either have to 1) do
> the
> read/increment/write at the END of the data entry process,
> or 2) face
> the high likelihood that two entry processes will collide,
> and one of
> them will have to roll back and then be renumbered. 
>
> Most software I have seen is not very bright about this,
> and it reads
> and generates the new order number first, displaying it on
> the input
> screen as data is entered.  That is not the
> SQL-friendly way, and asks
> for trouble.  Users like it, because they often need
> to write that
> number down on the data entry form--  they just need
> to be trained to
> get that order ID as the last thing they do. :(  I'm
> strongly hoping
> ofbiz will conform to better practices here, as it has
> elsewhere.
>
> Whether you use a SQL standard "sequence" or not (or even
> the mysql
> "autoincrement" abomination), and regardless of the primary
> key status
> of the orderID (any SQL "unique" declaration could have the
> same effect,
> and a well designed order table should at LEAST have a
> unique orderID,
> whether it is part of the primary key or not), SQL
> transactions should
> resolve this issue easily and scalably. 
>
> On the other hand, using locking or semaphores in the
> application code
> is reinventing the SQL wheel and fraught with possible
> error.  As David
> suggested, there may be multiple front ends talking to a
> single
> database, so a SQL solution is best.  I am merely
> respectfully
> suggesting that it would be better to let SQL do what it
> does best, and
> atomic transactions are the essence of good SQL. 
>
> Hope this helps explain where I am coming from.
>
> --
> Matt Warnock <[hidden email]>
> RidgeCrest Herbals, Inc.
>
> On Thu, 2010-05-20 at 22:26 -0700, Adrian Crum wrote:
> > Matt,
> >
> > If you follow the thread from the start and visit the
> Jira issue, you will see that we are discussing a very
> specific problem - not transactions or sequences in
> general.
> >
> > To summarize: There is code in OFBiz that generates
> sequential ID numbers for orders and invoices. That code
> does not follow the established pattern for generating
> sequences, and as a result it will produce non-sequential
> IDs. A Jira issue has been created and we are working on the
> problem.
> >
> > -Adrian
> >
> > --- On Thu, 5/20/10, Matt Warnock <[hidden email]>
> wrote:
> >
> > > From: Matt Warnock <[hidden email]>
> > > Subject: Re: QuoteId and OrderId sequence,
> possible race condition?
> > > To: [hidden email]
> > > Date: Thursday, May 20, 2010, 5:15 PM
> > > Isn't this exactly the kind of thing
> > > that SQL transactions, sequences,
> > > and foreign key constraints are designed to
> solve? 
> > >
> > > If the thread is using transactions properly, the
> whole
> > > transaction
> > > should either be rolled back, or completed. 
> Either
> > > way, the database
> > > SHOULD be in a wholly consistent state.  But
> it
> > > obviously isn't.
> > >
> > > So either we are doing something outside a
> transaction,
> > > that really
> > > should be done INSIDE a transaction, or the
> database has a
> > > seriously
> > > flawed implementation of transactions.
> > > --
> > > Matt Warnock <[hidden email]>
> > > RidgeCrest Herbals, Inc.
> > >
> > > On Thu, 2010-05-20 at 12:43 -0700, Adrian Crum
> wrote:
> > > > I doubt there will be any need for
> additional
> > > synchronization.
> > > >
> > > > I think I know what's going on in your
> system. My
> > > previous scenario
> > > > wasn't correct because I thought the ID was
> a part of
> > > the primary key
> > > > and there was an exception thrown while
> storing the
> > > new value - due to a
> > > > duplicate key violation. Those IDs are not a
> part of
> > > the primary key, so
> > > > that isn't the cause.
> > > >
> > > > Instead, I believe the ID is updated
> successfully, but
> > > a thread is slow
> > > > in storing the new ID value - which ends up
> > > decrementing the ID in the
> > > > database instead of incrementing it. Then
> the invoice
> > > generation fails
> > > > due to a duplicate key violation. From that
> point on,
> > > every attempt to
> > > > increment the ID value and create a new
> invoice with
> > > it will fail
> > > > because the new ID value already exists in
> the invoice
> > > file.
> > > >
> > > > -Adrian
> > > >
> > > > On 5/20/2010 11:57 AM, James McGill wrote:
> > > > > On Thu, May 20, 2010 at 10:46 AM,
> Adrian
> > > Crum<[hidden email]>
>
> > > wrote:
> > > > >
> > > > >> James,
> > > > >>
> > > > >> I uploaded a patch that might fix
> your
> > > problem:
> > > > >>
> > > > >>
> > > > > Thanks Adrian.  Your approach adds
> JVM-based
> > > synchronization to minilang
> > > > > simple methods.
> > > > >
> > > > > I still wonder if this means the create
> services
> > > need to be synchronized and
> > > > > not just the id generators.
> > > > >
> > > > > I'll try it out and keep you posted.
> > > > >
> > > > >
> > > > > James
> > > > >
> > >
> > >
> >
> >
> >       
>
>



Reply | Threaded
Open this post in threaded view
|

Re: QuoteId and OrderId sequence, possible race condition?

Matt Warnock
> On Fri, 2010-05-21 at 06:50 -0700, Adrian Crum wrote:
> No, atomic transactions are not being used in this case. The
>  invoice/order ID generation code is in a simple method, and simple
>  methods default to having their own transaction. So, the ID can be
>  generated successfully, but the invoice/order creation could fail.
>

Then the problem is that the actual data write is occurring outside the
simple method.  There are two options I can see.  Sorry for the
pseudocode, but you'll get the idea.

1) refactor the code so that the simple method encloses the document write:

simplemethod writeCustomDocument (customDocumentObject) {
        oldID = getMaxOldID();
        DocumentObject.ID = generateNewID(oldID);
        writeDocument(DocumentObject);
}

In this case, both getMaxOldID and GenerateNewID probably need to be
hooks to customizable code (or you just replace the whole simple
method, using the original as a template), since a custom ID generator
sequence may not be able to rely on generic data comparison tests to
determine the 'highest' ID already in the table.

2) You could write a largely empty record (only the newID is required),
and save the real data content later.  Once the ID is saved, there's no
risk of collision, but then you need to wonder what you do if the write
is abandoned before the real data ever gets written.  You may end up
with a bunch of mostly-empty records.  For the reasons outlined
previously, I think option 1) is better.

Nicholas Malin also raised some good points about multitenancy-- that
also complicates matters, but that probably can be encapsulated within
the custom code, I think, or a call to a generic method or SQL fragment
(think WHERE clause) that filters out documents from other tenants.

--
Matt Warnock <[hidden email]>
RidgeCrest Herbals, Inc.

> --- On Fri, 5/21/10, Matt Warnock <[hidden email]> wrote:
> > Adrian:
> >
> > I had been following it (with some interest) from the
> > start, but went
> > back and read it again.  Also reviewed the JIRA issue,
> > which I had not
> > previously.  I have also reviewed the patch you
> > submitted, though I am
> > not altogether certain what "special" invoice order
> > sequencing means in
> > this context, and in any event, I am no XML or ofbiz expert
> > (yet).  But
> > is seems that IF "special" sequencing is in effect, then
> > this race
> > condition occurs.
>
> The special sequencing can be found in the simple methods that are changed in the patch.
>
> > Now, MOST businesses will require that every invoice number
> > is unique,
> > and MANY will further require that all order/invoice/check
> > numbers be
> > accounted for, even on prenumbered forms (no "gaps") for
> > auditing
> > purposes.  Is that "special" to ofbiz?  
> >
> > Alternatively, some businesses I have seen enforce a
> > particular format
> > on the invoice/order/document number (like OOOYYYYMMDDSSS,
> > where OOO is
> > the office number, followed by a date, and SSS is a
> > sequence number
> > within the day), and maybe that is where the trouble
> > lies.  In the US,
> > Social Security numbers are generated on a "special"
> > pattern like this.
> >
> > In either event, I remain convinced, and on reviewing the
> > thread, David
> > Jones seems to agree, that this problem cannot be easily
> > solved except
> > with atomic SQL transactions.  Is there some reason
> > they are not being
> > used in this situation?  Or are they in use, and is
> > the OP using a
> > REALLY broken JDBC? Or are they in use, but a record write
> > has
> > inadvertently snuck outside the transaction?

> No, atomic transactions are not being used in this case. The
>  invoice/order ID generation code is in a simple method, and simple
>  methods default to having their own transaction. So, the ID can be
>  generated successfully, but the invoice/order creation could fail.
>
> > At our company, we have two people putting orders in all
> > morning (and we
> > are a very small company), and the chance of a collision in
> > your
> > scenario is NOT rare, but quite high indeed, as the OP has
> > already
> > pointed out, even on a single server with multiple
> > users.  But we would
> > only find this out after rollout, not during single-user
> > testing, and it
> > would be an apparently random system lockup, which would be
> > frustrating
> > to say the least.  Kudos to the OP for spotting the
> > real nature of the
> > problem-- race conditions are often hard to identify.
> > :-)
> >
> > The issue is not that the sequence generator is "slow" to
> > store the new
> > invoice number, but rather that "last ID" read, the "new
> > ID" generator,
> > and the "new ID" write are not ALL encased inside a
> > transaction that
> > will ensure that the database is ALWAYS internally
> > consistent.  The
> > "race" occurs because there is no enclosing transaction,
> > and the "race"
> > will never be completely controlled by being "fast"
> > enough.
> >
> > If there is a long lag between read and write (as where a
> > user is
> > entering order line items) then you either have to 1) do
> > the
> > read/increment/write at the END of the data entry process,
> > or 2) face
> > the high likelihood that two entry processes will collide,
> > and one of
> > them will have to roll back and then be renumbered.  
> >
> > Most software I have seen is not very bright about this,
> > and it reads
> > and generates the new order number first, displaying it on
> > the input
> > screen as data is entered.  That is not the
> > SQL-friendly way, and asks
> > for trouble.  Users like it, because they often need
> > to write that
> > number down on the data entry form--  they just need
> > to be trained to
> > get that order ID as the last thing they do. :(  I'm
> > strongly hoping
> > ofbiz will conform to better practices here, as it has
> > elsewhere.
> >
> > Whether you use a SQL standard "sequence" or not (or even
> > the mysql
> > "autoincrement" abomination), and regardless of the primary
> > key status
> > of the orderID (any SQL "unique" declaration could have the
> > same effect,
> > and a well designed order table should at LEAST have a
> > unique orderID,
> > whether it is part of the primary key or not), SQL
> > transactions should
> > resolve this issue easily and scalably.  
> >
> > On the other hand, using locking or semaphores in the
> > application code
> > is reinventing the SQL wheel and fraught with possible
> > error.  As David
> > suggested, there may be multiple front ends talking to a
> > single
> > database, so a SQL solution is best.  I am merely
> > respectfully
> > suggesting that it would be better to let SQL do what it
> > does best, and
> > atomic transactions are the essence of good SQL.  
> >
> > Hope this helps explain where I am coming from.
> >
> > --
> > Matt Warnock <[hidden email]>
> > RidgeCrest Herbals, Inc.
> >
> > On Thu, 2010-05-20 at 22:26 -0700, Adrian Crum wrote:
> > > Matt,
> > >
> > > If you follow the thread from the start and visit the
> > Jira issue, you will see that we are discussing a very
> > specific problem - not transactions or sequences in
> > general.
> > >
> > > To summarize: There is code in OFBiz that generates
> > sequential ID numbers for orders and invoices. That code
> > does not follow the established pattern for generating
> > sequences, and as a result it will produce non-sequential
> > IDs. A Jira issue has been created and we are working on the
> > problem.
> > >
> > > -Adrian
> > >
> > > --- On Thu, 5/20/10, Matt Warnock <[hidden email]>
> > wrote:
> > >
> > > > From: Matt Warnock <[hidden email]>
> > > > Subject: Re: QuoteId and OrderId sequence,
> > possible race condition?
> > > > To: [hidden email]
> > > > Date: Thursday, May 20, 2010, 5:15 PM
> > > > Isn't this exactly the kind of thing
> > > > that SQL transactions, sequences,
> > > > and foreign key constraints are designed to
> > solve?  
> > > >
> > > > If the thread is using transactions properly, the
> > whole
> > > > transaction
> > > > should either be rolled back, or completed.
> > Either
> > > > way, the database
> > > > SHOULD be in a wholly consistent state.  But
> > it
> > > > obviously isn't.
> > > >
> > > > So either we are doing something outside a
> > transaction,
> > > > that really
> > > > should be done INSIDE a transaction, or the
> > database has a
> > > > seriously
> > > > flawed implementation of transactions.
> > > > --
> > > > Matt Warnock <[hidden email]>
> > > > RidgeCrest Herbals, Inc.
> > > >
> > > > On Thu, 2010-05-20 at 12:43 -0700, Adrian Crum
> > wrote:
> > > > > I doubt there will be any need for
> > additional
> > > > synchronization.
> > > > >
> > > > > I think I know what's going on in your
> > system. My
> > > > previous scenario
> > > > > wasn't correct because I thought the ID was
> > a part of
> > > > the primary key
> > > > > and there was an exception thrown while
> > storing the
> > > > new value - due to a
> > > > > duplicate key violation. Those IDs are not a
> > part of
> > > > the primary key, so
> > > > > that isn't the cause.
> > > > >
> > > > > Instead, I believe the ID is updated
> > successfully, but
> > > > a thread is slow
> > > > > in storing the new ID value - which ends up
> > > > decrementing the ID in the
> > > > > database instead of incrementing it. Then
> > the invoice
> > > > generation fails
> > > > > due to a duplicate key violation. From that
> > point on,
> > > > every attempt to
> > > > > increment the ID value and create a new
> > invoice with
> > > > it will fail
> > > > > because the new ID value already exists in
> > the invoice
> > > > file.
> > > > >
> > > > > -Adrian
> > > > >
> > > > > On 5/20/2010 11:57 AM, James McGill wrote:
> > > > > > On Thu, May 20, 2010 at 10:46 AM,
> > Adrian
> > > > Crum<[hidden email]>
> >
> > > > wrote:
> > > > > >
> > > > > >> James,
> > > > > >>
> > > > > >> I uploaded a patch that might fix
> > your
> > > > problem:
> > > > > >>
> > > > > >>
> > > > > > Thanks Adrian.  Your approach adds
> > JVM-based
> > > > synchronization to minilang
> > > > > > simple methods.
> > > > > >
> > > > > > I still wonder if this means the create
> > services
> > > > need to be synchronized and
> > > > > > not just the id generators.
> > > > > >
> > > > > > I'll try it out and keep you posted.
> > > > > >
> > > > > >
> > > > > > James
> > > > > >
> > > >
> > > >
> > >
> > >
> > >      
> >
> >
>
>
>      

Reply | Threaded
Open this post in threaded view
|

Re: QuoteId and OrderId sequence, possible race condition?

James McGill-5
In reply to this post by Matt Warnock
On Fri, May 21, 2010 at 1:03 AM, Matt Warnock <
[hidden email]> wrote:

> Adrian:
>
> I had been following it (with some interest) from the start, but went
> back and read it again.  Also reviewed the JIRA issue, which I had not
> previously.  I have also reviewed the patch you submitted, though I am
> not altogether certain what "special" invoice order sequencing means in
> this context, and in any event, I am no XML or ofbiz expert (yet).  But
> is seems that IF "special" sequencing is in effect, then this race
> condition occurs.
>

I will try to explain what we mean by "special" here.

The Entity delegator has a facility to do sequence generation, per entity.
That utility may or may not have a concurrency problem in distributed
environments.  In any case, that's not being used here.

Consider a scenario where you have two companies that you are doing invoices
for.  Call them Company "AA" and company "BB", and let these be the
partyIds.

Let us introduce a business rule that requires invoices to be numbered in
the sequence "AA1, AA2 ..." and "BB1, BB2 ...".

This rule is implemented by OFBiz:  In PartyAcctgPreference, keyed by
PartyId, there is a flag "INVSQ_ENF_SEQ".  I think that stands for "Invoice
Sequence Enforce Sequential".  In that PartyAcctgPreference, the sequence
number for that Party is stored as "lastInvoiceNumber".

Let's walk through the single-thread scenario for createInvoice.
applications/accounting/script/org/ofbiz/accounting/invoice/InvoiceServices.xml#createInvoice

The service opens a transaction.  The first thing it does is call
createInvoiceId.  createInvoiceId also opens a transaction.  It looks up
PartyAcctgPreference for "AA".  It checks INVSQ_ENF_SEQ.  Seeing "true", it
then reads lastInvoiceNumber, increments it by one, writes it back to the
Entity, returns this value to createInvoice, committing its transaction.
createInvoiceId writes the Invoice and commits its transaction.   Everything
is fine in a single thread, of course.

In multiple threads, we see two threads  reading the same lastInvoiceNumber
for a party, even though createInvoiceId and createInvoice each runs under a
transaction.  To add to the puzzle, we see this happen even when those two
threads are more than a few seconds apart (an eternity).

I'm still not convinced that putting createInvoiceId in a synchronized
section will solve this problem, because the service already uses a
transaction, which means it is already treated as a critical section.   I
think that createInvoice itself should be exclusive to the thread as well,
since it looks like two threads enter createInvoice, then they both get the
same invoiceId from createInvoiceId even though that shouldn't be possible
because of the transaction.

Once this happens, it becomes impossible to use the system until
lastInvoiceNumber is incremented for the party.  I'm trying to write a test
case that triggers the bug so that we can show it's fixed by adding
synchronized to the minilang.



Now, MOST businesses will require that every invoice number is unique,
> and MANY will further require that all order/invoice/check numbers be
> accounted for, even on prenumbered forms (no "gaps") for auditing
> purposes.  Is that "special" to ofbiz?
>

No.  At the risk of being repetitive, consider the case with multiple
parties, each of which needs sequential invoices.  If the PK of Invoice were
simply a serial number, the normal sequence generator would work. But if we
need a serial number *per party*, then we need to generate a sequence per
party and use the partyId as a prefix.  (At least this is the way OFBiz does
it).   So this way, company AA can have invoiceId AA1, and company BB can
have invoiceId BB1, and these can each be used as a PK in the same column.


Alternatively, some businesses I have seen enforce a particular format
> on the invoice/order/document number (like OOOYYYYMMDDSSS, where OOO is
> the office number, followed by a date, and SSS is a sequence number
> within the day), and maybe that is where the trouble lies.  In the US,
> Social Security numbers are generated on a "special" pattern like this.
>

You would probably use "referenceNumber" for something like this.  Your
"SSS" component of your field would get you to the same place as the bug
we're discussing, given a partyId of "OOO".

You need an thread-safe/atomic method of generating SSS.  Even though
getInvoiceId runs in a transaction, two threads get the same SSS.


> In either event, I remain convinced, and on reviewing the thread, David
> Jones seems to agree, that this problem cannot be easily solved except
> with atomic SQL transactions.  Is there some reason they are not being
> used in this situation?  Or are they in use, and is the OP using a
> REALLY broken JDBC?


mysql-connector-java-5.1.8-bin.jar


> Or are they in use, but a record write has
> inadvertently snuck outside the transaction?
>

That's what I'm wondering.

At our company, we have two people putting orders in all morning (and we
> are a very small company), and the chance of a collision in your
> scenario is NOT rare, but quite high indeed, as the OP has already
> pointed out, even on a single server with multiple users.


If you only generate them for a single PartyID and you don't have
orderSequenceEnumId=ODRSQ_ENF_SEQ
you won't ever hit this part of the service.


> The issue is not that the sequence generator is "slow" to store the new
> invoice number, but rather that "last ID" read, the "new ID" generator,
> and the "new ID" write are not ALL encased inside a transaction that
> will ensure that the database is ALWAYS internally consistent.  The
> "race" occurs because there is no enclosing transaction, and the "race"
> will never be completely controlled by being "fast" enough.
>

But there supposedly is an enclosing transaction.  The services run in a
transaction by default, and just to be sure, I've explicitly set
use-transaction="true" for both.  So it's like this:

begin(Tx1)
  createInvoice
  begin(Tx2)
    getInvoiceId
  commit(Tx2)
commit(Tx1)

It looks like the right idea to me, but it ends up not working, and I
haven't been smart enough (so far) to figure out why.


> If there is a long lag between read and write (as where a user is
> entering order line items) then you either have to 1) do the
> read/increment/write at the END of the data entry process, or 2) face
> the high likelihood that two entry processes will collide, and one of
> them will have to roll back and then be renumbered.
>

I don't see a way to do try/retry looping in minilang.   I think you'd need
an ECA to handle the error condition, but I think there's nothing wrong with
Java (to sum up how I feel about minilang in general.)


> Most software I have seen is not very bright about this, and it reads
> and generates the new order number first, displaying it on the input
> screen as data is entered.


Please look at createInvoice and how it gets the Id.  There's no UI event
involved, and since services run in a transaction by default, it doesn't
really look like the design is wrong.  I might not even be right about the
source of the problem.  But it breaks about once a week, on both Quote and
Invoice (which use the same approach) and the log definitely shows two
threads getting the same lastId.

Whether you use a SQL standard "sequence" or not (or even the mysql

> "autoincrement" abomination), and regardless of the primary key status
> of the orderID (any SQL "unique" declaration could have the same effect,
> and a well designed order table should at LEAST have a unique orderID,
> whether it is part of the primary key or not), SQL transactions should
> resolve this issue easily and scalably.
>
> On the other hand, using locking or semaphores in the application code
> is reinventing the SQL wheel and fraught with possible error.  As David
> suggested, there may be multiple front ends talking to a single
> database, so a SQL solution is best.  I am merely respectfully
> suggesting that it would be better to let SQL do what it does best, and
> atomic transactions are the essence of good SQL.
>
> Hope this helps explain where I am coming from.
>

I hope you will look at the code in question and I hope you can explain why
transactions haven't worked.
MySQL5,  isolation-level="ReadCommitted".  I believe OFBiz is already taking
the approach you suggest, relying on SQL transactions to protect a critical
section, and it's not working for some reason.

Any or all of my assumptions could be wrong.  I just need to fix this so it
stops shutting us down.

--
James McGill
Phoenix AZ
Reply | Threaded
Open this post in threaded view
|

Re: QuoteId and OrderId sequence, possible race condition?

Matt Warnock
Thank you for the explanation, it is largely as I expected, but I am
stil learning my way around the ofbiz way.  I appreciate the help.

See inline comments below.
--
Matt Warnock <[hidden email]>
RidgeCrest Herbals, Inc.

On Fri, 2010-05-21 at 10:44 -0700, James McGill wrote:
> I will try to explain what we mean by "special" here.
>
> The Entity delegator has a facility to do sequence generation, per entity.
> That utility may or may not have a concurrency problem in distributed
> environments.  In any case, that's not being used here.
>
> Consider a scenario where you have two companies that you are doing invoices
> for.  Call them Company "AA" and company "BB", and let these be the
> partyIds.

Seems like this might arise even for separate warehouses of the same
company, if each has a separate invoice sequence.

>
> Let us introduce a business rule that requires invoices to be numbered in
> the sequence "AA1, AA2 ..." and "BB1, BB2 ...".
>
> This rule is implemented by OFBiz:  In PartyAcctgPreference, keyed by
> PartyId, there is a flag "INVSQ_ENF_SEQ".  I think that stands for "Invoice
> Sequence Enforce Sequential".  In that PartyAcctgPreference, the sequence
> number for that Party is stored as "lastInvoiceNumber".
>
> Let's walk through the single-thread scenario for createInvoice.
> applications/accounting/script/org/ofbiz/accounting/invoice/InvoiceServices.xml#createInvoice
>
> The service opens a transaction.  The first thing it does is call
> createInvoiceId.  createInvoiceId also opens a transaction.  It looks up
> PartyAcctgPreference for "AA".  It checks INVSQ_ENF_SEQ.  Seeing "true", it
> then reads lastInvoiceNumber, increments it by one, writes it back to the
> Entity, returns this value to createInvoice, committing its transaction.
> createInvoiceId writes the Invoice and commits its transaction.   Everything
> is fine in a single thread, of course.

An here is where it breaks.  Nested transactions are one of the dicey
areas of SQL.  Some implementations don't even support them.  PostgreSQL
added this feature only in the last few years, and I don't know if I'd
trust it on a multi-platform design like ofbiz.  If it was all in one
transaction, we'd be fine, I think, but it has to be the outer one.  If
the inner transaction commits, it should be provisional, and should ALSO
roll back if the outer one does.

>
> In multiple threads, we see two threads  reading the same lastInvoiceNumber
> for a party, even though createInvoiceId and createInvoice each runs under a
> transaction.  To add to the puzzle, we see this happen even when those two
> threads are more than a few seconds apart (an eternity).
>
> I'm still not convinced that putting createInvoiceId in a synchronized
> section will solve this problem, because the service already uses a
> transaction, which means it is already treated as a critical section.   I
> think that createInvoice itself should be exclusive to the thread as well,
> since it looks like two threads enter createInvoice, then they both get the
> same invoiceId from createInvoiceId even though that shouldn't be possible
> because of the transaction.
>
> Once this happens, it becomes impossible to use the system until
> lastInvoiceNumber is incremented for the party.  I'm trying to write a test
> case that triggers the bug so that we can show it's fixed by adding
> synchronized to the minilang.
>
Another issue may be the fact that we are keeping a lastInvoiceNumber
separate from the table.  That seems like it might violate normalization
rules.  Why not look up the lastInvoiceNumber value in the Invoice table
itself?  As long as the lookup, increment, and write operations are all
in the same transaction, you should be OK.  If the ID field is properly
indexed, a "SELECT invoiceID FROM invoice DESCENDING LIMIT 1 WHERE
partyID = ?" shouldn't be enough slower than "SELECT lastInvoiceNumber
FROM AccountingPref WHERE partyID = ?" to make the difference
worthwhile.  MySQL's non-standard "autoincrement" seems to do this
internally.  While I am not a fan of the fact that it is non-standard,
it does have some advantages.  Stored procedures could do the same thing
in a more standard way.  

>
> Now, MOST businesses will require that every invoice number is unique,
> > and MANY will further require that all order/invoice/check numbers be
> > accounted for, even on prenumbered forms (no "gaps") for auditing
> > purposes.  Is that "special" to ofbiz?
> >
>
> No.  At the risk of being repetitive, consider the case with multiple
> parties, each of which needs sequential invoices.  If the PK of Invoice were
> simply a serial number, the normal sequence generator would work. But if we
> need a serial number *per party*, then we need to generate a sequence per
> party and use the partyId as a prefix.  (At least this is the way OFBiz does
> it).   So this way, company AA can have invoiceId AA1, and company BB can
> have invoiceId BB1, and these can each be used as a PK in the same column.
>
Which is fine.  

>
> Alternatively, some businesses I have seen enforce a particular format
> > on the invoice/order/document number (like OOOYYYYMMDDSSS, where OOO is
> > the office number, followed by a date, and SSS is a sequence number
> > within the day), and maybe that is where the trouble lies.  In the US,
> > Social Security numbers are generated on a "special" pattern like this.
> >
>
> You would probably use "referenceNumber" for something like this.  Your
> "SSS" component of your field would get you to the same place as the bug
> we're discussing, given a partyId of "OOO".
>
> You need an thread-safe/atomic method of generating SSS.  Even though
> getInvoiceId runs in a transaction, two threads get the same SSS.
>
I think we are on the same page here. I was envisioning multiple
offices, like 528 for South California and 552 for Utah, to borrow from
the Social Security number case.

>
> > In either event, I remain convinced, and on reviewing the thread, David
> > Jones seems to agree, that this problem cannot be easily solved except
> > with atomic SQL transactions.  Is there some reason they are not being
> > used in this situation?  Or are they in use, and is the OP using a
> > REALLY broken JDBC?
>
>
> mysql-connector-java-5.1.8-bin.jar
>
>
> > Or are they in use, but a record write has
> > inadvertently snuck outside the transaction?
> >
>
> That's what I'm wondering.

As of MySQL 5.5 at least, transactions cannot be nested.
http://dev.mysql.com/doc/refman/5.5/en/implicit-commit.html

>
> At our company, we have two people putting orders in all morning (and we
> > are a very small company), and the chance of a collision in your
> > scenario is NOT rare, but quite high indeed, as the OP has already
> > pointed out, even on a single server with multiple users.
>
>
> If you only generate them for a single PartyID and you don't have
> orderSequenceEnumId=ODRSQ_ENF_SEQ
> you won't ever hit this part of the service.
>
Understood.

>
> > The issue is not that the sequence generator is "slow" to store the new
> > invoice number, but rather that "last ID" read, the "new ID" generator,
> > and the "new ID" write are not ALL encased inside a transaction that
> > will ensure that the database is ALWAYS internally consistent.  The
> > "race" occurs because there is no enclosing transaction, and the "race"
> > will never be completely controlled by being "fast" enough.
> >
>
> But there supposedly is an enclosing transaction.  The services run in a
> transaction by default, and just to be sure, I've explicitly set
> use-transaction="true" for both.  So it's like this:
>
> begin(Tx1)
>   createInvoice
>   begin(Tx2)
>     getInvoiceId
>   commit(Tx2)
> commit(Tx1)
>
> It looks like the right idea to me, but it ends up not working, and I
> haven't been smart enough (so far) to figure out why.
>
I'm almost certain nested transactions are the issue.

>
> > If there is a long lag between read and write (as where a user is
> > entering order line items) then you either have to 1) do the
> > read/increment/write at the END of the data entry process, or 2) face
> > the high likelihood that two entry processes will collide, and one of
> > them will have to roll back and then be renumbered.
> >
>
> I don't see a way to do try/retry looping in minilang.   I think you'd need
> an ECA to handle the error condition, but I think there's nothing wrong with
> Java (to sum up how I feel about minilang in general.)
>
>
> > Most software I have seen is not very bright about this, and it reads
> > and generates the new order number first, displaying it on the input
> > screen as data is entered.
>
>
> Please look at createInvoice and how it gets the Id.  There's no UI event
> involved, and since services run in a transaction by default, it doesn't
> really look like the design is wrong.  I might not even be right about the
> source of the problem.  But it breaks about once a week, on both Quote and
> Invoice (which use the same approach) and the log definitely shows two
> threads getting the same lastId.
>
> Whether you use a SQL standard "sequence" or not (or even the mysql
> > "autoincrement" abomination), and regardless of the primary key status
> > of the orderID (any SQL "unique" declaration could have the same effect,
> > and a well designed order table should at LEAST have a unique orderID,
> > whether it is part of the primary key or not), SQL transactions should
> > resolve this issue easily and scalably.
> >
> > On the other hand, using locking or semaphores in the application code
> > is reinventing the SQL wheel and fraught with possible error.  As David
> > suggested, there may be multiple front ends talking to a single
> > database, so a SQL solution is best.  I am merely respectfully
> > suggesting that it would be better to let SQL do what it does best, and
> > atomic transactions are the essence of good SQL.
> >
> > Hope this helps explain where I am coming from.
> >
>
> I hope you will look at the code in question and I hope you can explain why
> transactions haven't worked.
> MySQL5,  isolation-level="ReadCommitted".  I believe OFBiz is already taking
> the approach you suggest, relying on SQL transactions to protect a critical
> section, and it's not working for some reason.
>
> Any or all of my assumptions could be wrong.  I just need to fix this so it
> stops shutting us down.
>

Reply | Threaded
Open this post in threaded view
|

Re: QuoteId and OrderId sequence, possible race condition?

Matt Warnock
> > The service opens a transaction.  The first thing it does is call
> > createInvoice The service opens a transaction.  The first thing it does is call
> > > createInvoiceId.  createInvoiceId also opens a transaction.  It looks up
> > > PartyAcctgPreference for "AA".  It checks INVSQ_ENF_SEQ.  Seeing "true", it
> > > then reads lastInvoiceNumber, increments it by one, writes it back to the
> > > Entity, returns this value to createInvoice, committing its transaction.
> > > createInvoiceId writes the Invoice and commits its transaction.   Everything
> > > is fine in a single thread, of course.
> > Id.  createInvoiceId also opens a transaction.  It looks up
> > PartyAcctgPreference for "AA".  It checks INVSQ_ENF_SEQ.  Seeing "true", it
> > then reads lastInvoiceNumber, increments it by one, writes it back to the
> > Entity, returns this value to createInvoice, committing its transaction.
> > createInvoiceId writes the Invoice and commits its transaction.   Everything
> > is fine in a single thread, of course.

Sounds like there are actually 3 nested transactions here.  Big problem
for MySQL.  For the short run, I would probably try refactoring the
createInvoiceID into the createInvoice code, and see if I could turn
off the outermost enclosing service transaction.  You could also test
it with PostgreSQL or something that at least purports to support
nested transactions, but I'd still not trust nested transactions, if I
were writing the code.

Wish I could be more help, but I don't know minilang and ofbiz like
SQL, yet.

--
Matt Warnock <[hidden email]>
RidgeCrest Herbals, Inc.



12