Hi Deyan,
Thanks for your clear explanation and suggestion. As I'm busy with other things, I have quickly put your comment as a quote in OFBIZ-3333.

Jacques

----- Original Message -----
From: Deyan Tsvetanov
To: Jacques Le Roux
Cc: Hans Bakker ; [hidden email]
Sent: Tuesday, April 06, 2010 9:42 AM
Subject: Re: ofbiz entity sync.

Hi Sirs,

I'm sorry, I got lost for some time :)
Things happen :)

There was a lot of discussion on the topic, but here is the summary.

There are some major issues that CAN be fixed in the current implementation: error recovery after an interrupted connection between the two servers, sync speed (RMI can be removed), etc.

There is however a major issue that CAN NOT be fixed in the current implementation: the list and sequence of entities to be synchronized is built from the entities' timestamps - date_created_tx and last_update_tx. It works as long as the clocks of all the syncing parties are in sync. You can easily achieve this by using NTP, for example - reliable enough. But if the clock of one of the parties gets un-synced for just a few minutes, and during those few minutes records get inserted or updated, then you are in trouble. Syncing the clock back won't help you, because you won't be able to sync the broken records due to foreign key constraint issues. I could give examples, but I guess you can think of some yourselves :)

So IMHO the best approach for synchronization is not the timestamp but the TRANSACTION LOG. This approach is used in all major databases - m$ $ql, Oracle.

For a customer I've implemented a transaction log based on triggers and stored procedures. However, I implemented the transaction log, triggers and stored procedures only for PostgreSQL, as that was the customer's database. It's easy to implement an MS SQL or Oracle version though. It works perfectly, much much much faster than RMI, recovers if the sync process is interrupted, etc. My goal was to implement this mechanism using entity-engine-level triggers and eventually commit it, but unfortunately I'm still pretty busy with other things, so we don't have resources that can be dedicated to that task at the moment - we're trying to work out the world financial crisis :)

So if you find what I say reasonable, you could go ahead with the triggers and SPs. For that you need a database that supports triggers - so MySQL won't work :)

That was just the first part.

The second part is to identify all the tables that you need to synchronize. Some of them will only be pulled, some only pushed, and some synced in both directions. Next you need to test, reset the database and test again and again until you identify the correct list of tables, so your sync process doesn't end up with FK insert / update errors. That is a pretty easy but time-consuming task - it takes a few days to complete :)

So that's all I can say for now, without getting you bored with details :)
If you have other questions - go ahead :)

Cheers,
Deyan

-----Original Message-----
From: Jacques Le Roux <[hidden email]>
Reply-to: "Jacques Le Roux" <[hidden email]>
To: Hans Bakker <[hidden email]>, [hidden email], [hidden email]
Subject: Re: ofbiz entity sync.
Date: Sat, 3 Apr 2010 10:04:29 +0200

Hi Hans,

I put a comment with an abstract of the situation. Please let me know what you think, to get the ball rolling.

Thanks

Jacques

From: "Hans Bakker" <[hidden email]>
> Hi Gentlemen,
>
> may I ask if there was any progress on this issue?
>
> https://issues.apache.org/jira/browse/OFBIZ-3333
>
> I added the following comment:
>
> We have a customer using entity sync and experiencing the problems in
> this issue.
>
> We are interested in working with other parties on this subject. If there
> is already substantial work done, we are interested in compensating
> financially for it.
>
> Please let me know if there is any progress here.
>
> regards,
> Hans
>
> --
> http://www.antwebsystems.com :
> Quality OFBiz support for competitive rates....
Jacques Le Roux wrote:
> Hi Deyan,
>
> Thanks for your clear explanation and suggestion. As I'm busy with other things, I have quickly put your comment as a quote in OFBIZ-3333.
>
> There is however a major issue that CAN NOT be fixed in the current implementation: the list and sequence of entities to be synchronized is built from the entities' timestamps - date_created_tx and last_update_tx. It works as long as the clocks of all the syncing parties are in sync. You can easily achieve this by using NTP, for example - reliable enough. But if the clock of one of the parties gets un-synced for just a few minutes, and during those few minutes records get inserted or updated, then you are in trouble. Syncing the clock back won't help you, because you won't be able to sync the broken records due to foreign key constraint issues.
>
> So IMHO the best approach for synchronization is not the timestamp but the TRANSACTION LOG. This approach is used in all major databases - m$ $ql, Oracle.

The problem with that approach is that it is database-specific. As an alternative, you could query the other servers for their current time and generate a set of offsets to adjust the timing.

-Adrian
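A rough SQL sketch of the offset idea Adrian describes: read the remote server's clock, compare it with the local clock, and shift the timestamp cutoff used by the sync query. The table and column names below are only illustrative (they follow the names used in the thread, not necessarily the exact OFBiz schema), and the bind parameters are placeholders supplied by the sync job.

    -- Run on server B's connection:
    SELECT CURRENT_TIMESTAMP AS remote_now;

    -- Run on the local connection at (nearly) the same moment:
    SELECT CURRENT_TIMESTAMP AS local_now;

    -- clock_offset = remote_now - local_now, computed by the sync job;
    -- the cutoff passed to server B is then shifted by that offset:
    SELECT party_id, last_update_tx
      FROM party
     WHERE last_update_tx > :last_sync_time + :clock_offset
     ORDER BY date_created_tx;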
From: "Adrian Crum" <[hidden email]>
> Jacques Le Roux wrote:
>> Hi Deyan,
>>
>> Thanks for your clear explanation and suggestion. As I'm busy with other things, I have quickly put your comment as a quote in OFBIZ-3333.
>> There is however a major issue that CAN NOT be fixed in the current implementation: the list and sequence of entities to be synchronized is built from the entities' timestamps - date_created_tx and last_update_tx. It works as long as the clocks of all the syncing parties are in sync. You can easily achieve this by using NTP, for example - reliable enough. But if the clock of one of the parties gets un-synced for just a few minutes, and during those few minutes records get inserted or updated, then you are in trouble.
>>
>> So IMHO the best approach for synchronization is not the timestamp but the TRANSACTION LOG. This approach is used in all major databases - m$ $ql, Oracle.
>
> The problem with that approach is that it is database-specific. As an alternative, you could query the other servers for their current time and generate a set of offsets to adjust the timing.
>
> -Adrian

Why ask from one to the others? Using NTP on all of them should be enough, or do I miss something? (It looks like it's used in telecom, air traffic control, videoconferencing, etc.) Of course the timezone needs to be set identically on each. Googling for NTP-related stuff I found http://en.wikipedia.org/wiki/SyncML. Looks like there are a lot of solutions out there (Funambol being unfortunately AGPL licensed)?

Deyan has also suggested creating a new way with "entity triggers", based on what he did with Postgres. I'm not quite sure what "entity triggers" should be, I guess EECAs. But of course this means more work, and maybe NTP is enough...

PS: what does Re: JUNK-> mean?
Jacques Le Roux wrote:
> From: "Adrian Crum" <[hidden email]> >> Jacques Le Roux wrote: >>> Hi Deyan, >>> >>> Thanks for your clear explanation and suggestion. As I'm busy with >>> another stuff, I have quickly put your comment as a quote in >>> OFBIZ-3333 . >>> There is however a major issue that CAN NOT be fixed in the current >>> implementation: the list and sequence of entities to be >>> synchronized gets created by entities' timestamp - date_created_tx >>> and last_update_tx. It works as long as the clocks of all the >>> syncing parties are in sync. You can easily achieve this by using NTP >>> for example - reliable enough. But if the clock of one of >>> the parties gets un-synced for just few minutes, and during those few >>> minutes records get inserted or updated than you are in >>> trouble. Syncing the clock back won't help you because you won't be >>> able to sync the broken records due to foreign key constraint >>> issues. Examples I could give but I guess you could think of such by >>> yourselves :) >>> >>> So IMHO the best approach for synchronization is not the timestamp >>> but the TRANSACTION LOG. This approach is used in all major >>> databases - m$ $ql, oracle. >> >> The problem with that approach is that it is database-specific. As an >> alternative, you could query the other servers for their >> current time and generate a set of offsets to adjust the timing. > > Why asking from one to the others? Using NTP on all of them should be > enough, or do I miss something Yes, you missed the part where one of the servers loses its NTP connection and its time drifts. > PS: why mean Re: JUNK-> ? Sorry about that. Our security appliance prepends JUNK-> to the subject line of suspected spam emails. Most likely your mail server is on a blacklist somewhere. I try to remove it before replying, but sometimes I forget. -Adrian |
From: "Adrian Crum" <[hidden email]>
>> Why ask from one to the others? Using NTP on all of them should be enough, or do I miss something?
>
> Yes, you missed the part where one of the servers loses its NTP connection and its time drifts.

I see, so a check between them should also be done, right. One important point is also the time interval between 2 sync runs. By default OOTB it's 5 min for push and 1 hour for pull. This leaves a reasonable period of time before the servers' clocks have really drifted apart enough to be a problem, and it could even be increased (say 15 min for push) to avoid this issue as much as possible. Also, an alert on NTP connection loss could add some security without needing to implement a check between servers, etc.

Jacques

>> PS: what does Re: JUNK-> mean?
>
> Sorry about that. Our security appliance prepends JUNK-> to the subject line of suspected spam emails. Most likely your mail server is on a blacklist somewhere. I try to remove it before replying, but sometimes I forget.
>
> -Adrian
In reply to this post by Adrian Crum
Hi Adrian,
Yes, my current approach is database specific. It was implemented under pressure, when I found out that the OFBiz sync does not work, at least in my setup :)

So my suggestion and idea is to implement a database-independent solution; I would do it, but maybe in a few months. The approach suggests that we implement a triggering function in the entity engine which would allow us to do event-based handling of insert / update / delete events. So in general we say: onInsert for a specific entity, execute some Java method.

The entity engine would either do that event handling completely in its Java code, or would generate and create real database triggers if the DB engine supports it. I don't know if there is such a feature already - I did a very quick search in the docs and found nothing. Maybe Jacques could help here? :)

Jacques, off the topic - is there an option to call a native stored procedure in the database from the entity engine? It comes in very handy sometimes ;)

> As an alternative, you could query the other servers for their current time and generate a set of offsets to adjust the timing.

Deyan: Yes, but the clock of the main server could also run faster or slower, and in a few weeks or months would eventually get synced by NTP. That would also cause issues. Another problem is that you insert entities with a local timestamp and synchronize them using a remote one for comparison. So you need to calculate the time diff - too much logic, I think :)

For the transaction log method I couldn't think of cons. I'll send you some example stored procedures and a description, which I believe will best illustrate the approach I use :) You might find something - I'll be happy to discuss it :)

The goal here is to create a very reliable mechanism that runs fast, does not generate much traffic, does not require a lot of CPU and resources (like Java heap memory), recovers from interruptions reliably (at least logically reliably), configures easily, troubleshoots easily (!), implements easily (if possible :) and works on all databases, if possible :)

I'll send the stuff tomorrow so we know what we are discussing :)

Cheers,
Deyan
Deyan Tsvetanov wrote:
> Deyan: Yes, but the clock of the main server could also run faster or slower, and in a few weeks or months would eventually get synced by NTP. That would also cause issues. Another problem is that you insert entities with a local timestamp and synchronize them using a remote one for comparison. So you need to calculate the time diff - too much logic, I think :)

I might be wrong, but the only issue I see with server time drifting is when to trigger an update. I'm not an expert on entity sync, but from what I understand, timestamps are compared to see if server B's data changed since the last sync. So, you query server B for its current time, calculate an offset from your time, and adjust the query's timestamp.

-Adrian
Here is an example:
On server A (the one that pushes data) we insert the following records, in the following sequence and with the following timestamps:

PARTY : party_id: 1000, created_timestamp_tx: 6-APR-2010 20:00:00.010
PARTY_GROUP : party_group_id: 1000 (FK to PARTY.party_id), created_timestamp_tx: 6-APR-2010 20:00:00.013 - three milliseconds later.

That would be the database contents, without issues. Now imagine we do an NTP time synchronization between the two inserts. The NTP sync "corrects" the system clock 5 ms back. The database contents would now be:

PARTY : party_id: 1000, created_timestamp_tx: 6-APR-2010 20:00:00.010
PARTY_GROUP : party_group_id: 1000 (FK to PARTY.party_id), created_timestamp_tx: 6-APR-2010 20:00:00.008 - earlier than the PARTY record.

When the server queries the records to be sent for synchronization, the result will be:

PARTY_GROUP : party_group_id: 1000 (FK to PARTY.party_id), created_timestamp_tx: 6-APR-2010 20:00:00.008
PARTY : party_id: 1000, created_timestamp_tx: 6-APR-2010 20:00:00.010

The server that accepts the push sync will try to insert the PARTY_GROUP record first and will get an FK constraint error - the PARTY record is missing.

And that is just an easy example - with parties and party groups. The examples with invoices, orders, items and, last but not least, acctg_trans and acctg_trans_entry records are much more complex to recover from.

So that is my concern, actually. I experienced it, and it's also pretty easy to verify :)

-- deyan
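A small SQL sketch of the kind of ordering query behind Deyan's example above; the column names follow his example rather than the exact OFBiz schema, and :last_sync_time stands in for the cutoff of the previous sync run. After the 5 ms clock correction, the child row sorts before its parent, and replaying the rows in that order violates the foreign key.

    SELECT 'PARTY' AS entity, party_id AS id, created_timestamp_tx
      FROM party
     WHERE created_timestamp_tx > :last_sync_time
    UNION ALL
    SELECT 'PARTY_GROUP', party_group_id, created_timestamp_tx
      FROM party_group
     WHERE created_timestamp_tx > :last_sync_time
     ORDER BY created_timestamp_tx;

    -- Result after the clock correction:
    --   PARTY_GROUP  1000  2010-04-06 20:00:00.008   <-- child first
    --   PARTY        1000  2010-04-06 20:00:00.010   <-- parent second
    -- Inserting in this order on the receiving side fails with an FK constraint error.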
Deyan Tsvetanov wrote:
> The server that accepts the push sync will try to insert the PARTY_GROUP record first and will get an FK constraint error - the PARTY record is missing.

Okay, that's a different problem than the one I was imagining.
In reply to this post by Deyan Tsvetanov
From: "Deyan Tsvetanov" <[hidden email]>
> The entity engine would either do that event handling completely in its Java code, or would generate and create real database triggers if the DB engine supports it. I don't know if there is such a feature already - I did a very quick search in the docs and found nothing. Maybe Jacques could help here? :)

There are no such features in OFBiz. Note that you said that MySQL has no triggers, but since 5.0.2 there are triggers: http://dev.mysql.com/doc/refman/5.1/en/trigger-syntax.html
Actually, most modern DBMSs have triggers; the problem with them is syntax differences: http://en.wikipedia.org/wiki/Database_trigger. We could try to use a common denominator (if one exists, I did not check), but some people are still using older DBMSs without any such capabilities.

> Jacques, off the topic - is there an option to call a native stored procedure in the database from the entity engine? It comes in very handy sometimes ;)

No, this does not exist, for the same reason: http://en.wikipedia.org/wiki/Stored_procedure

Jacques

PS: I saw your answer to Adrian and your example. Yes, this seems to be a real problem, I can't find a good solution so far :/ If we could avoid having NTP sync the machine during transactions it should not be a problem. I found http://commons.apache.org/net/api-release/org/apache/commons/net/ntp/TimeInfo.html. But were there maybe other problems you ran into during your experiment?
-----Original Message-----
From: Jacques Le Roux <[hidden email]>
Reply-to: "Jacques Le Roux" <[hidden email]>
To: [hidden email]
Subject: Re: JUNK->Re: ofbiz entity sync.
Date: Tue, 6 Apr 2010 21:30:24 +0200

> From: "Deyan Tsvetanov" <[hidden email]>
>> The entity engine would either do that event handling completely in its Java code, or would generate and create real database triggers if the DB engine supports it. I don't know if there is such a feature already - I did a very quick search in the docs and found nothing. Maybe Jacques could help here? :)
>
> There are no such features in OFBiz. Note that you said that MySQL has no triggers, but since 5.0.2 there are triggers: http://dev.mysql.com/doc/refman/5.1/en/trigger-syntax.html
> Actually, most modern DBMSs have triggers; the problem with them is syntax differences: http://en.wikipedia.org/wiki/Database_trigger. We could try to use a common denominator (if one exists, I did not check), but some people are still using older DBMSs without any such capabilities.

[Deyan] Well yes, the triggers are database specific, but isn't that what the entity engine is supposed to provide - a unified database access layer? Triggers are also an almost universal database feature. They could be very useful, especially for extra security in OFBiz, for synchronization, and for integration with 3rd party systems or the home-grown software most companies have. Most databases have them; for MySQL we would have to raise the minimum required version to 5. It's not a pain.

At some point we might face the question: this database does not support things that all the others do - what do we do? Do we implement the missing features in the entity engine's Java code? How much effort will that take? Why do we have to implement stuff that is already provided by many other products, for free and in good quality? At the end of the day the goal is to provide a lot of good, enterprise-class ERP features at a low cost to users, customers and companies. Supporting both open source and commercial database engines is required and very good - it keeps TCO down. But if some database isn't improving as fast as needed, then it shouldn't be a problem to drop support for it. MySQL is targeted at different types of customers and users, who have different requirements than ours - web developers, PHP, JSP, Perl, whatever.

So I say for now we implement the entity events / triggers in Java; for the purpose of OFBiz synchronization that will be enough. We don't need native DB triggers yet.

>> Jacques, off the topic - is there an option to call a native stored procedure in the database from the entity engine? It comes in very handy sometimes ;)
>
> No, this does not exist, for the same reason: http://en.wikipedia.org/wiki/Stored_procedure

[Deyan] Why do we need native stored procedures? The OFBiz database is complex - more flexible and more complex than the database of a typical web site. Customers want reports. For reports we use tools like BIRT, Pentaho, etc. For reports, calculations and functions are needed that are often not supported by the entity engine; SUM and GROUP BY are sometimes not enough. The grouping and calculations for reports need to happen in the database engine as much as possible. If you let the BI tools iterate over result sets in a JVM, then you're in trouble with "memory issues" and limits :) SPs are also convenient when doing complex database queries from a complex search GUI screen.

So this is where being able to call native DB stored procedures comes in handy. The entity engine would have to generate DB-specific code, either "call sp_select_average_sales()" or "SELECT * FROM sp_select_average_sales()". It's not hard - it's the purpose of the entity engine, actually. And then code the stored procedures for all the engines you would like to support.

> Jacques
>
> PS: I saw your answer to Adrian and your example. Yes, this seems to be a real problem, I can't find a good solution so far :/ If we could avoid having NTP sync the machine during transactions it should not be a problem. I found http://commons.apache.org/net/api-release/org/apache/commons/net/ntp/TimeInfo.html. But were there maybe other problems you ran into during your experiment?

[Deyan] Dealing with timestamps is difficult. At the end of the day the question is: why do we need created_timestamp_tx and last_update_tx on all entities / DB tables? Answer: we need them in order to generate a list of entities (database records) to be pushed / pulled, and that list must be in the correct order so that we avoid FK or PK constraint issues.

Well, there is another way to generate that list, without dealing with timestamps and time offsets at all. We can generate it on the fly - in real time - using triggers which, upon execution, create an entry in a transaction log. This approach guarantees that if a PARTY is created before a PARTY_GROUP in real life, the same sequence of actions will be appended to the transaction log. I'll send you some examples a bit later today.

-- deyan
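To make the "sp_select_average_sales()" idea above concrete, here is a minimal PostgreSQL-flavoured sketch. The function name comes from Deyan's example; the table and column names inside it are assumptions for illustration, not the real OFBiz schema.

    -- Hypothetical reporting function: average order line amount per product.
    CREATE OR REPLACE FUNCTION sp_select_average_sales()
    RETURNS TABLE (product_id character varying, avg_amount numeric) AS $$
        SELECT oi.product_id, AVG(oi.unit_price * oi.quantity)
          FROM order_item oi              -- illustrative table / column names
         GROUP BY oi.product_id;
    $$ LANGUAGE sql;

    -- PostgreSQL-style invocation, as mentioned above:
    SELECT * FROM sp_select_average_sales();

    -- Other engines would need the other form the entity engine would have to generate:
    -- CALL sp_select_average_sales();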
In reply to this post by Jacques Le Roux
Hi guys,
so, about the transaction log:

I have a table called TRANSLOG with the following structure:

CREATE TABLE translog
(
  id bigserial NOT NULL,
  "table" character varying(30) NOT NULL,
  operation character varying(6) NOT NULL,
  "statement" character varying NOT NULL,
  created_on timestamp with time zone NOT NULL DEFAULT now(),
  CONSTRAINT id_fk PRIMARY KEY (id)
)

Records in the TRANSLOG table are inserted by the triggers of the tables we are "watching" for changes, on INSERT, UPDATE and DELETE. The STATEMENT column contains an executable SQL statement which is generated by the trigger function. The following screenshot shows some data in the TRANSLOG table:

So every time some database action is performed on the tables we are "watching", we get a TRANSLOG entry.

Each translog entry has a sequence ID. So when we want to PUSH that transaction log to the main server, we actually execute the statements ON the main server. We execute the statements wrapped in a transaction, which gets rolled back if an error has occurred.

The translog push gets invoked by a stored procedure which gets called by a scheduled recurring job in OFBiz. For each sync SP execution we store a record in a table called TRANSLOG_EXEC_LOG. There we keep the currently reached translog_id, so we know where to start from the next time the sync SP is invoked.

So in general that's the mechanism. Pretty simple to implement, pretty simple to troubleshoot and pretty error-resistant. Here's the screenshot of TRANSLOG_EXEC_LOG:

The result message could be either OK or the DB error message - either a constraint error or some other error. We always know which was the last successfully executed statement (last_command_id). If there is an error, then we know at which statement the error happened - the one after the last successful one.

That's all in general.

-- deyan
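A minimal sketch (PostgreSQL / PL/pgSQL, since that is what Deyan's implementation targets) of the kind of trigger function that could feed the TRANSLOG table above. It only handles INSERT and DELETE on a single illustrative table, and the generated statement is deliberately simplified; Deyan's actual trigger functions are not shown in the thread.

    CREATE OR REPLACE FUNCTION party_translog() RETURNS trigger AS $$
    BEGIN
        IF TG_OP = 'INSERT' THEN
            INSERT INTO translog ("table", operation, "statement")
            VALUES ('party', 'INSERT',
                    'INSERT INTO party (party_id, party_type_id) VALUES ('
                    || quote_literal(NEW.party_id) || ', '
                    || quote_literal(NEW.party_type_id) || ')');
            RETURN NEW;
        ELSIF TG_OP = 'DELETE' THEN
            INSERT INTO translog ("table", operation, "statement")
            VALUES ('party', 'DELETE',
                    'DELETE FROM party WHERE party_id = ' || quote_literal(OLD.party_id));
            RETURN OLD;
        END IF;
        RETURN NEW;   -- UPDATE would be handled in the same way
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER party_translog_trg
        AFTER INSERT OR DELETE ON party
        FOR EACH ROW EXECUTE PROCEDURE party_translog();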
In reply to this post by Jacques Le Roux
It needs to be non-database-specific to be used in the OFBiz repository, so using an entity and an EECA would work - take a look at the BI code.

=========================
BJ Freeman
http://bjfreeman.elance.com
Strategic Power Office with Supplier Automation
<http://www.businessesnetwork.com/automation/viewforum.php?f=93>
Specialtymarket.com <http://www.specialtymarket.com/>
Systems Integrator -- Glad to Assist
Chat Y! messenger: bjfr33man
<http://www.linkedin.com/profile?viewProfile=&key=1237480&locale=en_US&trk=tab_pro>

Deyan Tsvetanov sent the following on 4/7/2010 1:41 AM:
> Hi guys,
>
> so, about the transaction log:
>
> I have a table called TRANSLOG with the following structure:
>
> CREATE TABLE translog
> (
>   id bigserial NOT NULL,
>   "table" character varying(30) NOT NULL,
>   operation character varying(6) NOT NULL,
>   "statement" character varying NOT NULL,
>   created_on timestamp with time zone NOT NULL DEFAULT now(),
>   CONSTRAINT id_fk PRIMARY KEY (id)
> )
In reply to this post by Deyan Tsvetanov
On Apr 7, 2010, at 2:04 AM, Deyan Tsvetanov wrote:

> [Deyan] Dealing with timestamps is difficult. At the end of the day the question is: why do we need created_timestamp_tx and last_update_tx on all entities / DB tables?
> Answer: we need them in order to generate a list of entities (database records) to be pushed / pulled, and that list must be in the correct order so that we avoid FK or PK constraint issues.

Actually this is NOT the purpose of the timestamps. The EntitySync stuff does use these to recreate the order as much as possible, but there is no way to solve problems with FKs when you have circular references or when something is changed multiple times between syncs. If you're relying on this to solve FK problems you're going to have issues, guaranteed.

The dummy-fks feature is what is meant to solve the FK problem.

-David
In reply to this post by Deyan Tsvetanov
A transaction has a lot of overhead, but is a good way to go. For removes, the current EntitySync implementation does in fact use a sort of transaction log.

The nice thing about timestamps is that they tell you which records have been changed since a certain time, so you know which records have changes that need to be updated since the last update time. There are issues with multiple app servers if their times are out of sync by more than a few minutes. Most other issues seem to be related to people not understanding how this works and how to handle different problems. Of course, that alone can be a fatal flaw in a design, and probably is in this case.

To really do this you'd have to have a more structured record of the changes than just recording an SQL statement, if you want to support syncing from one type of database to another (which is currently supported).

-David
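A hedged sketch of what David's "more structured record of the changes" might look like, compared with the raw-SQL TRANSLOG: instead of a ready-to-run statement, the log stores the entity name, its primary key and the serialized field values, so the receiving side can replay the change through the entity engine against any supported database. All names below are assumptions for illustration, not an existing OFBiz table.

    CREATE TABLE entity_change_log
    (
      change_id    bigserial PRIMARY KEY,
      entity_name  character varying(100) NOT NULL,  -- OFBiz entity, e.g. 'Party'
      pk_text      character varying(250) NOT NULL,  -- serialized primary key
      operation    character varying(6)   NOT NULL,  -- CREATE / STORE / REMOVE
      field_values text,                             -- serialized non-PK fields (e.g. XML)
      created_on   timestamp with time zone NOT NULL DEFAULT now()
    );

The receiving side would rebuild a GenericValue from entity_name, pk_text and field_values and apply it through the entity engine, so the target database dialect would no longer matter.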
In reply to this post by Deyan Tsvetanov
Hi Deyan,
Just a logistic tip: the OFBiz MLs don't support attachments nor included images. When you need to put an image in a message please use another mean. Anyway I guess we all understood your point, thanks for the details Jacques ----- Original Message ----- From: Deyan Tsvetanov To: Jacques Le Roux Cc: [hidden email] ; Hans Bakker ; [hidden email] Sent: Wednesday, April 07, 2010 10:41 AM Subject: Re: ofbiz entity sync. Hi guys, so about the transaction log: I have a table called TRANSLOG with the following structure: CREATE TABLE translog ( id bigserial NOT NULL, "table" character varying(30) NOT NULL, operation character varying(6) NOT NULL, "statement" character varying NOT NULL, created_on timestamp with time zone NOT NULL DEFAULT now(), CONSTRAINT id_fk PRIMARY KEY (id) ) Records in the TRANSLOG table are being inserted by the triggers for the table we are "watching" for changes on INSERT, UPDATE and DELETE. The STATEMENT column contains executable SQL statement which is generated by the trigger function. The following screenshot shows some data in the TRANSLOG table: So every time some database action is performed on the tables we are "watching" we get a TRANSLOG entry. Each translog entry has a sequence ID. So when we want to PUSH that transaction log to the main server we actually execute the statements ON the main server. We execute the statements wrapped in a transaction which gets rolled back if an error has occured. The translog push gets invoked by a stored procedure which gets called by a scheduled reoccuring job in ofbiz. For each sync SP execution we store a record in a table called TRANSLOG_EXEC_LOG . There we keep the currently reached translog_id so we know from where to start the next time when the sync SP is invoked. So in general that's the mechanism. Pretty simple to implement, pretty simple to troubleshoot and pretty error resistant . Here's the screenshot of TRANSLOG_EXEC_LOG: The result message could be either OK or the DB error message - either a constraint error or some other error. We always know which was the last successfully executed statement ( last_command_id ). If there is an error then we know at which statement the error has happened - the next after the last successful one. That's all in general. -- deyan -----Original Message----- From: Jacques Le Roux <[hidden email]> Reply-to: "Jacques Le Roux" <[hidden email]> To: Deyan Tsvetanov <[hidden email]>, [hidden email] Cc: Hans Bakker <[hidden email]>, [hidden email] Subject: Re: ofbiz entity sync. Date: Tue, 6 Apr 2010 12:31:49 +0200 Hi Deyan, Thanks for your clear explanation and suggestion. As I'm busy with another stuff, I have quickly put your comment as a quote in OFBIZ-3333 . Jacques ----- Original Message ----- From: Deyan Tsvetanov To: Jacques Le Roux Cc: Hans Bakker ; [hidden email] Sent: Tuesday, April 06, 2010 9:42 AM Subject: Re: ofbiz entity sync. Hi Sirs, I'm sorry, i got lost for some time :) Things happen :) There was a lot of discussion on the topic, but the summary: there are some major issues that CAN be fixed in the current implementation: error recovery - due to interrupted connection between the two servers, sync speed - RMI can be removed , etc. There is however a major issue that CAN NOT be fixed in the current implementation: the list and sequence of entities to be synchronized gets created by entities' timestamp - date_created_tx and last_update_tx. It works as long as the clocks of all the syncing parties are in sync. 
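The TRANSLOG_EXEC_LOG table itself is only shown as a screenshot in the thread; only last_command_id and the result message ('OK' or the database error) are mentioned explicitly. A minimal sketch of what such a table and the "where do I resume" query could look like follows, with the remaining columns being assumptions rather than Deyan's actual schema:

CREATE TABLE translog_exec_log
(
  id bigserial NOT NULL,
  started_on timestamp with time zone NOT NULL DEFAULT now(),
  finished_on timestamp with time zone,
  last_command_id bigint,           -- id of the last successfully executed translog entry
  result_message character varying, -- 'OK' or the database error message
  CONSTRAINT translog_exec_log_pk PRIMARY KEY (id)
);

-- The next push run starts right after the last successfully executed statement:
SELECT t.id, t."statement"
FROM translog t
WHERE t.id > (SELECT coalesce(max(e.last_command_id), 0)
              FROM translog_exec_log e)
ORDER BY t.id;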
In reply to this post by David E. Jones-2
>A transaction has a lot of overhead, but is a good way to go.
[Deyan] We can delete the old transaction log after a successful sync has been done and one month has passed - or something like that. We don't need to keep the transaction log for eternity, that's out of the question :)

> To really do this you'd have to have a more structured record of the changes than just recording an SQL statement if you want to support syncing from one type of database to another
> (which is currently supported).

[Deyan] The SQL statements I am currently generating are ANSI SQL - they work on all databases. But yes, a major requirement is that syncing from one type of DB to another must be supported. That's for sure.

-- Deyan

-----Original Message-----
From: David E Jones <[hidden email]>
Reply-to: [hidden email]
To: [hidden email]
Cc: Jacques Le Roux <[hidden email]>, Hans Bakker <[hidden email]>, [hidden email]
Subject: Re: ofbiz entity sync.
Date: Wed, 07 Apr 2010 10:33:40 -0500

A transaction has a lot of overhead, but is a good way to go. For removes, the current EntitySync implementation does in fact use a sort of transaction log. The nice thing about timestamps is that they tell you which records have changed since a certain time, so you know which records need to be updated since the last update time. There are issues with multiple app servers if their times are out of sync by more than a few minutes. Most other issues seem to be related to people not understanding how this works and how to handle different problems. Of course, that alone can be a fatal flaw in a design, and probably is in this case.

To really do this you'd have to have a more structured record of the changes than just recording an SQL statement if you want to support syncing from one type of database to another (which is currently supported).

-David
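To illustrate David's last point - a "more structured record of the changes" that the receiving side could replay through the entity engine in its own SQL dialect, instead of executing a pre-built statement verbatim - one possible shape is a header/detail pair of tables like the sketch below. This is only an illustration of the idea, not an existing OFBiz design and not Deyan's implementation; the table and column names are made up.

CREATE TABLE entity_change_log
(
  change_id bigserial PRIMARY KEY,
  entity_name character varying(100) NOT NULL, -- OFBiz entity name, not the physical table
  operation character varying(6) NOT NULL,     -- CREATE / UPDATE / DELETE
  pk_value character varying NOT NULL,         -- serialized primary key of the changed record
  changed_on timestamp with time zone NOT NULL DEFAULT now()
);

CREATE TABLE entity_change_field
(
  change_id bigint NOT NULL REFERENCES entity_change_log (change_id),
  field_name character varying(100) NOT NULL,
  new_value character varying,                 -- serialized field value, NULL when cleared
  PRIMARY KEY (change_id, field_name)
);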
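Deyan's retention remark above - dropping translog entries once they have been synced and a month has passed - could be a scheduled statement along these lines. The one-month window and the reference to the TRANSLOG_EXEC_LOG sketch from earlier are assumptions, not part of the described implementation:

-- Sketch: delete translog entries that are older than one month AND have
-- already been pushed successfully (id at or below the highest recorded
-- last_command_id).
DELETE FROM translog
WHERE created_on < now() - interval '1 month'
  AND id <= (SELECT coalesce(max(last_command_id), 0)
             FROM translog_exec_log);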
In reply to this post by Jacques Le Roux
For those interested, I have put the images in the related Jira issue (OFBIZ-3333) and also a link from there to this thread.

Jacques

From: Deyan Tsvetanov
Right! ;)

So image no. 1: http://img179.yfrog.com/img179/8520/translogscreenshot.png
Image no. 2: http://img263.yfrog.com/img263/8302/translogexeclogscreensh.png

-- deyan
Hi Jacques, thanks, good action.
Regards,
Hans

On Wed, 2010-04-07 at 23:14 +0200, Jacques Le Roux wrote:
> For those interested, I have put the images in the related Jira issue (3333) and also a link from there to this thread

--
Ofbiz on twitter: http://twitter.com/apache_ofbiz
Myself on twitter: http://twitter.com/hansbak
Antwebsystems.com: Quality services for competitive rates.
In reply to this post by David E. Jones-2
Hi David,
Thanks for your feedback in your last 2 messages. From what I have understood so far, here is an as-simple-as-possible how-to I'd like to put in the wiki.

First, I will suggest following the "Sync Setup Notes and Example" page in the wiki: http://cwiki.apache.org/confluence/x/bgFk. I will try to clarify it though; notably I will adapt or move (or even remove) the "Fixing Timestamps issues" section I previously hastily added as a quick help. Then I will add the following tips to use in case of problems (chances are they will appear).

Syncing issues
If, for any reason, the syncing has failed and did not recover automatically, you will need to check the status and reset it. For that, use the Entity Sync Status feature in Webtools. If it's still in the "running" state, put it back to "not started" status so the job can pick things up again. Note that these jobs should run on the client(s) only, and you should not run multiple instances of them.

Timestamp issues
1) Fix the timezone on all machines to be the same (sorry if this is an issue, but I can't find another way).
2) Use an NTP server (preferably the same one) to adjust the time on each machine. Remember that OOTB the minimum period of time used is 5 minutes for push (1 hour for pull), so you should not get any issues as long as your machines have the "same time" within this window.
3) If you get an issue with foreign keys because the NTP server adjusted the time inside a transaction, use the Check/Update Database feature in Webtools.

I will also put a link to https://issues.apache.org/jira/browse/OFBIZ-3333, where there are suggestions for a new approach, and a link back to this thread.

All comments are welcome

Jacques
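As a small, hedged addition to the timestamp tips: if a machine's clock was running ahead at some point, rows written during that period carry transaction timestamps in the future relative to the corrected clock, and a quick check like the one below can spot them (only while real time has not yet caught up with the skew). The PARTY table and the *_TX_STAMP column names are examples; verify the actual names in your schema and OFBiz version.

-- Sketch: find rows stamped "in the future" by a clock that was running ahead.
SELECT party_id, created_tx_stamp, last_updated_tx_stamp
FROM party
WHERE created_tx_stamp > now()
   OR last_updated_tx_stamp > now();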