Re: ofbiz entity sync.


Jacques Le Roux
Done, though I have not yet refactored everything as I wanted.

Jacques

From: "Jacques Le Roux" <[hidden email]>

> Hi David,
>
> Thanks for your feedback in your last two messages. From what I have understood so far, here is an as-simple-as-possible how-to I'd
> like to put in the wiki.
>
> First, I will suggest following the Sync Setup Notes and Example page in the wiki: http://cwiki.apache.org/confluence/x/bgFk.
> I will try to clarify it, though. Notably, I will adapt or move (or even remove) the "Fixing Timestamps issues" section I previously
> hastily added as quick help.
>
> Then I will add the following tips to use in case of problems (chances are they will appear):
>
> Syncing issues
> If, for any reason, the sync has failed and did not recover automatically, you will need to check its status and reset it. For
> that, use the Entity Sync Status feature in Webtools. If the sync is still in the Running state, set it back to the Not Started status so
> the job can pick things up again. Note that these jobs should run on the client(s) only, and you should not run multiple instances of
> them.
>
> TimeStamp issues
> 1) Set the timezone on all machines to be the same (sorry if this is an issue, but I can't find another way)
> 2) Use an NTP server (preferably the same one) to adjust the time on each machine. Remember that OOTB the minimum period of time used
> is 5 mins for push (1 hour for pull). So you won't get any issues as long as your machines have the "same time" within this window.
> 3) If you get an issue with foreign keys because the NTP server adjusted the time inside a transaction, use the Check/Update Database
> feature in Webtools
>
> I will also put a link to https://issues.apache.org/jira/browse/OFBIZ-3333, where there are suggestions for a new approach, and a link
> back to this thread
>
> All comments are welcome
>
> Jacques
>
> From: "David E Jones" <[hidden email]>
>> A transaction has a lot of overhead, but is a good way to go. For removes the current EntitySync implementation does in fact use
>> a sort of transaction log. The nice thing about timestamps is they tell you what records have been changed since a certain time
>> so you know which records have changes that need to be updated since the last update time. There are issues with multiple app
>> servers if their times are out of sync by more than a few minutes. Most other issues seem to be related to people not
>> understanding how this works and how to handle different issues. Of course, that alone can be a fatal flaw in a design, and probably
>> is in this case.
>>
>> To really do this you'd have to have a more structured record of the changes than just recording an SQL statement if you want to
>> support syncing from one type of database to another (which is currently supported).
>>
>> -David
>>
>>
>> On Apr 7, 2010, at 3:41 AM, Deyan Tsvetanov wrote:
>>
>>> Hi guys,
>>>
>>> so about the transaction log:
>>>
>>> I have a table called TRANSLOG with the following structure:
>>>
>>>
>>> CREATE TABLE translog
>>> (
>>>   id bigserial NOT NULL,                                      -- monotonically increasing sequence ID
>>>   "table" character varying(30) NOT NULL,                     -- name of the watched table
>>>   operation character varying(6) NOT NULL,                    -- INSERT, UPDATE or DELETE
>>>   "statement" character varying NOT NULL,                     -- executable SQL generated by the trigger
>>>   created_on timestamp with time zone NOT NULL DEFAULT now(),
>>>   CONSTRAINT translog_pk PRIMARY KEY (id)
>>> );
>>>
>>> Records in the TRANSLOG table are being inserted by the triggers for the table we are "watching" for changes on INSERT, UPDATE
>>> and DELETE.
>>> The STATEMENT column contains an executable SQL statement which is generated by the trigger function. The following screenshot
>>> shows some data in the TRANSLOG table:
>>>
>>> [screenshot of TRANSLOG data not included]
>>>
>>> So every time some database action is performed on the tables we are "watching" we get a TRANSLOG entry.
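>>>
>>> As a sketch (the table, column and function names here are illustrative guesses, not the actual customer code), such a
>>> trigger function for a single watched table could look like this in PL/pgSQL:
>>>
>>> CREATE OR REPLACE FUNCTION log_party_changes() RETURNS trigger AS $$
>>> BEGIN
>>>   IF TG_OP = 'INSERT' THEN
>>>     -- record an executable INSERT statement built from the NEW row
>>>     INSERT INTO translog ("table", operation, "statement")
>>>     VALUES ('PARTY', 'INSERT',
>>>             'INSERT INTO party (party_id, party_type_id) VALUES ('
>>>             || quote_literal(NEW.party_id) || ', '
>>>             || quote_literal(NEW.party_type_id) || ')');
>>>     RETURN NEW;
>>>   ELSIF TG_OP = 'UPDATE' THEN
>>>     INSERT INTO translog ("table", operation, "statement")
>>>     VALUES ('PARTY', 'UPDATE',
>>>             'UPDATE party SET party_type_id = ' || quote_literal(NEW.party_type_id)
>>>             || ' WHERE party_id = ' || quote_literal(NEW.party_id));
>>>     RETURN NEW;
>>>   ELSE  -- DELETE
>>>     INSERT INTO translog ("table", operation, "statement")
>>>     VALUES ('PARTY', 'DELETE',
>>>             'DELETE FROM party WHERE party_id = ' || quote_literal(OLD.party_id));
>>>     RETURN OLD;
>>>   END IF;
>>> END;
>>> $$ LANGUAGE plpgsql;
>>>
>>> CREATE TRIGGER party_translog
>>>   AFTER INSERT OR UPDATE OR DELETE ON party
>>>   FOR EACH ROW EXECUTE PROCEDURE log_party_changes();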
>>>
>>> Each translog entry has a sequence ID. So when we want to PUSH the transaction log to the main server, we actually execute the
>>> statements ON the main server. We execute the statements wrapped in a transaction, which gets rolled back if an error has
>>> occurred.
>>>
>>> The translog push gets invoked by a stored procedure, which gets called by a scheduled recurring job in OFBiz.
>>> For each sync SP execution we store a record in a table called TRANSLOG_EXEC_LOG. There we keep the last reached
>>> translog_id, so we know where to start from the next time the sync SP is invoked.
>>>
>>> So in general, that's the mechanism. Pretty simple to implement, pretty simple to troubleshoot, and pretty error-resistant.
>>> Here's the screenshot of TRANSLOG_EXEC_LOG:
>>>
>>> [screenshot of TRANSLOG_EXEC_LOG not included]
>>>
>>> The result message can be either OK or the DB error message (a constraint error or some other error). We always know
>>> which was the last successfully executed statement (last_command_id). If there is an error, then we know at which statement
>>> the error happened: the one right after the last successful one.
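>>>
>>> For reference, the bookkeeping table and the replay step could be sketched as follows (column and function names are my
>>> guesses for illustration, not the actual code). In PL/pgSQL a function runs inside the caller's transaction, so an error
>>> in EXECUTE rolls the whole batch back:
>>>
>>> CREATE TABLE translog_exec_log
>>> (
>>>   id bigserial NOT NULL,
>>>   last_command_id bigint NOT NULL,              -- last translog id successfully replayed
>>>   result character varying NOT NULL,            -- 'OK' or the DB error message
>>>   executed_on timestamp with time zone NOT NULL DEFAULT now(),
>>>   CONSTRAINT translog_exec_log_pk PRIMARY KEY (id)
>>> );
>>>
>>> -- Replay every statement newer than the last synced id; returns the last id applied.
>>> CREATE OR REPLACE FUNCTION replay_translog(from_id bigint) RETURNS bigint AS $$
>>> DECLARE
>>>   entry RECORD;
>>>   last_ok bigint := from_id;
>>> BEGIN
>>>   FOR entry IN
>>>     SELECT id, "statement" FROM translog WHERE id > from_id ORDER BY id
>>>   LOOP
>>>     EXECUTE entry."statement";
>>>     last_ok := entry.id;
>>>   END LOOP;
>>>   RETURN last_ok;  -- the caller records this, plus 'OK' or the error, in translog_exec_log
>>> END;
>>> $$ LANGUAGE plpgsql;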
>>>
>>> That's all in general.
>>>
>>> -- deyan
>>>
>>> -----Original Message-----
>>> From: Jacques Le Roux <[hidden email]>
>>> Reply-to: "Jacques Le Roux" <[hidden email]>
>>> To: Deyan Tsvetanov <[hidden email]>, [hidden email]
>>> Cc: Hans Bakker <[hidden email]>, [hidden email]
>>> Subject: Re: ofbiz entity sync.
>>> Date: Tue, 6 Apr 2010 12:31:49 +0200
>>>
>>>  Hi Deyan,
>>>
>>> Thanks for your clear explanation and suggestion. As I'm busy with other things, I have quickly added your comment as a quote in
>>> OFBIZ-3333.
>>>
>>> Jacques
>>>
>>> ----- Original Message -----
>>> From: Deyan Tsvetanov
>>> To: Jacques Le Roux
>>> Cc: Hans Bakker ; [hidden email]
>>> Sent: Tuesday, April 06, 2010 9:42 AM
>>> Subject: Re: ofbiz entity sync.
>>>
>>>
>>> Hi Sirs,
>>>
>>> I'm sorry, I got lost for some time :)
>>> Things happen :)
>>>
>>> There was a lot of discussion on the topic, but here is the summary:
>>>
>>> There are some major issues that CAN be fixed in the current implementation: error recovery (needed when the connection
>>> between the two servers is interrupted), sync speed (RMI can be removed), etc.
>>>
>>> There is however a major issue that CAN NOT be fixed in the current implementation: the list and sequence of entities to be
>>> synchronized is built from the entities' timestamps, date_created_tx and last_update_tx. It works as long as the clocks of all the
>>> syncing parties are in sync. You can easily achieve this by using NTP, for example; it is reliable enough. But if the clock of one of
>>> the parties gets un-synced for just a few minutes, and during those few minutes records get inserted or updated, then you are in
>>> trouble. Syncing the clock back won't help you, because you won't be able to sync the broken records due to foreign key
>>> constraint issues. I could give examples, but I guess you can think of some yourselves :)
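>>>
>>> To make the failure concrete: the engine essentially selects the rows to push with a timestamp-window query like the
>>> sketch below (entity and field names simplified for illustration). A row stamped while a machine's clock was behind
>>> can fall before last_sync_time and is silently skipped, while later rows that reference it are pushed and then hit
>>> FK errors:
>>>
>>> SELECT *
>>>   FROM some_entity
>>>  WHERE last_update_tx > :last_sync_time  -- a row stamped with an "earlier" wrong time never matches again
>>>  ORDER BY last_update_tx;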
>>>
>>> So IMHO the best approach for synchronization is not the timestamp but the TRANSACTION LOG. This approach is used in all major
>>> databases: MS SQL, Oracle.
>>>
>>> For a customer I've implemented a transaction log based on triggers and stored procedures. However, I implemented the transaction
>>> log, triggers and stored procedures only for PostgreSQL, as that was the customer's database. It would be easy to implement an MS SQL
>>> or Oracle version though. It works perfectly, much much faster than RMI, recovers if the sync process is interrupted, etc. My
>>> goal was to implement this mechanism using entity-engine-level triggers and eventually commit it, but unfortunately I'm still pretty
>>> busy with other things, so we don't have resources that can be dedicated to that task at the moment; we're trying to work
>>> out the world financial crisis :)
>>>
>>> So if you find what I say reasonable, you could go ahead with the triggers and SPs. For that you need a database that supports
>>> triggers, so MySQL won't work :)
>>>
>>> That was just the first part.
>>>
>>> The second part is to identify all the tables that you need to synchronize. Some of them will only be pulled, some only
>>> pushed, and some synced in both directions. Next you need to test, reset the database, and test again and again
>>> until you identify the correct list of tables so your sync process doesn't end up with FK insert/update errors. That is a
>>> pretty easy but time-consuming task; it takes a few days to complete :)
>>>
>>> So that's all I can say for now, without boring you with details :)
>>> If you have other questions - go ahead :)
>>>
>>> Cheers,
>>> Deyan
>>>
>>> -----Original Message-----
>>> From: Jacques Le Roux <[hidden email]>
>>> Reply-to: "Jacques Le Roux" <[hidden email]>
>>> To: Hans Bakker <[hidden email]>, [hidden email], [hidden email]
>>> Subject: Re: ofbiz entity sync.
>>> Date: Sat, 3 Apr 2010 10:04:29 +0200
>>>
>>> Hi Hans,
>>>
>>> I put a comment with a summary of the situation. Please let me know what you think, to get the ball rolling.
>>>
>>> Thanks
>>>
>>> Jacques
>>>
>>> From: "Hans Bakker" <[hidden email]>
>>> > Hi Gentleman,
>>> >
>>> > may i ask if there was any progress on this issue?
>>> >
>>> >
>>> > https://issues.apache.org/jira/browse/OFBIZ-3333
>>> >
>>> > I added the following comment:
>>> >
>>> > We have a customer using entity sync and experiencing the problems in
>>> > this issue.
>>> >
>>> > We are interested in working with other parties on this subject. If there
>>> > is already substantial work done, we are willing to compensate
>>> > financially for it.
>>> >
>>> > Please let me know if there is any progress here.
>>> >
>>> > regards,
>>> > Hans
>>> >
>>> > --
>>> >
>>> > http://www.antwebsystems.com :
>>> > Quality OFBiz support for competitive rates....
>>> >
>>> >
>>>
>>>
>>>
>>>
>>
>>
>
>

