Chris, David,
Wow, this is getting complicated. 2 questions. This data import tool, can it do a data dump with foreign_key_checks turned off? I'd like to do a data import from XML, but not have to do the graph walk. Anybody knows of any tools to dump MySQL data (verbatim, with only platform-specific translations) into a PostgreSQL? I've looked into such "walk the graph/hierarchy" thing when I did my first data migration (not even data mapping, just dumping same structure to same structure). After spending a week to finally "get it!", I was slapped for wasting my time. Somebody did a "foreign_key_checks=0", dumped the data into database, and set the flag back to "on". I'd agree this is a tough nut to crack; haven't found anyone in the IT industry (other than PhDs doing database design theses?) who'd want to bother cracking this. About highly normalized data. Yes, I agree there are loops. In the real world, a Vendor can often be a Customer of another Vendor, who is a Customer of the first Vendor. Ah, you get the picture (I'm getting a headache). Jonathon Chris Howe wrote: > First, thank you very much for the feedback! Your comments have been > extremely helpful in me wrapping my brain around this nut. Comments > inline. > > --- "David E. Jones" <[hidden email]> wrote: > >> I'm not sure what you used to find loops in the graph but I know just >> off the top of my head that there are others, and I'm guessing quite >> a few others. Actually a highly normalized data model that fulfills a >> large number of different requirements probably tends to cause more >> loops. >> > > I put the logic at the end for critique > >> Some off the of my head real quick: >> >> - ProductCategory points to self > > All of the self referential issues have been accounted for. These are > found with a relationship of entity=relEntity and therefore records > must be ordered for the pass to be successful. > >> - ProductStore points to Facility, Facility points to ProductStore > > This is a one-nofk relationship. I assumed that this type of > relationship doesn't require maintaining referential integrity. Was > this assumption correct? If so, would this be a better relationship > type for UserLogin->Party? > >> Certain other ones are loops but are actually workable, like >> ProductCategory(1) -> ProductCategoryRollup -> ProductCategory(2) -> >> ProductCategoryRollup -> ProductCategory(1) which is okay because you >> can do the ProductCategory records first and then do the rollups. > > This is another one-nofk > >> One thing to consider is that there are certain loops that can >> possibly be resolved by sorting records instead of entities. That >> makes things significantly more complicated, but for certain >> scenarios it's the only solution. > > This is how the UserLogin -> Party would be handled assuming we don't > change the relationship to one-nofk :-) > >> If you really want to go for this and try to code something that will >> sort both entities and records within entities to get something that >> will handle most but not all cases, go for it, I won't stop you. >> >> I'm just warning you that I looked into this once and did the >> background work and found it to be a tough nut to crack, probably not >> worth the effort as other solutions are easier. Also, this has been >> tried before by other people who made some progress when they got >> into testing with real data and such ran into problem after problem >> and never found a workable solution. >> >> Maybe the best thing for me to do is shut up and let you give it a >> try. Let me know how it goes. > > I wish you wouldn't. The gotcha scenarios are very helpful in the > cracking process. If the gotchas can't ungotched...then warnings and > caveats can be put into the most workable solution so that it can be > left for someone else to be able to pick it up and only work on the > uncracked portions. > >> -David >> > > Two Entities: > 1) EntityMap [entity, insertionPass, insertionScenario] > 2) EntityRelationshipMap [primaryEntity, relatedEntity, > ignoreRelationshipFlag] > > Logic process: > Setup- > 1) Load all entities into EntityMaps; > 2) Load all relationships of type one into EntityRelationshipMaps; set > all ignoreRelationshipFlags to "N"; > > Locate relationship scenario A->B->A > 3) Iterate list looking for instances of primaryEntity_relatedEntity = > relatedEntity_primaryEntity. When found set ignoreRelationshipFlag to > "Y" and set EntityMap.insertionScenario for the primaryEntity and > relatedEntity to "ABA" > > Locate relationship scenario A->A > 4) Iterate list looking for instances where primaryEntity = > relatedEntity; set ignoreRelationshipFlag to "Y" and set > EntityMap.insertionScarion to "AA" > > Determine order > 5) Find all EntityMaps where insertionPass is null > 6) Iterate list; find the number of times it exists in the > EntityRelationshipMap where ignoreRelationshipFlag = "N" > 7) For all that return 0, set EntityMaps.insertionPass to current pass > number; find all records where the current entity exists in > EntityRelationshipMap.relatedEntity and set > EntityRelationshipMap.ignoreRelationshipFlag to "Y" > 8) increment the current pass; go to 5 until no more exist. > > |
Comments inline...
--- Jonathon -- Improov <[hidden email]> wrote: > Chris, David, > > Wow, this is getting complicated. > > 2 questions. > > This data import tool, can it do a data dump with foreign_key_checks > turned off? I'd like to do a > data import from XML, but not have to do the graph walk. > If by this tool, you mean the one currently in OFBiz, it does it by using a "insert dummy key" indicator. It's only recommended if you know your data is clean. As I've mentioned previously, I tried using this to go from MySQL to Postgres and it did not work. I did not look into why. There are a multitude of reasons why this may not have worked. 95% of those reasons are on this side of the keyboard as I've touched the data myself, so it likely isn't clean. > Anybody knows of any tools to dump MySQL data (verbatim, with only > platform-specific translations) > into a PostgreSQL? > There are quite a number of solutions. A google search should find them. I didn't see any free choices when I looked. > I've looked into such "walk the graph/hierarchy" thing when I did my > first data migration (not > even data mapping, just dumping same structure to same structure). > After spending a week to > finally "get it!", I was slapped for wasting my time. Somebody did a > "foreign_key_checks=0", > dumped the data into database, and set the flag back to "on". I'd > agree this is a tough nut to > crack; haven't found anyone in the IT industry (other than PhDs doing > database design theses?) > who'd want to bother cracking this. > approach, you have to be certain that you data is clean. Depending on how old your data is in OFBiz, this may be unlikely. For instance, before case sensitive logins were implemented as the default the case in which the login was typed is the case that was stored in the various entities that have a relationship to UserLogin. I believe, MySQL in the default installation mode treats "ofbiz" and "OFBIZ" the same and so found related records as expected. Postgres in the default installation mode does not treat the two as the same and your application will not work as expected (I'm dealing with the fallout of this at the moment). > About highly normalized data. Yes, I agree there are loops. In the > real world, a Vendor can often > be a Customer of another Vendor, who is a Customer of the first > Vendor. Ah, you get the picture > (I'm getting a headache). > Because OFBiz is designed for the most part to be a generic data model, this case scenario doesn't create a loop for referential integrity. In this case, there are four entities of importance; RoleType, Party, PartyRole and PartyRelationship. RoleType and Party do not rely on the later two, so they get entered cleanly. PartyRole only depends on RoleType and Party, and since all of those records are already in, it gets entered cleanly. PartyRelationship relies on PartyRole and Party and since all of those records are already in, they get entered cleanly. > Jonathon > |
There is also a tool as I mentioned before to remove and re-add all foreign keys. It is on the Check/Update page in WebTools. -David On Apr 15, 2007, at 9:40 PM, Chris Howe wrote: > Comments inline... > --- Jonathon -- Improov <[hidden email]> wrote: > >> Chris, David, >> >> Wow, this is getting complicated. >> >> 2 questions. >> >> This data import tool, can it do a data dump with foreign_key_checks >> turned off? I'd like to do a >> data import from XML, but not have to do the graph walk. >> > > If by this tool, you mean the one currently in OFBiz, it does it by > using a "insert dummy key" indicator. It's only recommended if you > know your data is clean. As I've mentioned previously, I tried using > this to go from MySQL to Postgres and it did not work. I did not look > into why. There are a multitude of reasons why this may not have > worked. 95% of those reasons are on this side of the keyboard as I've > touched the data myself, so it likely isn't clean. > >> Anybody knows of any tools to dump MySQL data (verbatim, with only >> platform-specific translations) >> into a PostgreSQL? >> > > There are quite a number of solutions. A google search should find > them. I didn't see any free choices when I looked. > >> I've looked into such "walk the graph/hierarchy" thing when I did my >> first data migration (not >> even data mapping, just dumping same structure to same structure). >> After spending a week to >> finally "get it!", I was slapped for wasting my time. Somebody did a >> "foreign_key_checks=0", >> dumped the data into database, and set the flag back to "on". I'd >> agree this is a tough nut to >> crack; haven't found anyone in the IT industry (other than PhDs doing >> database design theses?) >> who'd want to bother cracking this. >> > That would have the same use scenario with the insert dummy key > approach, you have to be certain that you data is clean. Depending on > how old your data is in OFBiz, this may be unlikely. For instance, > before case sensitive logins were implemented as the default the case > in which the login was typed is the case that was stored in the > various > entities that have a relationship to UserLogin. I believe, MySQL in > the default installation mode treats "ofbiz" and "OFBIZ" the same and > so found related records as expected. Postgres in the default > installation mode does not treat the two as the same and your > application will not work as expected (I'm dealing with the fallout of > this at the moment). > > >> About highly normalized data. Yes, I agree there are loops. In the >> real world, a Vendor can often >> be a Customer of another Vendor, who is a Customer of the first >> Vendor. Ah, you get the picture >> (I'm getting a headache). >> > > Because OFBiz is designed for the most part to be a generic data > model, > this case scenario doesn't create a loop for referential > integrity. In > this case, there are four entities of importance; RoleType, Party, > PartyRole and PartyRelationship. RoleType and Party do not rely on > the > later two, so they get entered cleanly. PartyRole only depends on > RoleType and Party, and since all of those records are already in, it > gets entered cleanly. PartyRelationship relies on PartyRole and Party > and since all of those records are already in, they get entered > cleanly. > >> Jonathon >> > smime.p7s (3K) Download Attachment |
In reply to this post by cjhowe
Chris,
> you have to be certain that you data is clean. Depending on how old your > data is in OFBiz, this may be unlikely. You just reminded me that my existing data can still get "unclean" simply because new OFBiz updates changes the data structure. Adding another item to my SOP for "Importing OFBiz updates": Check changes to entity models, and perform any necessary data-mapping and data-migration. > I believe, MySQL in the default installation mode treats "ofbiz" and "OFBIZ" > the same and so found related records as expected. If you're talking about table names, I think the default installation on Windows treats user_login and USER_LOGIN the same. But the default on Linux is totally case-sensitive. If you're talking about database collation, there really isn't much in the way of "default". We set our own. In general, I always opt for _cs (case sensitive), never for _ci. The default in OFBiz's entityengine.xml is latin1_general_cs; I've been using that for all OFBiz installations. > Postgres in the default installation mode does not treat the two as the same > and your application will not work as expected (I'm dealing with the fallout > of this at the moment). Convert all affected keys to lower case, and tell all users about the change? Jonathon Chris Howe wrote: > Comments inline... > --- Jonathon -- Improov <[hidden email]> wrote: > >> Chris, David, >> >> Wow, this is getting complicated. >> >> 2 questions. >> >> This data import tool, can it do a data dump with foreign_key_checks >> turned off? I'd like to do a >> data import from XML, but not have to do the graph walk. >> > > If by this tool, you mean the one currently in OFBiz, it does it by > using a "insert dummy key" indicator. It's only recommended if you > know your data is clean. As I've mentioned previously, I tried using > this to go from MySQL to Postgres and it did not work. I did not look > into why. There are a multitude of reasons why this may not have > worked. 95% of those reasons are on this side of the keyboard as I've > touched the data myself, so it likely isn't clean. > >> Anybody knows of any tools to dump MySQL data (verbatim, with only >> platform-specific translations) >> into a PostgreSQL? >> > > There are quite a number of solutions. A google search should find > them. I didn't see any free choices when I looked. > >> I've looked into such "walk the graph/hierarchy" thing when I did my >> first data migration (not >> even data mapping, just dumping same structure to same structure). >> After spending a week to >> finally "get it!", I was slapped for wasting my time. Somebody did a >> "foreign_key_checks=0", >> dumped the data into database, and set the flag back to "on". I'd >> agree this is a tough nut to >> crack; haven't found anyone in the IT industry (other than PhDs doing >> database design theses?) >> who'd want to bother cracking this. >> > That would have the same use scenario with the insert dummy key > approach, you have to be certain that you data is clean. Depending on > how old your data is in OFBiz, this may be unlikely. For instance, > before case sensitive logins were implemented as the default the case > in which the login was typed is the case that was stored in the various > entities that have a relationship to UserLogin. I believe, MySQL in > the default installation mode treats "ofbiz" and "OFBIZ" the same and > so found related records as expected. Postgres in the default > installation mode does not treat the two as the same and your > application will not work as expected (I'm dealing with the fallout of > this at the moment). > > >> About highly normalized data. Yes, I agree there are loops. In the >> real world, a Vendor can often >> be a Customer of another Vendor, who is a Customer of the first >> Vendor. Ah, you get the picture >> (I'm getting a headache). >> > > Because OFBiz is designed for the most part to be a generic data model, > this case scenario doesn't create a loop for referential integrity. In > this case, there are four entities of importance; RoleType, Party, > PartyRole and PartyRelationship. RoleType and Party do not rely on the > later two, so they get entered cleanly. PartyRole only depends on > RoleType and Party, and since all of those records are already in, it > gets entered cleanly. PartyRelationship relies on PartyRole and Party > and since all of those records are already in, they get entered > cleanly. > >> Jonathon >> > > |
Administrator
|
In reply to this post by David E Jones
And it worked well for me, in such case you really love it :o)
Jacques > > There is also a tool as I mentioned before to remove and re-add all > foreign keys. It is on the Check/Update page in WebTools. > > -David > > > On Apr 15, 2007, at 9:40 PM, Chris Howe wrote: > > > Comments inline... > > --- Jonathon -- Improov <[hidden email]> wrote: > > > >> Chris, David, > >> > >> Wow, this is getting complicated. > >> > >> 2 questions. > >> > >> This data import tool, can it do a data dump with > >> turned off? I'd like to do a > >> data import from XML, but not have to do the graph walk. > >> > > > > If by this tool, you mean the one currently in OFBiz, it does it by > > using a "insert dummy key" indicator. It's only recommended if you > > know your data is clean. As I've mentioned previously, I tried using > > this to go from MySQL to Postgres and it did not work. I did not look > > into why. There are a multitude of reasons why this may not have > > worked. 95% of those reasons are on this side of the keyboard as I've > > touched the data myself, so it likely isn't clean. > > > >> Anybody knows of any tools to dump MySQL data (verbatim, with only > >> platform-specific translations) > >> into a PostgreSQL? > >> > > > > There are quite a number of solutions. A google search should find > > them. I didn't see any free choices when I looked. > > > >> I've looked into such "walk the graph/hierarchy" thing when I did > >> first data migration (not > >> even data mapping, just dumping same structure to same structure). > >> After spending a week to > >> finally "get it!", I was slapped for wasting my time. Somebody did a > >> "foreign_key_checks=0", > >> dumped the data into database, and set the flag back to "on". I'd > >> agree this is a tough nut to > >> crack; haven't found anyone in the IT industry (other than PhDs doing > >> database design theses?) > >> who'd want to bother cracking this. > >> > > That would have the same use scenario with the insert dummy key > > approach, you have to be certain that you data is clean. Depending on > > how old your data is in OFBiz, this may be unlikely. For instance, > > before case sensitive logins were implemented as the default the case > > in which the login was typed is the case that was stored in the > > various > > entities that have a relationship to UserLogin. I believe, MySQL in > > the default installation mode treats "ofbiz" and "OFBIZ" the same and > > so found related records as expected. Postgres in the default > > installation mode does not treat the two as the same and your > > application will not work as expected (I'm dealing with the fallout of > > this at the moment). > > > > > >> About highly normalized data. Yes, I agree there are loops. In the > >> real world, a Vendor can often > >> be a Customer of another Vendor, who is a Customer of the first > >> Vendor. Ah, you get the picture > >> (I'm getting a headache). > >> > > > > Because OFBiz is designed for the most part to be a generic data > > model, > > this case scenario doesn't create a loop for referential > > integrity. In > > this case, there are four entities of importance; RoleType, Party, > > PartyRole and PartyRelationship. RoleType and Party do not rely on > > the > > later two, so they get entered cleanly. PartyRole only depends on > > RoleType and Party, and since all of those records are already in, > > gets entered cleanly. PartyRelationship relies on PartyRole and Party > > and since all of those records are already in, they get entered > > cleanly. > > > >> Jonathon > >> > > > > |
Free forum by Nabble | Edit this page |