Background:
There are a number of data loaders in Ofbiz, including "seed" and "seed-initial". We use "seed-initial" as part of database creation, so it is applied to the data one time only. We use "seed" to ensure that standard entities are inserted/updated each time Ofbiz starts (or, for us, as part of each customer migration).

The implication of this is that "seed-initial" holds standard entities that are loaded once, but that we do not want to seed each time (often because the user in Ofbiz may modify these entities).

Requirement:
We have situations where we have entities that need to be seeded as "seed-initial", but they are added after the fact and need to be applied to existing customer deployments. Here is a theoretical example: I put a ProductStore in my "seed-initial", and a customer provisions a deployment and gets seeded with that store. Later on we decide that we should seed two stores for all our customers, so we create a second store in "seed-initial". The trouble is that we cannot execute seed-initial again, and it does not make sense to put this second store in "seed" (because we do not want it to be updated on a future seed).

Potential Solution:
We considered the idea of making "seed-initial" strictly INSERT (not UPSERT as it is now), but we are leaning towards adding a "keyword" to the seed entities that marks that entity as "insert-only". The benefit of the second approach is that we can keep things like the product store "contextually close" to other seeded data, preventing the split between "seed" and "seed-initial".

Looking for some thoughts on this. Perhaps there is another way to get "insert-only" type entities seeded that I have not found, or perhaps there exists some configuration for each data loader.
|
Why not create an additional custom reader (or readers)? See the "ext" reader for an example. Also, the ext reader is where I put all of my customer-specific data; I don't touch seed-initial or seed.

There's nothing special about seed-initial or seed except that they are explicitly referenced in the stock ant targets.

-Joe

(In reply to Bob Morley's message of Aug 25, 2009, 11:13 AM.)
|
Hey Joe,

Right, we do a similar thing (the actual data is in separate files), but it is the loader configuration from the ant target that is the issue here.

It is really the insert/update behavior of all of the readers. If I have data that I want to be seeded only initially (based on that ant target), but I have added to it and require the additions to be applied to current installations, I am stuck. If I put it in my "seed-initial" file then it will not be executed against an existing installation, and if I put it in my "seed" file then it will be executed every time.

Does this make sense, or perhaps I am not explaining the issue very well?
|
Let me back up a bit - ignore the custom reader(s) comment for now.

It seems like there are several low-tech ways around the problem... why not update your "seed-initial" file with the new data for future installs, and for any existing installations, manually import just that new data using the webtools entity import screen (simple copy / paste, or load from a file you push out to them)?

-Joe

(In reply to Bob Morley's message of Aug 25, 2009, 11:45 AM.)
|
Yep, there are some solutions like that; however, envision 1000 customers. We are building out a SaaS-based solution built on Ofbiz, and have the infrastructure in place to handle re-provisioning the latest software and upgrading the multitude of databases (1 customer per database).

I believe what we do right now is schedule jobs via the Ofbiz job scheduler to perform the Ofbiz "seed" (one job per customer). Once this job is complete, it triggers re-provisioning of the customer to one of the virtual servers running the latest build. So the customers naturally migrate from their rev 1 server to a new rev 2 server along with their automated database upgrade. Eventually, all customers have moved off of rev 1 and that virtual server is decommissioned.

Things work very well with the Ofbiz loaders; we just have this one little gap. :) If we feel this is something that should just be custom to us, that is fine; but we felt that there may be value in an enhancement for the community in general.
|
There is a manual way to do this, i.e. make sure the data in the database and the file are how you want them to be.

On the entity XML import page (https://demo.ofbiz.org/webtools/control/EntityImportReaders) there is a checkbox called "Check Data Only (nothing changed in database)". If that is checked you'll see the differences between the data file and the database, and you can resolve them manually, or not, as desired.

This may or may not be helpful. It could be expanded into a sort of merge tool, but it certainly isn't that right now.

The alternative you mentioned is interesting and might work out fine, i.e. have an option to only insert and not change existing records.

-David

(In reply to Bob Morley's message of Aug 25, 2009, 10:22 AM.)
|
Would it be able to detect auto-sequenced entities? Just thinking out loud...

(In reply to David E Jones's message of Aug 25, 2009, 3:34 PM.)
|
Hey Joe -- are you referring to entities that get a sequence id from the SequenceValueItem table? While not being an expert with the data loaders, I would have thought that sequences do not play a role in the data loading. I would have thought that we load the entity from the xml definition, get the primary keys from the entity definition, then do a look-up to see if the entity exists. If it does not -> INSERT, and if it does -> UPDATE. (This is just a guess without digging into the code yet.)

What I am proposing is either an attribute for the loader that indicates we only want to insert entities for that loader, or a "keyword" attribute on the entity that indicates we only want to insert that entity.

Am I wrong in my thinking on the auto-sequence entities?
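
The look-up-then-INSERT-or-UPDATE guess above, together with the proposed insert-only switch, can be sketched roughly as follows. This is only an illustration under stated assumptions: `SeedRecord`, `SeedLoaderSketch`, and the `insertOnly` flag are hypothetical names, an in-memory map stands in for the database, and none of this is the actual Ofbiz loader code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one seed record: a primary key plus one field.
final class SeedRecord {
    final String pk;
    final String name;

    SeedRecord(String pk, String name) {
        this.pk = pk;
        this.name = name;
    }
}

// Hypothetical loader sketch; not the Ofbiz data loader API.
class SeedLoaderSketch {
    // In-memory stand-in for one customer database table, keyed by primary key.
    final Map<String, SeedRecord> table = new LinkedHashMap<>();

    // Loads the records and returns how many rows were actually written.
    int load(List<SeedRecord> records, boolean insertOnly) {
        int written = 0;
        for (SeedRecord incoming : records) {
            boolean exists = table.containsKey(incoming.pk);
            if (exists && insertOnly) {
                continue; // leave the (possibly customer-modified) row alone
            }
            table.put(incoming.pk, incoming); // INSERT if new, UPDATE if present
            written++;
        }
        return written;
    }
}

class InsertOnlyDemo {
    public static void main(String[] args) {
        SeedLoaderSketch loader = new SeedLoaderSketch();

        // Initial provisioning seeds one store, which the customer later renames.
        loader.load(List.of(new SeedRecord("STORE_1", "Default Store")), true);
        loader.table.put("STORE_1", new SeedRecord("STORE_1", "Customer Renamed Store"));

        // A later release adds a second store to the same seed file.
        List<SeedRecord> seedFile = List.of(
                new SeedRecord("STORE_1", "Default Store"),
                new SeedRecord("STORE_2", "Second Store"));
        int written = loader.load(seedFile, true);

        System.out.println("rows written: " + written);
        for (SeedRecord r : loader.table.values()) {
            System.out.println(r.pk + " -> " + r.name);
        }
    }
}
```

With `insertOnly` set, re-running the same file only adds the rows that are missing; with it unset, the loader behaves as an upsert and would overwrite the renamed store.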
|
Years ago I had suggested an attribute - something like overwrite="true|false". It's in the old Undersun Jira.

-Adrian

(In reply to Bob Morley's message above.)
|
In reply to this post by Joe Eckard
No, it would only support records/elements with a fully-specified primary key.

-David

(In reply to Joe Eckard's message of Aug 25, 2009, 2:42 PM.)
|
In reply to this post by Adrian Crum
I have created OFBIZ-2866, which contains some description and a patch. Here is my approach in brief (as brief as I can be):

Allow the entity-data-reader to indicate whether that reader is going to allow inserts or updates (the default is true/true; for my problem I would use true/false). These values get read by the EntityDataReaderInfo object and ultimately get bubbled down to the delegatorImpl (a new storeAll method that takes the two new parameters).

The biggest change here was that in the past we would gather up all URLs to process in a single generic List. The problem was that at execution time I needed to know which EntityDataReaderInfo is associated with each URL. The change was to use a Map<EntityDataReaderInfo, List<URL>> -- this resulted in a few changes to the signatures and gathering techniques (but not major changes).

Now, it is possible to have URLs you want to process that are not associated with an EntityDataReaderInfo. This is the case for the (apparently defunct) data source sql-load-path, as well as when file/directory arguments are provided to start. In either case, I have instantiated a NO_ENTITY_DATA_READER_INFO EntityDataReaderInfo which uses the default true/true settings.

When doing my testing I noticed that the metrics provided by the data load made no sense. The variable names in the code suggested "changedRows", but the value was actually the number of rows that were read. I felt both were valuable, so I created a new object (EntityDataLoaderResults) which can contain all sorts of metrics -- right now just read / written. The message format was changed to show a nice set of read/written values, with accumulated totals for the entire execution at the bottom.

Found a bug in the DelegatorImpl where it was checking for "dirty" entities. What it was doing is taking the existing entity and the generic value loaded from the data source (typically an xml file) and doing a .get on each of them to get the values (and comparing those values). The trouble is that the .get method is a resource-bundle-aware method, so if one were to change a localized field it would not come up as dirty (and hence would not be persisted). The solution was to make "getFieldValue" public (from protected) and call this method in the delegatorImpl's dirty-checking logic.

Whew! I attached the patch to the JIRA ticket. In our branch I have applied this patch and changed our entityengine.xml to treat seed-initial as true/false (insert only). This allows us to re-execute seed-initial with each migration and only have new records added, leaving any customer-modified seed-initial records alone. Woo hoo!
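
The dirty-checking bug described above can be illustrated with a small sketch. It is a simplified model, not the real Ofbiz classes: `EntitySketch`, `DirtyCheckSketch`, and the field-name-keyed label map are hypothetical, and the actual lookup rules of the resource-bundle-aware getter are not reproduced here. The only point is that a localization-aware getter can return the same value for two entities whose stored values differ.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical, simplified value object; not the real Ofbiz entity class.
class EntitySketch {
    private final Map<String, Object> fields = new HashMap<>();
    // Stand-in for a resource bundle: field name -> localized label (an assumption).
    private final Map<String, String> localizedLabels;

    EntitySketch(Map<String, String> localizedLabels) {
        this.localizedLabels = localizedLabels;
    }

    void set(String name, Object value) {
        fields.put(name, value);
    }

    // Localization-aware getter: a bundle entry wins over the stored value.
    Object get(String name) {
        String label = localizedLabels.get(name);
        return label != null ? label : fields.get(name);
    }

    // Raw getter: returns exactly what is stored in the field.
    Object getFieldValue(String name) {
        return fields.get(name);
    }
}

class DirtyCheckSketch {
    // Problematic comparison: both sides may resolve to the same localized label.
    static boolean isDirtyViaGet(EntitySketch existing, EntitySketch incoming, String field) {
        return !Objects.equals(existing.get(field), incoming.get(field));
    }

    // Comparison on raw stored values, as in the described fix.
    static boolean isDirtyViaFieldValue(EntitySketch existing, EntitySketch incoming, String field) {
        return !Objects.equals(existing.getFieldValue(field), incoming.getFieldValue(field));
    }

    public static void main(String[] args) {
        Map<String, String> bundle = new HashMap<>();
        bundle.put("description", "Localized description");

        EntitySketch existing = new EntitySketch(bundle);
        existing.set("description", "Old description");
        EntitySketch incoming = new EntitySketch(bundle);
        incoming.set("description", "New description");

        System.out.println("dirty via get: "
                + isDirtyViaGet(existing, incoming, "description"));
        System.out.println("dirty via getFieldValue: "
                + isDirtyViaFieldValue(existing, incoming, "description"));
    }
}
```

In this model the first check misses the change because both calls return the bundle label, while the second compares the stored values directly and reports the difference.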
|