Mirko Vogelsmeier created OFBIZ-4870:
----------------------------------------

Summary: Multithreading in GenericDAO / Delegator
Key: OFBIZ-4870
URL: https://issues.apache.org/jira/browse/OFBIZ-4870
Project: OFBiz
Issue Type: Improvement
Reporter: Mirko Vogelsmeier

Hey there,
some time ago there were some commits by Adam that brought in the first ideas for multi-threaded delegator usage (r1139700).
Depending on how intense the data usage or data size is, there are performance issues we cannot solve by scaling hardware and/or configuration alone, as there is just one Delegator object per datasource.
I wanted to check on the progress of this very helpful feature. Are there any further plans to work on this?
Greetings,
Mirko

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[ https://issues.apache.org/jira/browse/OFBIZ-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272860#comment-13272860 ]

Adrian Crum commented on OFBIZ-4870:
------------------------------------

I don't understand how one Delegator object per data source is a performance issue. I believe the real issue is multi-threading. A single object can scale well if it is designed properly.
[ https://issues.apache.org/jira/browse/OFBIZ-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272862#comment-13272862 ]

Adam Heath commented on OFBIZ-4870:
-----------------------------------

There is multi-threading per-datasource during startup. Multiple threads creating the tables/keys/etc. Seed data (from xml) is still single-threaded.

I have a series of changes to fix this locally, but it's *really* complex. Since an xml import may already be part of an *existing* transaction, I absolutely *must* manipulate the database in the current thread. The parsing, however, can be done in a background thread, in parallel. Unfortunately, the xml parser in use is event based, in the wrong direction (push instead of pull), so the bulk of the change is flipping that around. Tbh, the complexity wasn't really worth the speedup (just a few percentage points, afaicr).

What threaded stuff are you still seeking?
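The split Adam describes (parse in a background thread, touch the database only in the calling thread) can be sketched roughly as below. This is a minimal illustration and not the OFBiz code: `ForegroundImportSketch`, `importRows`, and the string "rows" are hypothetical, `trim()` stands in for XML parsing, and appending to a list stands in for the database write that would have to happen inside the caller's transaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class ForegroundImportSketch {
    // Sentinel marking the end of input; compared by identity, never by value.
    private static final String END = new String("<end>");

    // Hypothetical helper: parses in a background thread, "writes" in the caller's thread.
    static List<String> importRows(List<String> rawRows) throws InterruptedException {
        BlockingQueue<String> parsed = new ArrayBlockingQueue<>(16);

        Thread parser = new Thread(() -> {
            try {
                for (String raw : rawRows) {
                    parsed.put(raw.trim()); // stand-in for the expensive parsing step
                }
                parsed.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "xml-parser");
        parser.setDaemon(true);
        parser.start();

        List<String> written = new ArrayList<>();
        try {
            while (true) {
                String row = parsed.take();
                if (row == END) {
                    break;
                }
                // Stand-in for the database write. It runs in the calling thread,
                // so it would stay inside whatever transaction that thread holds.
                written.add(row);
            }
        } finally {
            parser.interrupt(); // no effect if the parser has already finished
        }
        return written;
    }
}
```

The bounded queue effectively turns the parser's push-style output into something the calling thread can pull from, which is the direction flip mentioned above. As Adam notes, the speedup from this arrangement may be small if parsing is not the dominant cost.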
[ https://issues.apache.org/jira/browse/OFBIZ-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272868#comment-13272868 ]

Adrian Crum commented on OFBIZ-4870:
------------------------------------

Adam, I know your question is for the reporter, but I would like for us to discuss the SEDA approach again - but on the dev mailing list and not here.
I wanted to discuss this again.
Some time ago I modified my local copy of OFBiz to execute the ant run-install task faster by using multi-threaded entity creation and data loading. I don't remember what I was working on at the time, but I needed the process to run much faster.

Adam and I discussed my design and he said it sounded similar to SEDA (http://www.eecs.harvard.edu/~mdw/proj/seda/). Actually, my approach was not that sophisticated; I just used a producer-consumer design pattern based on a FIFO queue.

Anyway, based on that conversation, Adam committed the multi-threaded entity creation/data loading code (http://mail-archives.apache.org/mod_mbox/ofbiz-commits/201106.mbox/%3C20110626025049.F302E2388B45@...%3E). The work Adam committed greatly improved the database creation/data loading time.

In this Jira issue, Adam mentions how transactions are tied to a single thread. This is due to a fundamental weakness in OFBiz - the use of ThreadLocal variables. In order to truly remove the bottlenecks in OFBiz, we need to avoid the use of ThreadLocal variables - because they prohibit handing off tasks to other threads. The Execution Context proposed by David Jones some time ago is a step in the right direction, but it uses ThreadLocal variables too.

So, we really need an object that is passed around the framework that represents a TASK state, not a THREAD state - so that tasks can be handed off to multiple threads. Transactions are tasks, so transaction state would be contained in the object, not in ThreadLocal variables. I believe that approach would solve the issue here.

-Adrian
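The distinction between THREAD state and TASK state can be illustrated with a small sketch. Everything here is hypothetical (`TaskStateSketch`, `TaskContext`, `handOff`, and the `transactionId` field are invented for illustration and are not OFBiz APIs). It only shows that a value stored in a `ThreadLocal` is not visible to a worker thread, while an explicitly passed object is; it says nothing about how a transaction manager binds transactions to threads.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class TaskStateSketch {
    // Thread-bound state: only the thread that called set() can read it back.
    static final ThreadLocal<String> THREAD_STATE = new ThreadLocal<>();

    // Hypothetical task-bound state that travels with the task instead of the thread.
    static final class TaskContext {
        final String transactionId;

        TaskContext(String transactionId) {
            this.transactionId = transactionId;
        }
    }

    // Returns { value the worker sees via ThreadLocal, value the worker sees via the context }.
    static String[] handOff(String txId) throws Exception {
        THREAD_STATE.set(txId);
        TaskContext ctx = new TaskContext(txId);

        Callable<String> readThreadLocal = () -> THREAD_STATE.get();
        Callable<String> readContext = () -> ctx.transactionId;

        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Future<String> viaThreadLocal = worker.submit(readThreadLocal);
            Future<String> viaContext = worker.submit(readContext);
            return new String[] { viaThreadLocal.get(), viaContext.get() };
        } finally {
            worker.shutdown();
            THREAD_STATE.remove();
        }
    }
}
```

Calling `TaskStateSketch.handOff("tx-1")` yields `null` for the `ThreadLocal` read and `"tx-1"` for the context read, because the worker thread never had the thread-local value set.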
On 05/10/2012 06:40 PM, Adrian Crum wrote:
> In this Jira issue, Adam mentions how transactions are tied to a
> single thread. This is due to a fundamental weakness in OFBiz - the
> use of ThreadLocal variables. In order to truly remove the bottlenecks
> in OFBiz, we need to avoid the use of ThreadLocal variables - because
> they prohibit handing off tasks to other threads. The Execution
> Context proposed by David Jones some time ago is a step in the right
> direction, but it uses ThreadLocal variables too.

My statement has *nothing* to do with ThreadLocal (well, at least not as used by ofbiz). javax.transaction is for the *current* thread. When an xml import is requested, it *must* be done in the current thread, in the current transaction. It's not possible to suspend the transaction in the foreground thread and resume it in the background (I tried).

I may not have explained myself fully in the jira issue, or maybe you didn't understand. In any case, the rest of your explanation seems to not apply now.

> So, we really need an object that is passed around the framework that
> represents a TASK state, not a THREAD state - so that tasks can be
> handed off to multiple threads. Transactions are tasks, so transaction
> state would be contained in the object, not in ThreadLocal variables.
> I believe that approach would solve the issue here.
On 5/11/2012 12:45 AM, Adam Heath wrote:
> My statement has *nothing* to do with ThreadLocal(well, at least not
> used by ofbiz). javax.transaction is for the *current* thread. When
> an xml import is requested, it *must* be done in the current thread,
> in the current transaction. It's not possible to suspend the
> transaction for the foreground thread, and resume it in the
> background(I tried).

I did not understand that the transaction-to-thread relationship was required in the javax.transaction API. That is a real problem. It makes me wonder how SEDA-style designs can work in transaction-based applications.

-Adrian
On 05/10/2012 06:58 PM, Adrian Crum wrote:
> I did not understand that the transaction-to-thread relationship was
> required in the javax.transaction API. That is a real problem. It makes
> me wonder how SEDA-style designs can work in transaction-based
> applications.

In this case, you also can't split each row in the xml into a separate thread (transaction deadlock), nor into batches in separate threads (postgresql has delayed foreign key resolution).

In a more general sense, javax.transaction has resource callbacks that happen at various transaction states. Those have to run in the foreground.

Also in a more general sense, the foreground thread may have added callbacks to the general purpose container. Said container needs to be careful about which parts it puts into the background. For instance, UtilCache has support for listeners when things get added/removed. Those also should be run in the foreground.

The only parts that can be pushed onto other work systems must be *completely* controlled. *No* abstract classes or interfaces, or otherwise the control code can't guarantee what is happening. This generalization is the same as the locking requirements talked about by Java Concurrency in Practice.

If you really want to improve threading in ofbiz, a good first start would be removing 'synchronized' from places. But that is tricky if you haven't done it before. I've got several commits locally that remove synchronized from lots of places, but need time to finish them off.
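As a rough idea of what "removing 'synchronized'" can look like, the sketch below keeps a per-key counter without any `synchronized` block by relying on `java.util.concurrent` classes. It is a generic example, not taken from the OFBiz code base or from Adam's local commits; the class and method names are hypothetical, and it assumes Java 8 or later for `computeIfAbsent`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class UnsynchronizedRegistrySketch {
    // ConcurrentHashMap plus AtomicInteger replace a synchronized HashMap<String, Integer>.
    private final Map<String, AtomicInteger> hits = new ConcurrentHashMap<>();

    // Atomically creates the counter on first use, then increments it.
    int record(String key) {
        return hits.computeIfAbsent(key, k -> new AtomicInteger()).incrementAndGet();
    }

    int count(String key) {
        AtomicInteger counter = hits.get(key);
        return counter == null ? 0 : counter.get();
    }

    // Hammers one key from several threads and returns the final count.
    static int demo(int threads, int perThread) throws InterruptedException {
        UnsynchronizedRegistrySketch registry = new UnsynchronizedRegistrySketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    registry.record("k");
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return registry.count("k");
    }
}
```

The tricky part Adam alludes to is that this only works when each operation is atomic on its own. Code that needs several reads and writes to appear as one step still needs a lock or a redesign.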
On 5/11/2012 1:28 AM, Adam Heath wrote:
> If you really want to improve threading in ofbiz, a good first start
> would be removing 'synchronized' from places. But that is tricky if
> you haven't done it before. I've got several commits locally that
> remove synchronized from lots of places, but need time to finish them
> off.

Btw, I wasn't trying to improve the threading, I was trying to improve responsiveness under heavy load. The feedback loop in SEDA prevents a user from encountering a request that seems to lock up or time out due to a busy server.

-Adrian
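One simple way to get the "fail fast instead of hanging" behaviour Adrian mentions is a stage with a bounded queue that rejects work when it is full. The sketch below is a simplification and not SEDA itself, whose controllers are more elaborate; `BoundedStageSketch` and `submitBurst` are hypothetical names, and the latch only exists to keep the single worker busy so that the queue fills up.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedStageSketch {
    // Submits a burst of requests to a one-thread stage with a bounded queue.
    // Returns { accepted, rejected }.
    static int[] submitBurst(int requests, int queueCapacity) throws InterruptedException {
        ThreadPoolExecutor stage = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy()); // reject instead of queueing forever

        CountDownLatch release = new CountDownLatch(1);
        int accepted = 0;
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    stage.execute(() -> {
                        try {
                            release.await(); // simulate a slow request
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                    accepted++;
                } catch (RejectedExecutionException busy) {
                    // The caller learns immediately that the server is busy
                    // and can answer with a "try again later" response.
                    rejected++;
                }
            }
        } finally {
            release.countDown();
            stage.shutdown();
        }
        stage.awaitTermination(5, TimeUnit.SECONDS);
        return new int[] { accepted, rejected };
    }
}
```

With one worker and a queue of capacity 2, a burst of 5 requests accepts 3 (one running, two queued) and rejects 2. The rejection is the feedback signal: the front end can shed load rather than let requests appear to lock up.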