[jira] [Created] (OFBIZ-4870) Multithreading in GenericDAO / Delegator


[jira] [Created] (OFBIZ-4870) Multithreading in GenericDAO / Delegator

Nicolas Malin (Jira)
Mirko Vogelsmeier created OFBIZ-4870:
----------------------------------------

             Summary: Multithreading in GenericDAO / Delegator
                 Key: OFBIZ-4870
                 URL: https://issues.apache.org/jira/browse/OFBIZ-4870
             Project: OFBiz
          Issue Type: Improvement
            Reporter: Mirko Vogelsmeier


Hey there,

Some time ago Adam made some commits that brought in the first ideas for multi-threaded Delegator usage (r1139700).
Depending on how intensive the data usage is or how large the data is, there are performance issues we cannot solve by scaling hardware and/or configuration alone, as there is just one Delegator object per data source.
I wanted to check on the progress of this very helpful feature. Are there any further plans to work on this?

Greetings,
Mirko

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (OFBIZ-4870) Multithreading in GenericDAO / Delegator

Nicolas Malin (Jira)

    [ https://issues.apache.org/jira/browse/OFBIZ-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272860#comment-13272860 ]

Adrian Crum commented on OFBIZ-4870:
------------------------------------

I don't understand how one Delegator object per data source is a performance issue. I believe the real issue is multi-threading.

A single object can scale well if it is designed properly.

               



[jira] [Commented] (OFBIZ-4870) Multithreading in GenericDAO / Delegator

Nicolas Malin (Jira)

    [ https://issues.apache.org/jira/browse/OFBIZ-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272862#comment-13272862 ]

Adam Heath commented on OFBIZ-4870:
-----------------------------------

There is multi-threading per data source during startup: multiple threads create the tables, keys, etc.  Seed data (from XML) is still loaded single-threaded.

I have a series of changes to fix this locally, but it's *really* complex.  Since an XML import may already be part of an *existing* transaction, I absolutely *must* manipulate the database in the current thread.  The parsing, however, can be done in a background thread, in parallel.  Unfortunately, the XML parser in use is event based, in the wrong direction (push instead of pull), so the bulk of the change is flipping that around.  To be honest, the complexity wasn't really worth the speedup (just a few percentage points, as far as I can recall).

What threaded stuff are you still seeking?
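
The split described in this comment (parse in a background thread, touch the database only in the calling thread) could look roughly like the Java sketch below. It is an illustration under stated assumptions, not the local changes mentioned above, which are not shown in this thread. The class name BackgroundParseSketch and the method importXml are hypothetical, the "database write" is just a list append, and StAX is used only because it is a pull parser that ships with the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicReference;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Illustrative sketch only: the class and method names are hypothetical, not OFBiz APIs.
class BackgroundParseSketch {

    /** Parses in a background thread, but "stores" each row in the calling thread. */
    static List<String> importXml(final String xml) {
        // Bounded FIFO queue; an empty Optional marks the end of the input.
        final BlockingQueue<Optional<String>> queue = new ArrayBlockingQueue<>(64);
        final AtomicReference<Exception> failure = new AtomicReference<>();

        Thread parser = new Thread(() -> {
            try {
                // StAX is a pull parser: this thread asks for the next event.
                XMLStreamReader reader = XMLInputFactory.newInstance()
                        .createXMLStreamReader(new StringReader(xml));
                int depth = 0;
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT) {
                        depth++;
                        if (depth == 2) { // direct children of the root are the rows
                            queue.put(Optional.of(reader.getLocalName()));
                        }
                    } else if (event == XMLStreamConstants.END_ELEMENT) {
                        depth--;
                    }
                }
            } catch (Exception e) {
                failure.set(e);
            } finally {
                try {
                    queue.put(Optional.empty()); // always signal completion
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "xml-parser");
        parser.setDaemon(true);
        parser.start();

        List<String> stored = new ArrayList<>();
        try {
            while (true) {
                Optional<String> row = queue.take();
                if (!row.isPresent()) {
                    break;
                }
                // Stand-in for the database write, which stays on the caller's
                // thread and therefore inside the caller's transaction.
                stored.add(row.get());
            }
            parser.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("import interrupted", e);
        }
        if (failure.get() != null) {
            throw new IllegalStateException("XML parsing failed", failure.get());
        }
        return stored;
    }

    public static void main(String[] args) {
        String xml = "<entity-engine-xml><Party partyId=\"1\"/><Person partyId=\"1\"/></entity-engine-xml>";
        System.out.println(importXml(xml));
    }
}
```

The queue is what turns pushed parser events into something the foreground thread can pull at its own pace. Only parsing is overlapped here, which fits the modest speedup reported above.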
               



[jira] [Commented] (OFBIZ-4870) Multithreading in GenericDAO / Delegator

Nicolas Malin (Jira)

    [ https://issues.apache.org/jira/browse/OFBIZ-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272868#comment-13272868 ]

Adrian Crum commented on OFBIZ-4870:
------------------------------------

Adam, I know your question is for the reporter, but I would like for us to discuss the SEDA approach again - but on the dev mailing list and not here.


               



OFBiz multi-threading (was: [jira] [Commented] (OFBIZ-4870) Multithreading in GenericDAO / Delegator)

Adrian Crum-3
I wanted to discuss this again.

Some time ago I modified my local copy of OFBiz to execute the ant
run-install task faster by using multi-threaded entity creation and data
loading. I don't remember what I was working on at the time, but I
needed the process to run much faster.

Adam and I discussed my design and he said it sounded like it was
similar to SEDA (http://www.eecs.harvard.edu/~mdw/proj/seda/). Actually,
my approach was not that sophisticated; I just used a producer-consumer
design pattern based on a FIFO queue.

Anyway, based on that conversation, Adam committed the multi-threaded
entity creation/data loading code
(http://mail-archives.apache.org/mod_mbox/ofbiz-commits/201106.mbox/%3C20110626025049.F302E2388B45@...%3E).
The work Adam committed greatly improved the database creation/data
loading time.

In this Jira issue, Adam mentions how transactions are tied to a single
thread. This is due to a fundamental weakness in OFBiz - the use of
ThreadLocal variables. In order to truly remove the bottlenecks in
OFBiz, we need to avoid the use of ThreadLocal variables - because they
prohibit handing off tasks to other threads. The Execution Context
proposed by David Jones some time ago is a step in the right direction,
but it uses ThreadLocal variables too.

So, we really need an object that is passed around the framework that
represents a TASK state, not a THREAD state - so that tasks can be
handed off to multiple threads. Transactions are tasks, so transaction
state would be contained in the object, not in ThreadLocal variables. I
believe that approach would solve the issue here.

-Adrian
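
The difference between THREAD state and TASK state described in this message can be shown with a small Java sketch. It is only an illustration: TaskContext, CURRENT_USER and runOnWorker are hypothetical names, not OFBiz or Execution Context APIs. It also says nothing about how a javax.transaction implementation associates a transaction with a thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative sketch only: TaskContext is a hypothetical name, not an OFBiz class.
class TaskStateSketch {

    // Thread-bound state: visible only to the thread that set it.
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Task-bound state: an immutable object that travels with the task.
    static final class TaskContext {
        final String userLoginId;

        TaskContext(String userLoginId) {
            this.userLoginId = userLoginId;
        }
    }

    /** Returns what a worker thread sees: { thread-local value, context value }. */
    static String[] runOnWorker(String user) {
        CURRENT_USER.set(user);
        final TaskContext context = new TaskContext(user);
        Callable<String[]> task =
                () -> new String[] { CURRENT_USER.get(), context.userLoginId };
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            return worker.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            worker.shutdown();
            CURRENT_USER.remove();
        }
    }

    public static void main(String[] args) {
        String[] seen = runOnWorker("admin");
        System.out.println("thread-local on worker: " + seen[0]);
        System.out.println("context on worker: " + seen[1]);
    }
}
```

The worker thread does not see the value stored in the plain ThreadLocal, while the explicitly passed context object arrives intact. That is the hand-off problem in miniature.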



Re: OFBiz multi-threading

Adam Heath-2
On 05/10/2012 06:40 PM, Adrian Crum wrote:

> In this Jira issue, Adam mentions how transactions are tied to a
> single thread. This is due to a fundamental weakness in OFBiz - the
> use of ThreadLocal variables. In order to truly remove the bottlenecks
> in OFBiz, we need to avoid the use of ThreadLocal variables - because
> they prohibit handing off tasks to other threads. The Execution
> Context proposed by David Jones some time ago is a step in the right
> direction, but it uses ThreadLocal variables too.

My statement has *nothing* to do with ThreadLocal (well, at least not
one used by OFBiz).  javax.transaction is for the *current* thread.  When
an XML import is requested, it *must* be done in the current thread,
in the current transaction.  It's not possible to suspend the
transaction in the foreground thread and resume it in the
background (I tried).

I may not have explained myself fully in the jira issue, or maybe you
didn't understand.  In any case, the rest of your explanation seems to
not apply now.

> So, we really need an object that is passed around the framework that
> represents a TASK state, not a THREAD state - so that tasks can be
> handed off to multiple threads. Transactions are tasks, so transaction
> state would be contained in the object, not in ThreadLocal variables.
> I believe that approach would solve the issue here.

Re: OFBiz multi-threading

Adrian Crum-3
On 5/11/2012 12:45 AM, Adam Heath wrote:

> My statement has *nothing* to do with ThreadLocal (well, at least not
> one used by OFBiz).  javax.transaction is for the *current* thread.  When
> an XML import is requested, it *must* be done in the current thread,
> in the current transaction.  It's not possible to suspend the
> transaction in the foreground thread and resume it in the
> background (I tried).
>
> I may not have explained myself fully in the jira issue, or maybe you
> didn't understand.  In any case, the rest of your explanation seems to
> not apply now.

I did not understand that the transaction-to-thread relationship was
required in the javax.transaction API. That is a real problem. It makes
me wonder how SEDA-style designs can work in transaction-based applications.

-Adrian


Re: OFBiz multi-threading

Adam Heath-2
On 05/10/2012 06:58 PM, Adrian Crum wrote:

> I did not understand that the transaction-to-thread relationship was
> required in the javax.transaction API. That is a real problem. It makes
> me wonder how SEDA-style designs can work in transaction-based
> applications.

In this case, you also can't split each row in the XML into a separate
thread (transaction deadlock), nor into batches in separate
threads (PostgreSQL has delayed foreign key resolution).

In a more general sense, javax.transaction has resource callbacks that
happen at various transaction states.  Those have to run in the foreground.

More generally still, the foreground thread may have added callbacks
to the general purpose container.  Said container needs to be careful
about which parts it puts into the background.  For instance, UtilCache
has support for listeners when things get added/removed.  Those also
should be run in the foreground.

The only parts that can be pushed onto other work systems must be
*completely* controlled.  *No* abstract classes or interfaces, or
otherwise the control code can't guarantee what is happening.  This
generalization is the same as the locking requirements discussed in
Java Concurrency in Practice.

If you really want to improve threading in ofbiz, a good first start
would be removing 'synchronized' from places.  But that is tricky if you
haven't done it before.  I've got several commits locally that remove
synchronized from lots of places, but need time to finish them off.
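
As a rough idea of what removing 'synchronized' can involve, here is a small Java sketch that replaces a lock-guarded lazy map with a ConcurrentHashMap. It is an assumption-laden illustration, not one of the local commits mentioned above: LockedRegistry, ConcurrentRegistry and loadsUnderContention are hypothetical names, and real OFBiz classes will have invariants that this toy example does not.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: these classes are hypothetical and not taken from OFBiz.
class SynchronizedRemovalSketch {

    // Before: every call takes the same lock, even when the value is already cached.
    static final class LockedRegistry {
        private final Map<String, String> values = new HashMap<>();

        synchronized String get(String key) {
            String value = values.get(key);
            if (value == null) {
                value = "value-for-" + key;
                values.put(key, value);
            }
            return value;
        }
    }

    // After: the concurrent map handles the locking internally and still
    // guarantees that the value for a key is computed at most once.
    static final class ConcurrentRegistry {
        private final ConcurrentHashMap<String, String> values = new ConcurrentHashMap<>();
        final AtomicInteger loads = new AtomicInteger();

        String get(String key) {
            return values.computeIfAbsent(key, k -> {
                loads.incrementAndGet();
                return "value-for-" + k;
            });
        }
    }

    /** Calls get() for one key from several threads and returns how often it was computed. */
    static int loadsUnderContention(int threads) {
        final ConcurrentRegistry registry = new ConcurrentRegistry();
        Thread[] callers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            callers[i] = new Thread(() -> registry.get("party"));
            callers[i].start();
        }
        try {
            for (Thread caller : callers) {
                caller.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return registry.loads.get();
    }

    public static void main(String[] args) {
        System.out.println("loads = " + loadsUnderContention(8));
    }
}
```

The tricky part alluded to above is that a synchronized method often protects more than one field at once. Swapping in a concurrent collection is only safe when the map is the only shared state being guarded.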

Re: OFBiz multi-threading

Adrian Crum-3
On 5/11/2012 1:28 AM, Adam Heath wrote:

> If you really want to improve threading in ofbiz, a good first start
> would be removing 'synchronized' from places.  But that is tricky if
> you haven't done it before.  I've got several commits locally that
> remove synchronized from lots of places, but need time to finish them
> off.

Btw, I wasn't trying to improve the threading; I was trying to improve
responsiveness under heavy load. The feedback loop in SEDA prevents a
user from encountering a request that seems to lock up or time out due to
a busy server.

-Adrian
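
One very simple way to get the kind of responsiveness described above is a stage with a bounded queue that rejects work immediately when it is full, so the caller can answer "server busy" instead of hanging. The Java sketch below shows only that load-shedding idea. It is not SEDA itself, whose controllers are more elaborate, and BoundedStageSketch and countRejected are hypothetical names.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only: a load-shedding stage, not the SEDA implementation.
class BoundedStageSketch {

    /**
     * Submits the given number of slow requests to a stage with one worker
     * thread and a bounded queue, and returns how many were rejected at once.
     */
    static int countRejected(int requests, int queueSize) {
        ThreadPoolExecutor stage = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueSize),
                new ThreadPoolExecutor.AbortPolicy()); // reject instead of blocking
        final CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    stage.execute(() -> {
                        try {
                            release.await(); // simulates a request stuck on a busy server
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException busy) {
                    rejected++; // the caller learns immediately that the stage is full
                }
            }
        } finally {
            release.countDown(); // let the accepted requests finish
            stage.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("rejected = " + countRejected(5, 2));
    }
}
```

With one worker and two queue slots, three of the five requests are accepted (one running, two waiting) and the remaining two are turned away straight away rather than timing out later.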