Hello Everyone,
I have an ofbiz instance in production where none of the jobs are being performed. I have about 160K jobs in pending status, but they are never being scheduled.

I can see the following in the log:

2011-07-13 13:32:01,959 (org.ofbiz.service.job.JobPoller@2599930b) [ JobManager.java:201:ERROR]
---- exception report ----------------------------------------------------------
Transaction error trying to commit when polling and updating the JobSandbox: org.ofbiz.entity.transaction.GenericTransactionException: Roll back error (with no rollbackOnly cause found), could not commit transaction, was rolled back instead: javax.transaction.RollbackException: Transaction timeout (Transaction timeout)
Exception: org.ofbiz.entity.transaction.GenericTransactionException
Message: Roll back error (with no rollbackOnly cause found), could not commit transaction, was rolled back instead: javax.transaction.RollbackException: Transaction timeout (Transaction timeout)
---- cause ---------------------------------------------------------------------
Exception: javax.transaction.RollbackException
Message: Transaction timeout
---- stack trace ---------------------------------------------------------------
javax.transaction.RollbackException: Transaction timeout
org.apache.geronimo.transaction.manager.TransactionImpl.commit(TransactionImpl.java:269)
org.apache.geronimo.transaction.manager.TransactionManagerImpl.commit(TransactionManagerImpl.java:245)
org.ofbiz.entity.transaction.TransactionUtil.commit(TransactionUtil.java:259)
org.ofbiz.entity.transaction.TransactionUtil.commit(TransactionUtil.java:245)
org.ofbiz.service.job.JobManager.poll(JobManager.java:197)
org.ofbiz.service.job.JobPoller.run(JobPoller.java:90)
java.lang.Thread.run(Thread.java:619)
--------------------------------------------------------------------------------

I believe that the JobManager is not able to handle all those jobs to schedule them, so nothing is being scheduled, which of course makes the job list longer.

Can anyone think of how to make the jobs run?

All help much appreciated,
--
Josh.
The key is the transaction timeout.
This could be the job length.
It could be the database connection.

Please specify the version of ofbiz, since earlier transaction problems were taken care of by changing the code that deals with transactions.

Josh Jacobson sent the following on 7/13/2011 11:48 AM:
> I have an ofbiz instance in production where none of the jobs are
> being performed.
> [...]
In reply to this post by Josh Jacobson
Josh,
I've also seen this problem when the JobSandbox table has too many rows to process. I ran into a similar problem when I tried to run 10,000 async batch processes. The JobPoller took so long to process all the records that the transaction timed out.

I had a patch to change the transaction timeout for the JobPoller specifically, as that wasn't available in ofbiz at the time, but I don't think I ever submitted it. I could look for this patch if anyone is interested, but it may already be implemented in the framework.

I would try archiving jobs from the JobSandbox first.

Brett

On Wed, Jul 13, 2011 at 12:48 PM, Josh Jacobson wrote:
> I have an ofbiz instance in production where none of the jobs are
> being performed.
> [...]
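To make the archiving suggestion concrete, here is a minimal in-memory sketch in plain Java. It is not OFBiz code: the `Job` class, the `Status` enum and the `archiveCompleted` helper are hypothetical stand-ins for JobSandbox rows and whatever export or delete step you would really use. The idea is only that rows which are no longer pending get moved out of the set the poller has to scan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ArchiveFinishedJobs {
    // Hypothetical, simplified statuses; these are not the real OFBiz status IDs.
    public enum Status { PENDING, FINISHED, FAILED }

    // Hypothetical stand-in for one JobSandbox row.
    public static final class Job {
        final String id;
        final Status status;

        public Job(String id, Status status) {
            this.id = id;
            this.status = status;
        }
    }

    // Moves every job that is no longer pending from 'live' to 'archive'
    // and returns how many rows were moved.
    public static int archiveCompleted(List<Job> live, List<Job> archive) {
        int moved = 0;
        for (Iterator<Job> it = live.iterator(); it.hasNext();) {
            Job job = it.next();
            if (job.status != Status.PENDING) {
                archive.add(job);
                it.remove();
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        List<Job> live = new ArrayList<>(Arrays.asList(
                new Job("a", Status.FINISHED),
                new Job("b", Status.PENDING),
                new Job("c", Status.FAILED)));
        List<Job> archive = new ArrayList<>();
        int moved = archiveCompleted(live, archive);
        System.out.println("moved=" + moved + " stillLive=" + live.size());
    }
}
```

On a real system the same split would be done in the database, and you would want a backup before deleting anything.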
In reply to this post by BJ Freeman
BJ,
I am running 10.04.

On Wed, Jul 13, 2011 at 12:00 PM, BJ Freeman wrote:
> please specify the version of ofbiz since earlier transaction problems
> were taken care of by changing code that deals with transactions.
In reply to this post by Brett
Brett,
Can you please explain what you mean by archiving the current JobSandbox first? Do you mean somehow removing the current pending jobs, applying your patch, and then copying them back again?

Thanks,

On Wed, Jul 13, 2011 at 12:08 PM, Brett Palmer wrote:
> I would try archiving jobs from the JobSandbox first.
In reply to this post by Josh Jacobson
OK, so you have the latest code.
What is the environment you are working with?
OS
Memory
CPU speed

Josh Jacobson sent the following on 7/13/2011 12:12 PM:
> BJ,
>
> I am running 10.04.
In reply to this post by Josh Jacobson
I meant removing finished jobs. If you have thousands of pending jobs then you will have the same problem I mentioned in my first email. One resolution would be to increase the job poller transaction timeout. In the ofbiz version I was using there was no way to configure the poller transaction timeout; it just used the default. I had to create a patch to allow this.

In the patch you had to be careful not to increase the transaction timeout beyond the interval at which the job poller runs. Otherwise you get into a lock situation where one job poller is still running within a transaction when another poller starts. This didn't create a huge problem, but the second job poller would usually lock and then time out.

Brett

On Wed, Jul 13, 2011 at 1:15 PM, Josh Jacobson wrote:
> Brett,
>
> Can you please explain what you mean by archiving the current JobSandbox
> first?
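The constraint described above (keep the poller's transaction timeout shorter than the interval between polls) can be written down as a small check. This is only a sketch in plain Java: the class and method names are made up for illustration, and the numbers in `main` are hypothetical values, not OFBiz defaults or real configuration settings.

```java
public class PollerTimeoutCheck {
    // True when a poll transaction that runs right up to its timeout
    // has still ended before the next poll begins.
    public static boolean isSafe(long txTimeoutSeconds, long pollIntervalSeconds) {
        return txTimeoutSeconds < pollIntervalSeconds;
    }

    // Largest timeout that still leaves 'marginSeconds' of slack before
    // the next poll; never negative.
    public static long maxSafeTimeoutSeconds(long pollIntervalSeconds, long marginSeconds) {
        return Math.max(0L, pollIntervalSeconds - marginSeconds);
    }

    public static void main(String[] args) {
        long pollIntervalSeconds = 30L; // hypothetical polling interval
        long proposedTimeout = 60L;     // hypothetical timeout someone wants to set

        if (!isSafe(proposedTimeout, pollIntervalSeconds)) {
            // With these values a second poll can start while the first
            // transaction is still open, which is the lock situation above.
            System.out.println("Timeout too long; at most "
                    + maxSafeTimeoutSeconds(pollIntervalSeconds, 5L) + "s is safe");
        }
    }
}
```

If the work per poll cannot fit inside the interval, the other option is to lengthen the interval as well, so that both values grow together.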
In reply to this post by BJ Freeman
Currently I am running:

Red Hat Enterprise Linux Server release 5.5
6 CPUs, 16384MB RAM

It was very recently upgraded from 2 CPUs and 8GB of RAM because we were having performance issues (lots of swap memory being used). It's on one of those cloud servers. Now it's running without using any swap.

On Wed, Jul 13, 2011 at 12:22 PM, BJ Freeman wrote:
> Ok so you have the latest code.
> what is the environment you are working with.
> OS
> Memory
> CPU speed
In reply to this post by Brett
On Wed, Jul 13, 2011 at 12:31 PM, Brett Palmer wrote:
> I meant removing finished jobs. If you have thousands of pending jobs then
> you will have the same problem I mentioned in my first email. One
> resolution will be to increase the job poller transaction time. In the
> ofbiz version I was using there was not a way to configure the poller
> transaction time. It just used the default time. I had to create a patch
> to allow this to happen.

I see. I already did that: we had 2.6 million rows in the JobSandbox, mostly completed or failed jobs. We deleted the completed and failed ones and are now looking at about 260L pending jobs. I want to run those jobs so I can get the machine back to normal.

> In the patch you had to be careful to not increase the transaction time
> greater than the frequency of the job poller. Otherwise you get into a lock
> situation where one job poller is still running within a transaction and
> another poller starts. This didn't create a huge problem but the second job
> poller would usually lock and then time out.

I understand the possible race condition. So how do I figure out what to set the timeout to, and where do I configure that?

Thanks,
--
Josh.
On Wed, Jul 13, 2011 at 12:51 PM, Josh Jacobson wrote:
> We deleted completed and failed
> and are now looking at about 260L pending jobs. I want to run those
> jobs, so I can get the machine back to normal.

Sorry, just noticed the typo: we currently have 260K+ jobs pending, and I want to process them to get things back to normal.

Thanks for the help,
--
Josh.
In reply to this post by Josh Jacobson
You now know why I don't recommend cloud configurations for realtime operations, unless you're running over dedicated lines that are not part of the internet.
To summarize: your environment caused the problem, not ofbiz.
Now you have jobs queued that should have been run but have piled up.
You need a way to get the jobs run so they don't time out the system.
I recommend you look at the purge old jobs service, then copy and modify it to run your jobs, maybe by time group.

Josh Jacobson sent the following on 7/13/2011 12:48 PM:
> It was very recently upgraded from 2 CPUs and 8GB of RAM because we
> were having performance issues (lots of swap memory being used). It's
> on one of those cloud servers. Now it's running without using any
> swap.
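One way to read "by time group" is to split the backlog into fixed windows of scheduled run time and work through one window at a time, so no single pass touches the whole table. The sketch below is plain Java over an in-memory list. `PendingJob`, `groupByWindow` and the one-hour window are assumptions made for illustration; none of it is taken from the purge old jobs service itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BacklogByTimeGroup {
    // Hypothetical stand-in for a pending JobSandbox row:
    // an id plus its scheduled run time in epoch milliseconds.
    public static final class PendingJob {
        final String jobId;
        final long runTimeMillis;

        public PendingJob(String jobId, long runTimeMillis) {
            this.jobId = jobId;
            this.runTimeMillis = runTimeMillis;
        }
    }

    // Groups job ids by the start of the time window their run time falls in.
    // The TreeMap keeps windows in chronological order, oldest first.
    public static Map<Long, List<String>> groupByWindow(List<PendingJob> jobs, long windowMillis) {
        Map<Long, List<String>> groups = new TreeMap<>();
        for (PendingJob job : jobs) {
            long windowStart = Math.floorDiv(job.runTimeMillis, windowMillis) * windowMillis;
            groups.computeIfAbsent(windowStart, k -> new ArrayList<>()).add(job.jobId);
        }
        return groups;
    }

    public static void main(String[] args) {
        long oneHour = 60L * 60L * 1000L; // hypothetical window size
        List<PendingJob> backlog = Arrays.asList(
                new PendingJob("job-1", 0L),
                new PendingJob("job-2", oneHour - 1L),
                new PendingJob("job-3", oneHour));

        // Each window would be handled in its own short transaction.
        for (Map.Entry<Long, List<String>> window : groupByWindow(backlog, oneHour).entrySet()) {
            System.out.println("window " + window.getKey() + " -> " + window.getValue());
        }
    }
}
```

The window size is the knob here: smaller windows mean more passes, but each pass is more likely to finish inside the transaction timeout.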
Thanks for the pointers. I'll take a look.

There is one more piece of information: the purgeOldJobs service is in a "crashed" status. Do you think that is significant?

Thanks,

On Wed, Jul 13, 2011 at 4:32 PM, BJ Freeman wrote:
> I recommend you look at the purge old jobs service, copy and modify it
> to run your jobs, maybe by time group.
It means finished jobs will not be purged, so you will get a build-up.
You can run the service to start it again.
|
Thanks, that is what I figured. First things first though: I need to
get those jobs running somehow. Thanks for the help.
|
Brett,
Before I start trying to run the jobs manually, I want to give your suggestion a try. I think I know where to configure the job polling transaction time (I believe it's the poll-db-millis="20000" value in framework/service/config/serviceengine.xml).

However, I still don't know what to increase it to. I understand that we wouldn't want to make it bigger than the default polling interval. Do you know what the default interval between polling is?

Thanks,
|
That configuration is for the frequency of job polls. There isn't any ability to specify the transaction timeout via configuration so you'll need to modify the code directly:
JobManager.java (line 148):
beganTransaction = TransactionUtil.begin();
needs to be changed to use TransactionUtil.begin(int)

Regards
Scott

HotWax Media
http://www.hotwaxmedia.com
|
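To illustrate why a poll over a very large JobSandbox can hit the transaction timeout, here is a minimal, self-contained Java sketch. It does not use any OFBiz classes: `TimedTransaction`, `pollCommits`, and the cost of 1 ms per row are all assumptions made up for the illustration. In a real patch the change would be the one described above, passing a timeout to `TransactionUtil.begin(int)` (the value is probably in seconds, but check the method in your OFBiz version).

```java
import java.util.function.LongSupplier;

/** Stand-in model of a transaction with a timeout. NOT the OFBiz API. */
public class PollTimeoutSketch {

    /** Thrown when commit is attempted after the deadline, similar to the RollbackException in the log. */
    static class TimeoutRollback extends RuntimeException {
        TimeoutRollback(String msg) { super(msg); }
    }

    /** Hypothetical transaction that remembers the deadline computed when it began. */
    static class TimedTransaction {
        private final LongSupplier clockMillis;
        private final long deadlineMillis;

        TimedTransaction(LongSupplier clockMillis, int timeoutSeconds) {
            this.clockMillis = clockMillis;
            this.deadlineMillis = clockMillis.getAsLong() + timeoutSeconds * 1000L;
        }

        void commit() {
            if (clockMillis.getAsLong() > deadlineMillis) {
                throw new TimeoutRollback("Transaction timeout");
            }
        }
    }

    /**
     * Simulates one poll that touches `rows` rows at `millisPerRow` each inside
     * a transaction with the given timeout. Returns true if the commit succeeds.
     */
    public static boolean pollCommits(int rows, long millisPerRow, int timeoutSeconds) {
        long[] now = {0L}; // fake clock so the sketch runs instantly
        TimedTransaction tx = new TimedTransaction(() -> now[0], timeoutSeconds);
        for (int i = 0; i < rows; i++) {
            now[0] += millisPerRow; // pretend to update one JobSandbox row
        }
        try {
            tx.commit();
            return true;
        } catch (TimeoutRollback e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 160,000 pending rows at an assumed 1 ms each is 160 s of work.
        System.out.println(pollCommits(160_000, 1, 60));  // shorter than the work: rolled back
        System.out.println(pollCommits(160_000, 1, 300)); // longer than the work: commits
    }
}
```

The numbers are only there to show the shape of the problem: the timeout has to exceed the time one poll spends inside its transaction, so reducing the number of rows the poller has to touch helps just as much as raising the timeout.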
Scott,
Thanks! That is very precise advice. Do you have a suggestion on interval time? 60 seconds? 120?

Thanks,
|
As best I can tell there shouldn't be any need to increase the interval between polls since the interval timer doesn't actually start until the previous poll has completed (see JobPoller.run()) so I can't see how a small interval would cause any backlog problems.
I'm guessing if there is any lock contention then it's probably caused by the executing jobs trying to update their respective rows while the poller is holding a table lock. So from that point of view I guess increasing the interval could reduce the amount of contention between the executing jobs and the next poll.

Regards
Scott
|
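The timing described in the message above (the interval wait starts only once the previous poll has completed) can be sketched in plain Java. This is a model of that reading of JobPoller.run(), not the actual OFBiz code; `fixedDelayStarts`, `overlaps`, and the example durations are hypothetical names and values chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Models a poller whose interval wait begins only after the previous poll finishes. */
public class PollIntervalSketch {

    /** Returns the start time (in ms) of each poll, given how long each poll takes. */
    public static List<Long> fixedDelayStarts(long[] pollDurationsMillis, long intervalMillis) {
        List<Long> starts = new ArrayList<>();
        long now = 0L;
        for (long duration : pollDurationsMillis) {
            starts.add(now);
            now += duration;       // the poll runs to completion first
            now += intervalMillis; // only then does the interval wait begin
        }
        return starts;
    }

    /** True if any poll starts before the previous one has finished. */
    public static boolean overlaps(List<Long> starts, long[] durations) {
        for (int i = 1; i < starts.size(); i++) {
            if (starts.get(i) < starts.get(i - 1) + durations[i - 1]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long[] durations = {90_000L, 5_000L, 5_000L}; // first poll is slow (90 s)
        List<Long> starts = fixedDelayStarts(durations, 20_000L);
        System.out.println(starts);                      // [0, 110000, 135000]
        System.out.println(overlaps(starts, durations)); // false
    }
}
```

Under this model a slow poll simply delays the next one, so two polls from the same poller never overlap regardless of how small the interval is. Contention would then have to come from elsewhere, such as running jobs updating their own rows.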
Thanks again. I actually meant a suggestion for the transaction
timeout. In any case I am grateful for your explanation.

On Wednesday, July 13, 2011, Scott Gray <[hidden email]> wrote:
> As best I can tell there shouldn't be any need to increase the interval between polls since the interval timer doesn't actually start until the previous poll has completed (see JobPoller.run()) so I can't see how a small interval would cause any backlog problems.
>
> I'm guessing if there is any lock contention then it's probably caused by the executing jobs trying to update their respective rows while the poller is holding a table lock. So from that point of view I guess increasing the interval could reduce the amount of contention between the executing jobs and the next poll.
>
> Regards
> Scott
>
> On 14/07/2011, at 1:02 PM, Josh Jacobson wrote:
>
>> Scott,
>>
>> Thanks! That is very precise advice. Do you have a suggestion on
>> interval time? 60 seconds? 120?
>>
>> Thanks,
>>
>> On Wed, Jul 13, 2011 at 5:34 PM, Scott Gray <[hidden email]> wrote:
>>> That configuration is for the frequency of job polls. There isn't any ability to specify the transaction timeout via configuration so you'll need to modify the code directly:
>>> JobManager.java (line 148):
>>> beganTransaction = TransactionUtil.begin();
>>> needs to be changed to use TransactionUtil.begin(int)
>>>
>>> Regards
>>> Scott
>>>
>>> HotWax Media
>>> http://www.hotwaxmedia.com
>>>
>>> On 14/07/2011, at 12:23 PM, Josh Jacobson wrote:
>>>
>>>> Brett,
>>>>
>>>> Before I start trying to run the jobs manually, I want to give your
>>>> suggestion a try. I think I know where to configure the job polling
>>>> transaction time (I believe it's the poll-db-millis="20000" value in
>>>> framework/service/config/serviceengine.xml).
>>>>
>>>> However, I still don't know what to increase it to. I understand that
>>>> we wouldn't want to make it bigger than the default polling interval.
>>>> Do you know what the default interval between polling is?
>>>>
>>>> Thanks,
>>>>
>>>> On Wed, Jul 13, 2011 at 12:31 PM, Brett Palmer <[hidden email]> wrote:
>>>>> I meant removing finished jobs. If you have thousands of pending jobs then
>>>>> you will have the same problem I mentioned in my first email. One
>>>>> resolution will be to increase the job poller transaction time. In the
>>>>> ofbiz version I was using there was not a way to configure the poller
>>>>> transaction time. It just used the default time. I had to create a patch
>>>>> to allow this to happen.
>>>>>
>>>>> In the patch you had to be careful to not increase the transaction time
>>>>> greater than the frequency of the job poller. Otherwise you get into a lock
>>>>> situation where one job poller is still running within a transaction and
>>>>> another poller starts. This didn't create a huge problem but the second job
>>>>> poller would usually lock and then time out.
>>>>>
>>>>> Brett
>>>>>
>>>>> On Wed, Jul 13, 2011 at 1:15 PM, Josh Jacobson <[hidden email]> wrote:
>>>>>
>>>>>> Brett,
>>>>>>
>>>>>> Can you please explain what you mean by archiving the current JobSandbox
>>>>>> first?
>>>>>> Do you mean somehow removing the current pending jobs, applying your
>>>>>> patch and then copying them back again?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> On Wed, Jul 13, 2011 at 12:08 PM, Brett Palmer <[hidden email]> wrote:
>>>>>>> Josh,
>>>>>>>
>>>>>>> I've also seen this problem if the JobSandbox table has too many rows to
>>>>>>> process. I ran into a similar problem when I tried to run 10,000 Async
>>>>>>> batch processes. The time it took for the JobPoller to process all the
>>>>>>> records was too long and the transaction would time out.
>>>>>>>
>>>>>>> I had a patch to change the transaction timeout for the JobPoller
>>>>>>> specifically as it wasn't available in ofbiz at the time, but I don't think
>>>>>>> I ever submitted it. I could look for this patch if anyone is interested
>>>>>>> but it may already be implemented in the framework.
Ah okay, that is entirely dependent on the number of jobs and the speed at which the server can process them. As a side note I would keep a close eye on the purgeOldJobs service: when it starts falling over (transaction timeout again), the number of rows in the table will increase quickly, which in turn will slow down polling.

In general the whole persisted jobs implementation is a bit fragile, especially when dealing with a large number of jobs. I've wanted to replace it with something like Quartz for a while but haven't had the time.

Regards
Scott

On 14/07/2011, at 2:10 PM, Josh Jacobson wrote:

> Thanks again. I actually meant a suggestion for the transaction
> timeout. In any case I am grateful for your explanation.
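
Since the right timeout depends on the number of pending jobs and on how fast the server gets through them, one rough way to put a number on it is sketched below. This is only an illustration, not anything from the OFBiz code base: the class name, the method name, the throughput figure of 2,000 rows per second and the safety factor of 2 are all made-up assumptions, and the real throughput would have to be measured on the server in question. The 160,000 figure is the backlog reported at the start of this thread.

```java
// Hypothetical helper (not part of OFBiz): estimates a transaction timeout
// for a single job poll from the backlog size and a measured throughput.
final class PollTimeoutEstimate {

    /**
     * @param pendingJobs   number of pending rows the poller must get through
     * @param rowsPerSecond measured polling throughput (assumed to be known)
     * @param safetyFactor  head-room multiplier, at least 1.0
     * @return estimated timeout in whole seconds, never less than 1
     */
    static int estimateTimeoutSeconds(long pendingJobs, double rowsPerSecond, double safetyFactor) {
        if (pendingJobs < 0 || rowsPerSecond <= 0 || safetyFactor < 1.0) {
            throw new IllegalArgumentException("invalid backlog, throughput or safety factor");
        }
        double seconds = pendingJobs / rowsPerSecond * safetyFactor;
        // Round up, keep at least one second, and stay within the int range.
        return (int) Math.min(Integer.MAX_VALUE, Math.max(1, Math.ceil(seconds)));
    }

    public static void main(String[] args) {
        // 160,000 pending jobs, an assumed 2,000 rows/s, and 2x head-room.
        int timeout = estimateTimeoutSeconds(160_000, 2_000.0, 2.0);
        System.out.println("Suggested poll transaction timeout: " + timeout + " s");
        // In JobManager.poll() this value would then replace the no-argument call:
        //   beganTransaction = TransactionUtil.begin(timeout);
        // assuming the int argument is a timeout in seconds.
    }
}
```

The estimate only covers the poll itself. If purgeOldJobs is also timing out, the table keeps growing and the figure would need to be recalculated against the new row count.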