JobManager failing to schedule jobs

Josh Jacobson
Hello Everyone,

I have an ofbiz instance in production where none of the jobs are
being performed. I have about 160K jobs in pending status, but they
are never being scheduled.
I can see the following in the log:

2011-07-13 13:32:01,959 (org.ofbiz.service.job.JobPoller@2599930b) [ JobManager.java:201:ERROR]
---- exception report ----------------------------------------------------------
Transaction error trying to commit when polling and updating the JobSandbox: org.ofbiz.entity.transaction.GenericTransactionException: Roll back error (with no rollbackOnly cause found), could not commit transaction, was rolled back instead: javax.transaction.RollbackException: Transaction timeout (Transaction timeout)
Exception: org.ofbiz.entity.transaction.GenericTransactionException
Message: Roll back error (with no rollbackOnly cause found), could not commit transaction, was rolled back instead: javax.transaction.RollbackException: Transaction timeout (Transaction timeout)
---- cause ---------------------------------------------------------------------
Exception: javax.transaction.RollbackException
Message: Transaction timeout
---- stack trace ---------------------------------------------------------------
javax.transaction.RollbackException: Transaction timeout
org.apache.geronimo.transaction.manager.TransactionImpl.commit(TransactionImpl.java:269)
org.apache.geronimo.transaction.manager.TransactionManagerImpl.commit(TransactionManagerImpl.java:245)
org.ofbiz.entity.transaction.TransactionUtil.commit(TransactionUtil.java:259)
org.ofbiz.entity.transaction.TransactionUtil.commit(TransactionUtil.java:245)
org.ofbiz.service.job.JobManager.poll(JobManager.java:197)
org.ofbiz.service.job.JobPoller.run(JobPoller.java:90)
java.lang.Thread.run(Thread.java:619)
--------------------------------------------------------------------------------

I believe that the JobManager is unable to handle all those
jobs to schedule them, so nothing is being scheduled, which of course
makes the job list longer.

Can anyone think of how to make the jobs run?

All help much appreciated,

--
Josh.

Re: JobManager failing to schedule jobs

BJ Freeman
The key is the transaction timeout.
This could be caused by the job length,
or it could be the database connection.

Please specify the version of OFBiz, since earlier transaction problems
were taken care of by changing the code that deals with transactions.


Re: JobManager failing to schedule jobs

Brett
In reply to this post by Josh Jacobson
Josh,

I've also seen this problem if the JobSandbox table has too many rows to
process.  I ran into a similar problem when I tried to run 10,000 Async
batch processes.  The time it took for the JobPoller to process all the
records was too long and the transaction would time out.

I had a patch to change the transaction timeout for the JobPoller
specifically as it wasn't available in ofbiz at the time, but I don't think
I ever submitted it.  I could look for this patch if anyone is interested
but it may already be implemented in the framework.

I would try archiving jobs from the JobSandbox first.
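
To make the archiving suggestion concrete, here is a minimal sketch of picking out the rows that could be moved out of JobSandbox: jobs that are already done and that finished before some cutoff. It works on a plain in-memory list, not on the entity engine. `JobRow`, `JobArchiveSketch` and `selectArchivable` are hypothetical names, and the status IDs are assumed to follow the SERVICE_* naming; check them against your own data first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory stand-in for a JobSandbox row; not an OFBiz class. */
final class JobRow {
    final String jobId;
    final String statusId;
    final long finishMillis; // 0 when the job has not finished yet

    JobRow(String jobId, String statusId, long finishMillis) {
        this.jobId = jobId;
        this.statusId = statusId;
        this.finishMillis = finishMillis;
    }
}

final class JobArchiveSketch {
    // Assumed status names; verify them against the status data in your install.
    static final Set<String> DONE_STATUSES = new HashSet<String>(
            Arrays.asList("SERVICE_FINISHED", "SERVICE_FAILED", "SERVICE_CANCELLED"));

    /** Returns the rows that are done and that finished strictly before the cutoff. */
    static List<JobRow> selectArchivable(List<JobRow> rows, long cutoffMillis) {
        List<JobRow> archivable = new ArrayList<JobRow>();
        for (JobRow row : rows) {
            boolean done = DONE_STATUSES.contains(row.statusId);
            boolean oldEnough = row.finishMillis > 0L && row.finishMillis < cutoffMillis;
            if (done && oldEnough) {
                archivable.add(row);
            }
        }
        return archivable;
    }

    public static void main(String[] args) {
        List<JobRow> rows = Arrays.asList(
                new JobRow("1", "SERVICE_FINISHED", 1000L),
                new JobRow("2", "SERVICE_PENDING", 0L),
                new JobRow("3", "SERVICE_FAILED", 5000L));
        // With a cutoff of 2000, only job 1 is both done and old enough.
        for (JobRow row : selectArchivable(rows, 2000L)) {
            System.out.println("archive " + row.jobId);
        }
    }
}
```

Pending rows are never selected, so a filter like this only shrinks the table the poller has to scan; it does nothing about a backlog of pending jobs.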


Brett


Re: JobManager failing to schedule jobs

Josh Jacobson
In reply to this post by BJ Freeman
BJ,

I am running 10.04.


Re: JobManager failing to schedule jobs

Josh Jacobson
In reply to this post by Brett
Brett,

Can you please explain what you mean by archiving the current JobSandbox first?
Do you mean somehow removing the current pending jobs, applying your
patch and then copying them back again?

Thanks,



Re: JobManager failing to schedule jobs

BJ Freeman
In reply to this post by Josh Jacobson
OK, so you have the latest code.
What is the environment you are working with?
OS
Memory
CPU speed


Re: JobManager failing to schedule jobs

Brett
In reply to this post by Josh Jacobson
I meant removing finished jobs.  If you have thousands of pending jobs then
you will have the same problem I mentioned in my first email.  One
resolution would be to increase the job poller transaction timeout.  In the
ofbiz version I was using there was no way to configure the poller
transaction timeout.  It just used the default.  I had to create a patch
to allow this.

With the patch you had to be careful not to increase the transaction timeout
beyond the polling interval of the job poller.  Otherwise you get into a lock
situation where one job poller is still running within a transaction and
another poller starts.  This didn't create a huge problem but the second job
poller would usually lock and then time out.
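
The constraint described above can be written down as a small check. This is only a sketch of the rule, not code from the patch: `PollerTimeoutCheck`, `isSafe` and `maxSafeTimeoutSeconds` are hypothetical names, the 30 second poll interval is an example value, and the check is slightly stricter than the rule as stated because it requires the timeout to be strictly shorter than the interval.

```java
/** Hypothetical helper, not part of OFBiz: sanity-checks a poller transaction timeout. */
final class PollerTimeoutCheck {

    /** True when a transaction started at poll time must end before the next poll starts. */
    static boolean isSafe(int txTimeoutSeconds, long pollIntervalMillis) {
        return txTimeoutSeconds * 1000L < pollIntervalMillis;
    }

    /** Largest whole-second timeout that is still strictly shorter than the poll interval. */
    static int maxSafeTimeoutSeconds(long pollIntervalMillis) {
        if (pollIntervalMillis <= 0L) {
            return 0;
        }
        return (int) ((pollIntervalMillis - 1L) / 1000L);
    }

    public static void main(String[] args) {
        long pollIntervalMillis = 30000L; // example value only
        // A 60 second timeout overlaps the next poll; 29 seconds is the largest safe value.
        System.out.println(isSafe(60, pollIntervalMillis));
        System.out.println(maxSafeTimeoutSeconds(pollIntervalMillis));
    }
}
```

If the timeout you need to get through the backlog is longer than the value this returns, the polling interval has to be raised along with it, or the work has to be split into smaller transactions.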



Brett




Re: JobManager failing to schedule jobs

Josh Jacobson
In reply to this post by BJ Freeman
Currently I am running:

Red Hat Enterprise Linux Server release 5.5
6 CPUs, 16384MB RAM

It was very recently upgraded from 2 CPUs and 8GB of RAM because we
were having performance issues (lots of swap memory being used). It's
on one of those cloud servers. Now it's running without using any
swap.


Re: JobManager failing to schedule jobs

Josh Jacobson
In reply to this post by Brett
On Wed, Jul 13, 2011 at 12:31 PM, Brett Palmer <[hidden email]> wrote:
> I meant removing finished jobs.  If you have thousands of pending jobs then
> you will have the same problem I mentioned in my first email.  One
> resolution will be to increase the job poller transaction time.  In the
> ofbiz version I was using there was not a way to configure the poller
> transaction time.  It just used the default time.  I had to create a patch
> to allow this to happen.

I see. I already did that: We had 2.6 million lines on the JobSandbox,
mostly of completed or failed jobs. We deleted completed and failed
and are now looking at about 260L pending jobs. I want to run those
jobs, so I can get the machine back to normal.

>
> In the patch you had to be careful to not increase the transaction time
> greater than the frequency of the job poller.  Otherwise you get into a lock
> situation where one job poller is still running within a transaction and
> another poller starts.  This didn't create a huge problem but the second job
> poller would usually lock and then time out.

I understand the possible race condition. So how do I figure out what to
set the timeout to, and where do I configure that?

Thanks,

--
Josh.

Re: JobManager failing to schedule jobs

Josh Jacobson
On Wed, Jul 13, 2011 at 12:51 PM, Josh Jacobson
<[hidden email]> wrote:

> On Wed, Jul 13, 2011 at 12:31 PM, Brett Palmer <[hidden email]> wrote:
>> I meant removing finished jobs.  If you have thousands of pending jobs then
>> you will have the same problem I mentioned in my first email.  One
>> resolution will be to increase the job poller transaction time.  In the
>> ofbiz version I was using there was not a way to configure the poller
>> transaction time.  It just used the default time.  I had to create a patch
>> to allow this to happen.
>
> I see. I already did that: We had 2.6 million lines on the JobSandbox,
> mostly of completed or failed jobs. We deleted completed and failed
> and are now looking at about 260L pending jobs. I want to run those
> jobs, so I can get the machine back to normal.

Sorry, just noticed the typo: We currently have 260K + jobs as pending
and I want to process them to get things back to normal.

Thanks for the help,

--
Josh.

Re: JobManager failing to schedule jobs

BJ Freeman
In reply to this post by Josh Jacobson
You now know why I don't recommend a cloud configuration for realtime
operations, unless you're running over dedicated lines that are not part of
the internet.
To summarize, your environment caused the problem, not ofbiz.
Now you have jobs queued that should have been run but have piled up.
You need a way to get the jobs run so they don't time out the system.
I recommend you look at the purge old jobs service, then copy and modify it
to run your jobs, maybe by time group.
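
One way to read "by time group" is to bucket the pending jobs by their scheduled run time and work through one bucket per transaction, oldest first. The sketch below shows only the bucketing step on plain timestamps; it is not taken from the purge old jobs service, and `TimeGroupSketch`, `groupByWindow` and the one-hour window are hypothetical choices.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, not part of OFBiz: buckets job run times into fixed windows. */
final class TimeGroupSketch {

    /**
     * Groups run times (epoch millis) by the start of the window they fall into.
     * The TreeMap keeps the windows sorted, so iteration goes oldest window first.
     */
    static Map<Long, List<Long>> groupByWindow(List<Long> runTimes, long windowMillis) {
        Map<Long, List<Long>> groups = new TreeMap<Long, List<Long>>();
        for (long runTime : runTimes) {
            long windowStart = Math.floorDiv(runTime, windowMillis) * windowMillis;
            List<Long> bucket = groups.get(windowStart);
            if (bucket == null) {
                bucket = new ArrayList<Long>();
                groups.put(windowStart, bucket);
            }
            bucket.add(runTime);
        }
        return groups;
    }

    public static void main(String[] args) {
        long oneHour = 3600000L; // example window size only
        List<Long> runTimes = Arrays.asList(100L, 3700000L, 200L);
        for (Map.Entry<Long, List<Long>> group : groupByWindow(runTimes, oneHour).entrySet()) {
            // Each group would be handled in its own, shorter transaction.
            System.out.println(group.getKey() + " -> " + group.getValue().size() + " job(s)");
        }
    }
}
```

The window size is the tuning knob here: it needs to be small enough that one group can be processed well inside the transaction timeout.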


Re: JobManager failing to schedule jobs

Josh Jacobson
Thanks for the pointers. I'll take a look.

There is one more piece of information: The purgeOldJobs service is in
a "crashed" status. Do you think that is significant?

Thanks,


Re: JobManager failing to schedule jobs

BJ Freeman
It means finished jobs will not be purged, so you will get a build-up.
You can run the service manually to start it again.



Re: JobManager failing to schedule jobs

Josh Jacobson
Thanks, that is what I figured. First things first though: I need to
get those jobs running somehow.

Thanks for the help.


Re: JobManager failing to schedule jobs

Josh Jacobson
In reply to this post by Brett
Brett,

Before I start trying to run the jobs manually, I want to give your
suggestion a try. I think I know where to configure the job polling
transaction time: I believe it's the poll-db-millis="20000" value in
framework/service/config/serviceengine.xml.

However, I still don't know what to increase it to. I understand that
we wouldn't want to make it bigger than the default polling interval.
Do you know what the default interval between polls is?

Thanks,


Re: JobManager failing to schedule jobs

Scott Gray-2
That configuration (poll-db-millis) is for the frequency of job polls, not the transaction timeout.  There isn't any way to specify the transaction timeout via configuration, so you'll need to modify the code directly:
JobManager.java (line 148):
beganTransaction = TransactionUtil.begin();
needs to be changed to use TransactionUtil.begin(int)
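
A minimal sketch of that one-line change, not the actual OFBiz source. TransactionUtil.begin(int) takes the timeout in seconds, whereas poll-db-millis is in milliseconds. The nested TransactionUtil class below is a hypothetical stand-in so the example compiles on its own, and the POLL_TX_TIMEOUT_SECONDS name and its 300-second value are assumptions for illustration only; in a real checkout the call goes to org.ofbiz.entity.transaction.TransactionUtil.

```java
public class JobPollTimeoutSketch {

    /**
     * Hypothetical stand-in for org.ofbiz.entity.transaction.TransactionUtil,
     * reduced to the two begin() overloads so this file compiles by itself.
     */
    static final class TransactionUtil {
        // Records the timeout passed to the most recent begin(int) call.
        static int lastTimeoutSeconds = -1;

        static boolean begin() {
            // 0 stands for "use the transaction manager's default timeout".
            return begin(0);
        }

        static boolean begin(int timeoutSeconds) {
            lastTimeoutSeconds = timeoutSeconds;
            return true; // signals that a new transaction was started
        }
    }

    // Illustrative value only (assumption): choose it from how long a poll really takes.
    static final int POLL_TX_TIMEOUT_SECONDS = 300;

    /** Mirrors the edit in JobManager.poll() and returns the timeout that was applied. */
    static int beginPollTransaction() {
        // Before: beganTransaction = TransactionUtil.begin();
        boolean beganTransaction = TransactionUtil.begin(POLL_TX_TIMEOUT_SECONDS);
        return beganTransaction ? TransactionUtil.lastTimeoutSeconds : -1;
    }

    public static void main(String[] args) {
        System.out.println("poll transaction timeout (s): " + beginPollTransaction());
    }
}
```

Keeping the value in a named constant rather than inline makes it easier to find and adjust once the backlog has been worked off.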

Regards
Scott

HotWax Media
http://www.hotwaxmedia.com




Re: JobManager failing to schedule jobs

Josh Jacobson
Scott,

Thanks! That is very precise advice. Do you have a suggestion on
the interval time? 60 seconds? 120?

Thanks,


Re: JobManager failing to schedule jobs

Scott Gray-2
As best I can tell, there shouldn't be any need to increase the interval between polls: the interval timer doesn't actually start until the previous poll has completed (see JobPoller.run()), so I can't see how a small interval would cause any backlog problems.

I'm guessing that if there is any lock contention, it's probably caused by the executing jobs trying to update their respective rows while the poller is holding a table lock.  From that point of view, increasing the interval could reduce the amount of contention between the executing jobs and the next poll.
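
To illustrate the poll-then-wait behaviour described above, here is a small simulation on a virtual clock. It is only a sketch of the idea, not the JobPoller code: the class and method names are hypothetical, the 20000 ms interval is the poll-db-millis value mentioned in this thread, and the poll durations are made up.

```java
import java.util.ArrayList;
import java.util.List;

public class PollIntervalSketch {

    /**
     * Simulates a loop that polls, lets the poll finish, and only then waits
     * for the interval. Returns the virtual time (ms) at which each poll starts.
     */
    static List<Long> pollStartTimes(long intervalMillis, long[] pollDurationsMillis) {
        List<Long> starts = new ArrayList<>();
        long clock = 0;
        for (long duration : pollDurationsMillis) {
            starts.add(clock);        // the poll begins
            clock += duration;        // the poll runs to completion
            clock += intervalMillis;  // only now does the wait start
        }
        return starts;
    }

    public static void main(String[] args) {
        // Hypothetical poll durations; the second one is longer than the interval.
        long[] durations = {5_000, 30_000, 1_000};
        System.out.println(pollStartTimes(20_000, durations)); // prints [0, 25000, 75000]
    }
}
```

Even though the second poll takes 30 seconds, longer than the 20-second interval, the third poll still starts a full interval after it ends, so a single poller never overlaps itself in this model.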

Regards
Scott




Re: JobManager failing to schedule jobs

Josh Jacobson
Thanks again. I actually meant a suggestion for the transaction
timeout. In any case I am grateful for your explanation.



Re: JobManager failing to schedule jobs

Scott Gray-2
Ah okay, that depends entirely on the number of jobs and how fast the server can process them. As a side note, I would keep a close eye on the purgeOldJobs service: when it starts falling over (transaction timeout again), the number of rows in the table will increase quickly, which in turn will slow down polling.

In general the whole persisted jobs implementation is a bit fragile, especially when dealing with a large number of jobs. I've wanted to replace it with something like Quartz for a while but haven't had the time.
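
As a rough illustration of that dependency, the sketch below estimates a timeout from the backlog size and an assumed processing rate, assuming the poll has to work through the whole backlog inside one transaction. None of this comes from OFBiz: the class, the method, the safety factor and the 500 rows per second figure are assumptions, and the only number taken from this thread is the 160K pending jobs. The result is in seconds, the unit used by the JTA transaction timeout.

```java
// Hypothetical helper for sizing the poll transaction timeout; not OFBiz code.
class PollTimeoutEstimate {

    // Estimates a timeout in seconds: the time needed to work through the
    // pending rows at the given rate, times a safety factor, rounded up.
    static int estimateSeconds(long pendingJobs, double rowsPerSecond, double safetyFactor) {
        if (pendingJobs < 0 || rowsPerSecond <= 0 || safetyFactor < 1.0) {
            throw new IllegalArgumentException("invalid input");
        }
        double seconds = (pendingJobs / rowsPerSecond) * safetyFactor;
        return (int) Math.ceil(seconds);
    }

    public static void main(String[] args) {
        // 160,000 pending jobs (from the original post); rate and factor are guesses.
        System.out.println(estimateSeconds(160_000, 500.0, 2.0)); // prints 640
    }
}
```

The processing rate is the part to measure on the actual server, for example by timing a poll against a smaller backlog, since the guess used here has no basis beyond making the arithmetic easy to follow.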

Regards
Scott


