As the JobSandbox table grows, the time taken to query the database
for crashed jobs also grows and eventually causes timeouts meaning any crashed jobs are no longer loaded. The query looks like this: SELECT * FROM JOB_SANDBOX WHERE (FINISH_DATE_TIME IS NULL AND CANCEL_DATE_TIME IS NULL AND RUN_BY_INSTANCE_ID = ?) ORDER BY START_DATE_TIME ASC which doesn't give the database much to work with and results in most of the table's contents being scanned. If I add statusId into the mix the query returns almost instantly. Does anyone see any potential problems if I remove the finishDateTime and cancelDateTime fields from the query and instead check for any of the following statuses: SERVICE_PENDING SERVICE_QUEUED SERVICE_RUNNING Thanks Scott HotWax Media http://www.hotwaxmedia.com smime.p7s (3K) Download Attachment |
That would make sense to do. I guess when the statudId field was added the query was never adjusted. On a side note, my guess is that the performance difference is due to indexed versus non-indexed fields, and that statudId is indexed automatically because the reverse-foreign- key indexing that the EE does. -David On Aug 29, 2009, at 6:37 AM, Scott Gray wrote: > As the JobSandbox table grows, the time taken to query the database > for crashed jobs also grows and eventually causes timeouts meaning > any crashed jobs are no longer loaded. > > The query looks like this: > SELECT * FROM JOB_SANDBOX WHERE (FINISH_DATE_TIME IS NULL AND > CANCEL_DATE_TIME IS NULL AND RUN_BY_INSTANCE_ID = ?) ORDER BY > START_DATE_TIME ASC > which doesn't give the database much to work with and results in > most of the table's contents being scanned. > > If I add statusId into the mix the query returns almost instantly. > Does anyone see any potential problems if I remove the > finishDateTime and cancelDateTime fields from the query and instead > check for any of the following statuses: > SERVICE_PENDING > SERVICE_QUEUED > SERVICE_RUNNING > > Thanks > Scott > > HotWax Media > http://www.hotwaxmedia.com |
Thanks David, I'll go ahead and make the change. My main doubt was
that aside from UI queries the statusId fields doesn't seem to be used for the actual processing of any jobs, reloadCrashedJobs, poll and purgeOldJobs all use the date fields only. The poller is fine because it's querying on runByInstanceId which is indexed but purgeOldJobs looks like it could potentially suffer from the same problem. Do you think it would be okay to switch everything over to querying on statusId? Thanks Scott On 30/08/2009, at 10:10 AM, David E Jones wrote: > > That would make sense to do. I guess when the statudId field was > added the query was never adjusted. On a side note, my guess is that > the performance difference is due to indexed versus non-indexed > fields, and that statudId is indexed automatically because the > reverse-foreign-key indexing that the EE does. > > -David > > > On Aug 29, 2009, at 6:37 AM, Scott Gray wrote: > >> As the JobSandbox table grows, the time taken to query the database >> for crashed jobs also grows and eventually causes timeouts meaning >> any crashed jobs are no longer loaded. >> >> The query looks like this: >> SELECT * FROM JOB_SANDBOX WHERE (FINISH_DATE_TIME IS NULL AND >> CANCEL_DATE_TIME IS NULL AND RUN_BY_INSTANCE_ID = ?) ORDER BY >> START_DATE_TIME ASC >> which doesn't give the database much to work with and results in >> most of the table's contents being scanned. >> >> If I add statusId into the mix the query returns almost instantly. >> Does anyone see any potential problems if I remove the >> finishDateTime and cancelDateTime fields from the query and instead >> check for any of the following statuses: >> SERVICE_PENDING >> SERVICE_QUEUED >> SERVICE_RUNNING >> >> Thanks >> Scott >> >> HotWax Media >> http://www.hotwaxmedia.com > smime.p7s (3K) Download Attachment |
Free forum by Nabble | Edit this page |