OFBiz - Re: EntityListIterator.getResultsSizeAfterPartialList() vs. delegator.getCountByCondition(...)

OFBiz › OFBiz - Dev

Re: EntityListIterator.getResultsSizeAfterPartialList() vs. delegator.getCountByCondition(...)

Posted by Scott Gray-2 on Aug 26, 2009; 12:14pm
URL: http://ofbiz.116.s1.nabble.com/EntityListIterator-getResultsSizeAfterPartialList-vs-delegator-getCountByCondition-tp206959p206971.html

So after some testing with mysql and postgres things are a little
different. Both drivers seem to incur a time penalty depending on the
result set size when the result set is opened whereas derby is fast
until you attempt to go to the end of the result set.

I don't really see any point in improving things for derby when we
don't really intend for it to handle large resultsets anyway.

All I can think to do for mysql and postgres is to limit the max rows
for the iterator query and then do a separate count query to get the
total for pagination. Any thoughts?

Thanks
Scott

On 26/08/2009, at 5:13 PM, Scott Gray wrote:

> Hi David,
>
> I did the testing with Derby and in deployment we're seeing the
> issue on MySql but I haven't tested on it as yet and am assuming
> it's the same issue. I will test it on postgresql shortly out of
> curiosity.
>
> Anyway thanks for your comments, I'll have a look at the ELI and see
> what can be done.
>
> Regards
> Scott
>
> On 26/08/2009, at 4:34 PM, David E Jones wrote:
>
>>
>> Which database are you using? Ie, was it only with Derby? I think
>> this topic came up a short while back for another database... but I
>> don't remember if it was ever resolved.
>>
>> I faintly recall writing before something along these lines: We may
>> want to make this change within the EntityListIterator class itself
>> and add a method that gets the total count of the results, probably
>> by running the same query again with the condition from the
>> original query.
>>
>> Whatever the case, this may only help on certain databases (ie some
>> databases will have a better JDBC driver and cursor implementation
>> so that seeking to the end will not result in this sort of
>> inefficient behavior), but it probably worth doing in general
>> anyway, and perhaps have it configured on the datasource element in
>> the entityengine.xml file like various other database-dependent
>> settings.
>>
>> -David
>>
>>
>> On Aug 25, 2009, at 10:04 PM, Scott Gray wrote:
>>
>>> Hi all,
>>>
>>> We've had a few slow query problems lately and I've narrowed it
>>> down to the ResultSet.last() method call in EntityListIterator
>>> when performed on large result sets. I switched the
>>> FindGeneric.groovy script in webtools to use findCountByCondition
>>> instead of getResultSizeAfterPartialList and the page load time
>>> went from 40-50s down to 2-3s for a result containing 700,000
>>> records.
>>>
>>> Based on that I assumed there was probably some magic number
>>> depending on your system where it becomes more efficient to do a
>>> separate count query rather than use the ResultSet so I put
>>> together a quick test to find out. I threw together a script that
>>> repeatedly adds 500 rows to the jobsandbox and then outputs the
>>> average time taken of 3 attempts to get the list size for each
>>> method. Here's a graph of the results using embedded Derby: http://imgur.com/ieR7m
>>>
>>> So unless the magic number lies somewhere in the first 500 records
>>> it looks to me like it always more efficient to do a separate
>>> count query.
>>>
>>> It makes me wonder if we should be taking a different approach to
>>> pagination in the form widget and in general. Any thoughts?
>>>
>>> Thanks
>>> Scott
>>>
>>
>

smime.p7s (3K) Download Attachment